"sudo git foo" used to consider a repository owned by the original
user a safe one to access; it now also considers a repository owned
by root a safe one, too (after all, if an attacker can craft a
malicious repository owned by root, the box is 0wned already).
* cb/path-owner-check-with-sudo-plus:
git-compat-util: allow root to access both SUDO_UID and root owned
Previous changes introduced a regression which will prevent root for
accessing repositories owned by thyself if using sudo because SUDO_UID
takes precedence.
Loosen that restriction by allowing root to access repositories owned
by both uid by default and without having to add a safe.directory
exception.
A previous workaround that was documented in the tests is no longer
needed so it has been removed together with its specially crafted
prerequisite.
Helped-by: Johanness Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some config variables are combinations of multiple words, and we
typically write them in camelCase forms in manpage and translatable
strings. It's not easy to find mismatches for these camelCase config
variables during code reviews, but occasionally they are identified
during localization translations.
To check for mismatched config variables, I introduced a new feature
in the helper program for localization[^1]. The following mismatched
config variables have been identified by running the helper program,
such as "git-po-helper check-pot".
Lowercase in manpage should use camelCase:
* Documentation/config/http.txt: http.pinnedpubkey
Lowercase in translable strings should use camelCase:
* builtin/fast-import.c: pack.indexversion
* builtin/gc.c: gc.logexpiry
* builtin/index-pack.c: pack.indexversion
* builtin/pack-objects.c: pack.indexversion
* builtin/repack.c: pack.writebitmaps
* commit.c: i18n.commitencoding
* gpg-interface.c: user.signingkey
* http.c: http.postbuffer
* submodule-config.c: submodule.fetchjobs
Mismatched camelCases, choose the former:
* Documentation/config/transfer.txt: transfer.credentialsInUrl
remote.c: transfer.credentialsInURL
[^1]: https://github.com/git-l10n/git-po-helper
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename fetch.credentialsInUrl to transfer.credentialsInUrl as the
single configuration variable should work both in pushing and
fetching.
* ab/credentials-in-url-more:
transfer doc: move fetch.credentialsInUrl to "transfer" config namespace
fetch doc: note "pushurl" caveat about "credentialsInUrl", elaborate
Recent CI update hides certain failures in test jobs, which has
been corrected.
* js/ci-github-workflow-markup:
ci(github): also mark up compile errors
ci(github): use grouping also in the `win-build` job
ci(github): bring back the 'print test failures' step
Improve test coverage with a handful of tests.
* ds/more-test-coverage:
cache-tree: remove cache_tree_find_path()
pack-write: drop always-NULL parameter
t5329: test 'git gc --cruft' without '--prune=now'
t2107: test 'git update-index --verbose'
The code added 0cc05b044f (usage.c: add a non-fatal bug() function to go
with BUG(), 2022-06-02) sets up two va_list variables: one to output to
stderr, and one to trace2. But the order of initialization is wrong:
va_list ap, cp;
va_copy(cp, ap);
va_start(ap, fmt);
We copy the contents of "ap" into "cp" before it is initialized, meaning
it is full of garbage. The two should be swapped.
However, there's another bug, noticed by Johannes Schindelin: we forget
to call va_end() for the copy. So instead of just fixing the copy's
initialization, let's do two separate start/end pairs. This is allowed
by the standard, and we don't need to use copy here since we have access
to the original varargs. Matching the pairs with the calls makes it more
obvious that everything is being done correctly.
Note that we do call bug_fl() in the tests, but it didn't trigger this
problem because our format string doesn't have any placeholders. So even
though we were passing a garbage va_list through the stack, nobody ever
needed to look at it. We can easily adjust one of the trace2 tests to
trigger this, both for bug() and for BUG(). The latter isn't broken, but
it's nice to exercise both a bit more. Without the fix in this patch
(but with the test change), the bug() case causes a segfault.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 63e95beb08 (submodule: port resolve_relative_url from shell to C,
2016-04-15), we added a loop over `url` where we are looking for `../`
or `./` components.
The loop condition we used is the pointer `url` itself, which is clearly
not what we wanted.
Pointed out by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 94cd775a6c (pack-mtimes: support reading .mtimes files,
2022-05-20), code was added to close the file descriptor corresponding
to the mtimes file.
However, it is possible that opening that file failed, in which case we
are closing a file descriptor with the value `-1`. Let's guard that
`close()` call.
Reported by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 998330ac2e (read-cache: look for shared index files next to the
index, too, 2021-08-26), we added code that allocates memory to store
the base path of a shared index, but we never released that memory.
Reported by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c51f8f94e5 (submodule--helper: run update procedures from C,
2021-08-24), we added code that first obtains the default remote, and
then adds that to a `strvec`.
However, we never released the default remote's memory.
Reported by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 961b130d20 (branch: add --recurse-submodules option for branch
creation, 2022-01-28), a funny pattern was introduced where first some
struct is `xmalloc()`ed, then we resize an array whose element type is
the same struct, and then the first struct's contents are copied into
the last element of that array.
Crucially, the `xmalloc()`ed memory never gets released.
Let's avoid that memory leak and that memory allocation dance altogether
by first reallocating the array, then using a pointer to the last array
element to go forward.
Reported by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts 080ab56a46 (cache-tree: implement cache_tree_find_path(),
2022-05-23). The cache_tree_find_path() method was never actually called
in the topic that added it. I cannot find any reference to it in any of
my forks, so this appears to not be needed at the moment.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
write_mtimes_file() takes an mtimes parameter as its first option, but
the only caller passes a NULL constant. Drop this parameter to simplify
logic. This can be reverted if that parameter is needed in the future.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace a 'git repack --cruft -d' with the wrapper 'git gc --cruft' to
exercise some logic in builtin/gc.c that adds the '--cruft' option to
the underlying 'git repack' command.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The '--verbose' option reports what is being added and removed from the
index, but has not been tested up to this point. Augment the tests in
t2107 to check the '--verbose' option in some scenarios.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 5dccd9155f (t/perf: add iteration setup mechanism to perf-lib,
2022-04-04) modified the parameter parsing of test_wrapper() such that
the test title was no longer in $1, and is instead in $test_title_.
We correctly pass the new variable to the code which outputs the title
to the log, but missed the spot in test_wrapper() where the title is
written to the ".descr" file which is used to produce the final output
table. As a result, all of the titles are missing from that table (or
worse, using whatever was left in $1):
$ ./p0000-perf-lib-sanity.sh
[...]
Test this tree
------------------------------
0000.1: 0.01(0.01+0.00)
0000.2: 0.01(0.00+0.01)
0000.4: 0.00(0.00+0.00)
0000.5: true 0.00(0.00+0.00)
0000.7: 0.00(0.00+0.00)
0000.8: 0.00(0.00+0.00)
After this patch, we get the pre-5dccd9155f output:
Test this tree
--------------------------------------------------------------------------
0000.1: test_perf_default_repo works 0.00(0.00+0.00)
0000.2: test_checkout_worktree works 0.01(0.00+0.01)
0000.4: export a weird var 0.00(0.00+0.00)
0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00)
0000.7: important variables available in subshells 0.00(0.00+0.00)
0000.8: test-lib-functions correctly loaded in subshells 0.00(0.00+0.00)
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various error messages that talk about the removal of
"--preserve-merges" in "rebase" have been strengthened, and "rebase
--abort" learned to get out of a state that was left by an earlier
use of the option.
* po/rebase-preserve-merges:
rebase: translate a die(preserve-merges) message
rebase: note `preserve` merges may be a pull config option
rebase: help users when dying with `preserve-merges`
rebase.c: state preserve-merges has been removed
"git revert" learns "--reference" option to use more human-readable
reference to the commit it reverts in the message template it
prepares for the user.
* jc/revert-show-parent-info:
revert: --reference should apply only to 'revert', not 'cherry-pick'
revert: optionally refer to commit in the "reference" format
Add and use a LIBCURL prerequisite for tests added in
6dcbdc0d66 (remote: create fetch.credentialsInUrl config,
2022-06-06).
These tests would get as far as emitting a couple of the warnings we
were testing for, but would then die as we had no "git-remote-https"
program compiled.
It would be more consistent with other prerequisites (e.g. PERL for
NO_PERL) to name this "CURL", but since e9184b0789 (t5561: skip tests
if curl is not available, 2018-04-03) we've had that prerequisite
defined for checking of we have the curl(1) program.
The existing "CURL" prerequisite is only used in one place, and we
should probably name it "CURL_PROGRAM", then rename "LIBCURL" to
"CURL" as a follow-up, but for now (pre-v2.37.0) let's aim for the
most minimal fix possible.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the "fetch.credentialsInUrl" configuration variable introduced
in 6dcbdc0d66 (remote: create fetch.credentialsInUrl config,
2022-06-06) to "transfer".
There are existing exceptions, but generally speaking the
"<namespace>.<var>" configuration should only apply to command
described in the "namespace" (and its sub-commands, so e.g. "clone.*"
or "fetch.*" might also configure "git-remote-https").
But in the case of "fetch.credentialsInUrl" we've got a configuration
variable that configures the behavior of all of "clone", "push" and
"fetch", someone adjusting "fetch.*" configuration won't expect to
have the behavior of "git push" altered, especially as we have the
pre-existing "{transfer,fetch,receive}.fsckObjects", which configures
different parts of the transfer dialog.
So let's move this configuration variable to the "transfer" namespace
before it's exposed in a release. We could add all of
"{transfer,fetch,pull}.credentialsInUrl" at some other time, but once
we have "fetch" configure "pull" such an arrangement would would be a
confusing mess, as we'd at least need to have "fetch" configure
"push" (but not the other way around), or change existing behavior.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend the documentation and release notes entry for the
"fetch.credentialsInUrl" feature added in 6dcbdc0d66 (remote: create
fetch.credentialsInUrl config, 2022-06-06), it currently doesn't
detect passwords in `remote.<name>.pushurl` configuration. We
shouldn't lull users into a false sense of security, so we need to
mention that prominently.
This also elaborates and clarifies the "exposes the password in
multiple ways" part of the documentation. As noted in [1] a user
unfamiliar with git's implementation won't know what to make of that
scary claim, e.g. git hypothetically have novel git-specific ways of
exposing configured credentials.
The reality is that this configuration is intended as an aid for users
who can't fully trust their OS's or system's security model, so lets
say that's what this is intended for, and mention the most common ways
passwords stored in configuration might inadvertently get exposed.
1. https://lore.kernel.org/git/220524.86ilpuvcqh.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix an issue that existed before 0527ccb1b5 (add -i: default to the
built-in implementation, 2021-11-30), but which became the default
with that change, we should not be marking tests that are known to
pass as "TODO" tests.
When GIT_TEST_ADD_I_USE_BUILTIN=1 was made the default we started
passing the tests added in 0f0fba2cc8 (t3701: add a test for advanced
split-hunk editing, 2019-12-06) and 1bf01040f0 (add -p: demonstrate
failure when running 'edit' after a split, 2015-04-16).
Thus we've been emitting this sort of output:
$ prove ./t3701-add-interactive.sh
./t3701-add-interactive.sh .. ok
All tests successful.
Test Summary Report
-------------------
./t3701-add-interactive.sh (Wstat: 0 Tests: 70 Failed: 0)
TODO passed: 45, 47
Files=1, Tests=70, 2 wallclock secs ( 0.03 usr 0.00 sys + 0.86 cusr 0.33 csys = 1.22 CPU)
Result: PASS
Which isn't just cosmetic, but due to issues with
test_expect_failure (see [1]) we could e.g. be hiding something as bad
as a segfault in the new implementation. It makes sense catch that,
especially before we put out a release with the built-in "add -i", so
let's generalize the check we were already doing in 0527ccb1b5 with a
new "ADD_I_USE_BUILTIN" prerequisite.
1. https://lore.kernel.org/git/patch-1.7-4624abc2591-20220318T002951Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to log an error return from wait_or_whine() as process
termination of the waited child, which was incorrect.
* js/wait-or-whine-can-fail:
run-command: don't spam trace2_child_exit()
Sample watchman interface hook sometimes failed to produce
correctly formatted JSON message, which has been corrected.
* sn/fsmonitor-missing-clock:
fsmonitor: query watchman with right valid json
Remove redundant copying (with index v3 and older) or possible
over-reading beyond end of mmapped memory (with index v4) has been
corrected.
* zh/read-cache-copy-name-entry-fix:
read-cache.c: reduce unnecessary cache entry name copying
"git show-ref --heads" (and "--tags") still iterated over all the
refs only to discard refs outside the specified area, which has
been corrected.
* tb/show-ref-optim:
builtin/show-ref.c: avoid over-iterating with --heads, --tags
The "fetch.credentialsInUrl" configuration variable controls what
happens when a URL with embedded login credential is used.
* ds/credentials-in-url:
remote: create fetch.credentialsInUrl config
Updating the graft information invalidates the list of parents of
in-core commit objects that used to be in the graft file.
* jt/unparse-commit-upon-graft-change:
commit,shallow: unparse commits if grafts changed
In Git 2.36 we revamped the way how hooks are invoked. One change
that is end-user visible is that the output of a hook is no longer
directly connected to the standard output of "git" that spawns the
hook, which was noticed post release. This is getting corrected.
* ab/hooks-regression-fix:
hook API: fix v2.36.0 regression: hooks should be connected to a TTY
run-command: add an "ungroup" option to run_process_parallel()
"git -c diff.submodule=log range-diff" did not show anything for
submodules that changed in the ranges being compared, and
"git -c diff.submodule=diff range-diff" did not work correctly.
Fix this by including the "--submodule=short" output
unconditionally to be compared.
* pb/range-diff-with-submodule:
range-diff: show submodule changes irrespective of diff.submodule
When GCC produces those helpful errors, we will want to present them in
the GitHub workflow runs in the most helpful manner. To that end, we
want to use workflow commands to render errors and warnings:
https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions
In the previous commit, we ensured that grouping is used for the build
in all jobs, and this allows us to piggy-back onto the `group` function
to transmogrify the output.
Note: If `set -o pipefail` was available, we could do this in a little
more elegant way. But since some of the steps are run using `dash`, we
have to do a little `{ ...; echo $? >exit.status; } | ...` dance.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already do the same when building Git in all the other jobs.
This will allow us to piggy-back on top of grouping to mark up compiler
errors in the next commit.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new bug() and BUG_if_bug() API is introduced to make it easier to
uniformly log "detect multiple bugs and abort in the end" pattern.
* ab/bug-if-bug:
cache-tree.c: use bug() and BUG_if_bug()
receive-pack: use bug() and BUG_if_bug()
parse-options.c: use optbug() instead of BUG() "opts" check
parse-options.c: use new bug() API for optbug()
usage.c: add a non-fatal bug() function to go with BUG()
common-main.c: move non-trace2 exit() behavior out of trace2.c
More fsmonitor--daemon.
* jh/builtin-fsmonitor-part3: (30 commits)
t7527: improve implicit shutdown testing in fsmonitor--daemon
fsmonitor--daemon: allow --super-prefix argument
t7527: test Unicode NFC/NFD handling on MacOS
t/lib-unicode-nfc-nfd: helper prereqs for testing unicode nfc/nfd
t/helper/hexdump: add helper to print hexdump of stdin
fsmonitor: on macOS also emit NFC spelling for NFD pathname
t7527: test FSMonitor on case insensitive+preserving file system
fsmonitor: never set CE_FSMONITOR_VALID on submodules
t/perf/p7527: add perf test for builtin FSMonitor
t7527: FSMonitor tests for directory moves
fsmonitor: optimize processing of directory events
fsm-listen-darwin: shutdown daemon if worktree root is moved/renamed
fsm-health-win32: force shutdown daemon if worktree root moves
fsm-health-win32: add polling framework to monitor daemon health
fsmonitor--daemon: stub in health thread
fsmonitor--daemon: rename listener thread related variables
fsmonitor--daemon: prepare for adding health thread
fsmonitor--daemon: cd out of worktree root
fsm-listen-darwin: ignore FSEvents caused by xattr changes on macOS
unpack-trees: initialize fsmonitor_has_run_once in o->result
...
A misconfigured 'branch..remote' led to a bug in configuration
parsing.
* gc/zero-length-branch-config-fix:
remote.c: reject 0-length branch names
remote.c: don't BUG() on 0-length branch names
Rename .env_array member to .env in the child_process structure.
* ab/env-array:
run-command API users: use "env" not "env_array" in comments & names
run-command API: rename "env_array" to "env"
With a more targetted workaround in http.c in another topic, we may
be able to lift this blanket "GCC12 dangling-pointer warning is
broken and unsalvageable" workaround.
* cb/buggy-gcc-12-workaround:
Revert -Wno-error=dangling-pointer
Using `ssh-add -L` for gpg.ssh.defaultKeyCommand is not a good
recommendation. It might switch keys depending on the order of known
keys and it only supports ssh-* and no ecdsa or other keys.
Clarify that we expect a literal key prefixed by `key::`, give valid
example use cases and refer to `user.signingKey` as the preferred
option.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git now shows better information in the GitHub workflow runs when a test
case failed. However, when a test case was implemented incorrectly and
therefore does not even run, nothing is shown.
Let's bring back the step that prints the full logs of the failed tests,
and to improve the user experience, print out an informational message
for readers so that they do not have to know/remember where to see the
full logs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clone --origin X" leaked piece of memory that held value read
from the clone.defaultRemoteName configuration variable, which has
been plugged.
source: <xmqqlevl4ysk.fsf@gitster.g>
* jc/clone-remote-name-leak-fix:
clone: plug a miniscule leak
The path taken by "git multi-pack-index" command from the end user
was compared with path internally prepared by the tool withut first
normalizing, which lead to duplicated paths not being noticed,
which has been corrected.
source: <pull.1221.v2.git.1650911234.gitgitgadget@gmail.com>
* ds/midx-normalize-pathname-before-comparison:
cache: use const char * for get_object_directory()
multi-pack-index: use --object-dir real path
midx: use real paths in lookup_multi_pack_index()
"git rebase --keep-base <upstream> <branch-to-rebase>" computed the
commit to rebase onto incorrectly, which has been corrected.
source: <20220421044233.894255-1-alexhenrie24@gmail.com>
* ah/rebase-keep-base-fix:
rebase: use correct base for --keep-base when a branch is given
Avoid problems from interaction between malloc_check and address
sanitizer.
source: <pull.1210.git.1649507317350.gitgitgadget@gmail.com>
* pw/test-malloc-with-sanitize-address:
tests: make SANITIZE=address imply TEST_NO_MALLOC_CHECK
The commit summary shown after making a commit is matched to what
is given in "git status" not to use the break-rewrite heuristics.
source: <c35bd0aa-2e46-e710-2b39-89f18bad0097@web.de>
* rs/commit-summary-wo-break-rewrite:
commit, sequencer: turn off break_opt for commit summary
macOS CI jobs have been occasionally flaky due to tentative version
skew between perforce and the homebrew packager. Instead of
failing the whole CI job, just let it skip the p4 tests when this
happens.
source: <20220512223940.238367-1-gitster@pobox.com>
* cb/ci-make-p4-optional:
ci: use https, not http to download binaries from perforce.com
ci: reintroduce prevention from perforce being quarantined in macOS
ci: avoid brew for installing perforce
ci: make failure to find perforce more user friendly
A bit of test framework fixes with a few fixes to issues found by
valgrind.
source: <20220512223218.237544-1-gitster@pobox.com>
* ab/valgrind-fixes:
commit-graph.c: don't assume that stat() succeeds
object-file: fix a unpack_loose_header() regression in 3b6a8db3b0
log test: skip a failing mkstemp() test under valgrind
tests: using custom GIT_EXEC_PATH breaks --valgrind tests
"git archive --add-file=<path>" picked up the raw permission bits
from the path and propagated to zip output in some cases, without
normalization, which has been corrected (tar output did not have
this issue).
source: <xmqqmtfme8v6.fsf@gitster.g>
* jc/archive-add-file-normalize-mode:
archive: do not let on-disk mode leak to zip archives
The "--current" option of "git show-branch" should have been made
incompatible with the "--reflog" mode, but this was not enforced,
which has been corrected.
source: <xmqqh76mf7s4.fsf_-_@gitster.g>
* jc/show-branch-g-current:
show-branch: -g and --current are incompatible
Meant to go with js/ci-gcc-12-fixes.
source: <xmqq7d68ytj8.fsf_-_@gitster.g>
* jc/http-clear-finished-pointer:
http.c: clear the 'finished' member once we are done with it
Fixes real problems noticed by gcc 12 and works around false
positives.
source: <pull.1238.git.1653351786.gitgitgadget@gmail.com>
* js/ci-gcc-12-fixes:
dir.c: avoid "exceeds maximum object size" error with GCC v12.x
nedmalloc: avoid new compile error
compat/win32/syslog: fix use-after-realloc
Test that "git config --show-scope" shows the "worktree" scope, and add
it to the list of scopes in Documentation/git-config.txt.
"git config --help" does not need to be updated because it already
mentions "worktree".
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 8d4d86b0 (cache: remove null_sha1, 2019-08-18) removed the
is_null_sha1() function, rewrite rules to correct callers of the
function to use is_null_oid() instead has become irrelevant, as any
new callers of the function will get caught by the compiler much
more quickly without spending cycles on Coccinelle.
Remove these rules.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A git subcommand like "git add -p" spawns a separate git process
while relaying its command line arguments. A pathspec with only
negative elements was mistakenly passed with an empty string, which
has been corrected.
* jc/all-negative-pathspec:
pathspec: correct an empty string used as a pathspec element
Implementation of "scalar diagnose" subcommand.
* js/scalar-diagnose:
scalar: teach `diagnose` to gather loose objects information
scalar: teach `diagnose` to gather packfile info
scalar diagnose: include disk space information
scalar: implement `scalar diagnose`
scalar: validate the optional enlistment argument
archive --add-virtual-file: allow paths containing colons
archive: optionally add "virtual" files
The documentation on the interaction between "--add-file" and
"--prefix" options of "git archive" has been improved.
* rs/document-archive-prefix:
archive: improve documentation of --prefix
Leakfix.
* fh/transport-push-leakfix:
transport: free local and remote refs in transport_push()
transport: unify return values and exit point from transport_push()
transport: remove unnecessary indenting in transport_push()
Update the GitHub workflow support to make it quicker to get to the
failing test.
* js/ci-github-workflow-markup:
ci: call `finalize_test_case_output` a little later
ci(github): mention where the full logs can be found
ci: use `--github-workflow-markup` in the GitHub workflow
ci(github): avoid printing test case preamble twice
ci(github): skip the logs of the successful test cases
ci: optionally mark up output in the GitHub workflow
ci/run-build-and-tests: add some structure to the GitHub workflow output
ci: make it easier to find failed tests' logs in the GitHub workflow
ci/run-build-and-tests: take a more high-level view
test(junit): avoid line feeds in XML attributes
tests: refactor --write-junit-xml code
ci: fix code style
Plug the memory leaks from the trickiest API of all, the revision
walker.
* ab/plug-leak-in-revisions: (27 commits)
revisions API: add a TODO for diff_free(&revs->diffopt)
revisions API: have release_revisions() release "topo_walk_info"
revisions API: have release_revisions() release "date_mode"
revisions API: call diff_free(&revs->pruning) in revisions_release()
revisions API: release "reflog_info" in release revisions()
revisions API: clear "boundary_commits" in release_revisions()
revisions API: have release_revisions() release "prune_data"
revisions API: have release_revisions() release "grep_filter"
revisions API: have release_revisions() release "filter"
revisions API: have release_revisions() release "cmdline"
revisions API: have release_revisions() release "mailmap"
revisions API: have release_revisions() release "commits"
revisions API users: use release_revisions() for "prune_data" users
revisions API users: use release_revisions() with UNLEAK()
revisions API users: use release_revisions() in builtin/log.c
revisions API users: use release_revisions() in http-push.c
revisions API users: add "goto cleanup" for release_revisions()
stash: always have the owner of "stash_info" free it
revisions API users: use release_revisions() needing REV_INFO_INIT
revision.[ch]: document and move code declared around "init"
...
CMake updates.
* yw/cmake-updates:
cmake: remove (_)UNICODE def on Windows in CMakeLists.txt
cmake: add pcre2 support
cmake: fix CMakeLists.txt on Linux
In rare cases[1], wait_or_whine() cannot determine a child process's
status (and will return -1 in this case). This can cause Git to issue
trace2 child_exit events despite the fact that the child may still be
running. In pathological cases, we've seen > 80 million exit events in
our trace logs for a single child process.
Fix this by only issuing trace2 events in finish_command_in_signal() if
we get a value other than -1 from wait_or_whine(). This can lead to
missing child_exit events in such a case, but that is preferable to
duplicating events on a scale that threatens to fill the user's
filesystem with invalid trace logs.
[1]: This can happen when:
* waitpid() returns -1 and errno != EINTR
* waitpid() returns an invalid PID
* the status set by waitpid() has neither the WIFEXITED() nor
WIFSIGNALED() flags
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression reported[1] against f443246b9f (commit: convert
{pre-commit,prepare-commit-msg} hook to hook.h, 2021-12-22): Due to
using the run_process_parallel() API in the earlier 96e7225b31 (hook:
add 'run' subcommand, 2021-12-22) we'd capture the hook's stderr and
stdout, and thus lose the connection to the TTY in the case of
e.g. the "pre-commit" hook.
As a preceding commit notes GNU parallel's similar --ungroup option
also has it emit output faster. While we're unlikely to have hooks
that emit truly massive amounts of output (or where the performance
thereof matters) it's still informative to measure the overhead. In a
similar "seq" test we're now ~30% faster:
$ cat .git/hooks/seq-hook; git hyperfine -L rev origin/master,HEAD~0 -s 'make CFLAGS=-O3' './git hook run seq-hook'
#!/bin/sh
seq 100000000
Benchmark 1: ./git hook run seq-hook' in 'origin/master
Time (mean ± σ): 787.1 ms ± 13.6 ms [User: 701.6 ms, System: 534.4 ms]
Range (min … max): 773.2 ms … 806.3 ms 10 runs
Benchmark 2: ./git hook run seq-hook' in 'HEAD~0
Time (mean ± σ): 603.4 ms ± 1.6 ms [User: 573.1 ms, System: 30.3 ms]
Range (min … max): 601.0 ms … 606.2 ms 10 runs
Summary
'./git hook run seq-hook' in 'HEAD~0' ran
1.30 ± 0.02 times faster than './git hook run seq-hook' in 'origin/master'
1. https://lore.kernel.org/git/CA+dzEBn108QoMA28f0nC8K21XT+Afua0V2Qv8XkR8rAeqUCCZw@mail.gmail.com/
Reported-by: Anthony Sottile <asottile@umich.edu>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
[jc: minor fix-up to tests for consistency]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug in fd3cb0501e (remote: move static variables into
per-repository struct, 2021-11-17) where we'd free(remote->pushurl[i])
after having NULL'd out remote->pushurl. itself. We free
"remote->pushurl" in the next "for"-loop, so doing this appears to
have been a copy/paste error.
Before this change GCC 12's -fanalyzer would correctly note that we'd
dereference NULL in this case, this change fixes that:
remote.c: In function ‘remote_clear’:
remote.c:153:17: error: dereference of NULL ‘*remote.pushurl’ [CWE-476] [-Werror=analyzer-null-dereference]
153 | free((char *)remote->pushurl[i]);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[...]
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove braces that don't follow the CodingGuidelines from code added
in fd3cb0501e (remote: move static variables into per-repository
struct, 2021-11-17). A subsequent commit will edit code adjacent to
this.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the parallel execution API added in c553c72eed (run-command:
add an asynchronous parallel child processor, 2015-12-15) to support a
mode where the stdout and stderr of the processes isn't captured and
output in a deterministic order, instead we'll leave it to the kernel
and stdio to sort it out.
This gives the API same functionality as GNU parallel's --ungroup
option. As we'll see in a subsequent commit the main reason to want
this is to support stdout and stderr being connected to the TTY in the
case of jobs=1, demonstrated here with GNU parallel:
$ parallel --ungroup 'test -t {} && echo TTY || echo NTTY' ::: 1 2
TTY
TTY
$ parallel 'test -t {} && echo TTY || echo NTTY' ::: 1 2
NTTY
NTTY
Another is as GNU parallel's documentation notes a potential for
optimization. As demonstrated in next commit our results with "git
hook run" will be similar, but generally speaking this shows that if
you want to run processes in parallel where the exact order isn't
important this can be a lot faster:
$ hyperfine -r 3 -L o ,--ungroup 'parallel {o} seq ::: 10000000 >/dev/null '
Benchmark 1: parallel seq ::: 10000000 >/dev/null
Time (mean ± σ): 220.2 ms ± 9.3 ms [User: 124.9 ms, System: 96.1 ms]
Range (min … max): 212.3 ms … 230.5 ms 3 runs
Benchmark 2: parallel --ungroup seq ::: 10000000 >/dev/null
Time (mean ± σ): 154.7 ms ± 0.9 ms [User: 136.2 ms, System: 25.1 ms]
Range (min … max): 153.9 ms … 155.7 ms 3 runs
Summary
'parallel --ungroup seq ::: 10000000 >/dev/null ' ran
1.42 ± 0.06 times faster than 'parallel seq ::: 10000000 >/dev/null '
A large part of the juggling in the API is to make the API safer for
its maintenance and consumers alike.
For the maintenance of the API we e.g. avoid malloc()-ing the
"pp->pfd", ensuring that SANITIZE=address and other similar tools will
catch any unexpected misuse.
For API consumers we take pains to never pass the non-NULL "out"
buffer to an API user that provided the "ungroup" option. The
resulting code in t/helper/test-run-command.c isn't typical of such a
user, i.e. they'd typically use one mode or the other, and would know
whether they'd provided "ungroup" or not.
We could also avoid the strbuf_init() for "buffered_output" by having
"struct parallel_processes" use a static PARALLEL_PROCESSES_INIT
initializer, but let's leave that cleanup for later.
Using a global "run_processes_parallel_ungroup" variable to enable
this option is rather nasty, but is being done here to produce as
minimal of a change as possible for a subsequent regression fix. This
change is extracted from a larger initial version[1] which ends up
with a better end-state for the API, but in doing so needed to modify
all existing callers of the API. Let's defer that for now, and
narrowly focus on what we need for fixing the regression in the
subsequent commit.
It's safe to do this with a global variable because:
A) hook.c is the only user of it that sets it to non-zero, and before
we'll get any other API users we'll refactor away this method of
passing in the option, i.e. re-roll [1].
B) Even if hook.c wasn't the only user we don't have callers of this
API that concurrently invoke this parallel process starting API
itself in parallel.
As noted above "A" && "B" are rather nasty, and we don't want to live
with those caveats long-term, but for now they should be an acceptable
compromise.
1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In rare circumstances where the current git index does not carry the
last_update_token, the fsmonitor v2 hook will be invoked with an
empty string which would caused the final rendered json to be invalid.
["query", "/path/to/my/git/repository/", {
"since": ,
"fields": ["name"],
"expression": ["not", ["dirname", ".git"]]
}]
Which will left user with the following error message
> git status
failed to parse command from stdin: line 2, column 13, position 67: unexpected token near ','
Watchman: command returned no output.
Falling back to scanning...
Hide the "since" field in json query when "last_update_token" is empty.
Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After generating diffs for each range to be compared using a 'git log'
invocation, range-diff.c::read_patches looks for the "diff --git" header
in those diffs to recognize the beginning of a new change.
In a project with submodules, and with 'diff.submodule=log' set in the
config, this header is missing for the diff of a changed submodule, so
any submodule changes are quietly ignored in the range-diff.
When 'diff.submodule=diff' is set in the config, the "diff --git" header
is also missing for the submodule itself, but is shown for submodule
content changes, which can easily confuse 'git range-diff' and lead to
errors such as:
error: git apply: bad git-diff - inconsistent old filename on line 1
error: could not parse git header 'diff --git path/to/submodule/and/some/file/within
'
error: could not parse log for '@{u}..@{1}'
Force the submodule diff format to its default ("short") when invoking
'git log' to generate the patches for each range, such that submodule
changes are always detected.
Add a test, including an invocation with '--creation-factor=100' to
force the second commit in the range not to be considered a complete
rewrite, in order to verify we do indeed get the "short" format.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a commit is parsed, it pretends to have a different (possibly
empty) list of parents if there is graft information for that commit.
But there is a bug that could occur when a commit is parsed, the graft
information is updated (for example, when a shallow file is rewritten),
and the same commit is subsequently used: the parents of the commit do
not conform to the updated graft information, but the information at the
time of parsing.
This is usually not an issue, as a commit is usually introduced into the
repository at the same time as its graft information. That means that
when we try to parse that commit, we already have its graft information.
But it is an issue when fetching a shallow point directly into a
repository with submodules. The function
assign_shallow_commits_to_refs() parses all sought objects (including
the shallow point, which we are directly fetching). In update_shallow()
in fetch-pack.c, assign_shallow_commits_to_refs() is called before
commit_shallow_file(), which means that the shallow point would have
been parsed before graft information is updated. Once a commit is
parsed, it is no longer sensitive to any graft information updates. This
parsed commit is subsequently used when we do a revision walk to search
for submodules to fetch, meaning that the commit is considered to have
parents even though it is a shallow point (and therefore should be
treated as having no parents).
Therefore, whenever graft information is updated, mark the commits that
were previously grafts and the commits that are newly grafts as
unparsed.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a user facing message for a situation seen in the wild.
Translate it.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--preserve-merges` option was removed by v2.34.0. However
users may not be aware that it is also a Pull configuration option,
which is still offered by major IDE vendors such as Visual Studio.
Extend the `--preserve-merges` die message to also direct users to
the possible use of the `preserve` option in the `pull.rebase` config.
This is an additional 'belt and braces' information statement.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git would die if a "rebase --preserve-merges" was in progress.
Users could neither --quit, --abort, nor --continue the rebase.
Make the `rebase --abort` option available to allow users to remove
traces of any preserve-merges rebase, even if they had upgraded
during a rebase.
One trigger case was an unexpectedly difficult to resolve conflict, as
reported on the `git-users` group.
(https://groups.google.com/g/git-for-windows/c/3jMWbBlXXHM)
Other potential use-cases include git-experts using the portable
'Git on a stick' to help users with an older git version.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since feebd2d256 (rebase: hide --preserve-merges option, 2019-10-18)
this option is now removed as stated in the subsequent release notes.
Fix and reflow the option tip.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
575fa8a3 (read-cache: read data in a hash-independent way,
2019-02-19) added a new code to copy from the on-disk data into the
name member of the in-core cache entry, which is already done
immediately after that in a way that takes prefix-compression into
account.
Remove this code, as it is not just unnecessary, but also can be
reading beyond the on-disk data, when we are copying very long
prefix string from the previous entry.
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: rewrote the log message with Réne's findings]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `show-ref` is combined with the `--heads` or `--tags` options, it
can avoid iterating parts of a repository's references that it doesn't
care about.
But it doesn't take advantage of this potential optimization. When this
command was introduced back in 358ddb62cf (Add "git show-ref" builtin
command, 2006-09-15), `for_each_ref_in()` did exist. But since most
repositories don't have many (any?) references that aren't branches or
tags already, this makes little difference in practice.
Though for repositories with a large imbalance of branches and tags (or,
more likely in the case of server operators, many hidden references),
this can make quite a difference. Take, for example, a repository with
500,000 "hidden" references (all of the form "refs/__hidden__/N"), and
a single branch:
git commit --allow-empty -m "base" &&
seq 1 500000 | sed 's,\(.*\),create refs/__hidden__/\1 HEAD,' |
git update-ref --stdin &&
git pack-refs --all
Outputting the existence of that single branch currently takes on the
order of ~50ms on my machine. The vast majority of this time is wasted
iterating through references that we know we're going to discard.
Instead, teach `show-ref` that it can iterate just "refs/heads" and/or
"refs/tags" when given `--heads` and/or `--tags`, respectively. A few
small interesting things to note:
- When given either option, we can avoid the general-purpose
for_each_ref() call altogether, since we know that it won't give us
any references that we wouldn't filter out already.
- We can make two separate calls to `for_each_fullref_in()` (and
avoid, say, the more specialized `for_each_fullref_in_prefixes()`,
since we know that the set of references enumerated by each is
disjoint, so we'll never see the same reference appear in both
calls.
- We have to use the "fullref" variant (instead of just
`for_each_branch_ref()` and `for_each_tag_ref()`), since we expect
fully-qualified reference names to appear in `show-ref`'s output.
When either of `heads_only` or `tags_only` is set, we can eliminate the
strcmp() calls in `builtin/show-ref.c::show_ref()` altogether, since we
know that `show_ref()` will never see a non-branch or tag reference.
Unfortunately, we can't use `for_each_fullref_in_prefixes()` to enhance
`show-ref`'s pattern matching, since `show-ref` patterns match on the
_suffix_ (e.g., the pattern "foo" shows "refs/heads/foo",
"refs/tags/foo", and etc, not "foo/*").
Nonetheless, in our synthetic example above, this provides a significant
speed-up ("git" is roughly v2.36, "git.compile" is this patch):
$ hyperfine -N 'git show-ref --heads' 'git.compile show-ref --heads'
Benchmark 1: git show-ref --heads
Time (mean ± σ): 49.9 ms ± 6.2 ms [User: 45.6 ms, System: 4.1 ms]
Range (min … max): 46.1 ms … 73.6 ms 43 runs
Benchmark 2: git.compile show-ref --heads
Time (mean ± σ): 2.8 ms ± 0.4 ms [User: 1.4 ms, System: 1.2 ms]
Range (min … max): 1.3 ms … 5.6 ms 957 runs
Summary
'git.compile show-ref --heads' ran
18.03 ± 3.38 times faster than 'git show-ref --heads'
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users sometimes provide a "username:password" combination in their
plaintext URLs. Since Git stores these URLs in plaintext in the
.git/config file, this is a very insecure way of storing these
credentials. Credential managers are a more secure way of storing this
information.
System administrators might want to prevent this kind of use by users on
their machines.
Create a new "fetch.credentialsInUrl" config option and teach Git to
warn or die when seeing a URL with this kind of information. The warning
anonymizes the sensitive information of the URL to be clear about the
issue.
This change currently defaults the behavior to "allow" which does
nothing with these URLs. We can consider changing this behavior to
"warn" by default if we wish. At that time, we may want to add some
advice about setting fetch.credentialsInUrl=ignore for users who still
want to follow this pattern (and not receive the warning).
An earlier version of this change injected the logic into
url_normalize() in urlmatch.c. While most code paths that parse URLs
eventually normalize the URL, that normalization does not happen early
enough in the stack to avoid attempting connections to the URL first. By
inserting a check into the remote validation, we identify the issue
before making a connection. In the old code path, this was revealed by
testing the new t5601-clone.sh test under --stress, resulting in an
instance where the return code was 13 (SIGPIPE) instead of 128 from the
die().
However, we can reuse the parsing information from url_normalize() in
order to benefit from its well-worn parsing logic. We can use the struct
url_info that is created in that method to replace the password with
"<redacted>" in our error messages. This comes with a slight downside
that the normalized URL might look slightly different from the input URL
(for instance, the normalized version adds a closing slash). This should
not hinder users figuring out what the problem is and being able to fix
the issue.
As an attempt to ensure the parsing logic did not catch any
unintentional cases, I modified this change locally to to use the "die"
option by default. Running the test suite succeeds except for the
explicit username:password URLs used in t5550-http-fetch-dumb.sh and
t5541-http-push-smart.sh. This means that all other tested URLs did not
trigger this logic.
The tests show that the proper error messages appear (or do not
appear), but also count the number of error messages. When only warning,
each process validates the remote URL and outputs a warning. This
happens twice for clone, three times for fetch, and once for push.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A mechanism to pack unreachable objects into a "cruft pack",
instead of ejecting them into loose form to be reclaimed later, has
been introduced.
* tb/cruft-packs:
sha1-file.c: don't freshen cruft packs
builtin/gc.c: conditionally avoid pruning objects via loose
builtin/repack.c: add cruft packs to MIDX during geometric repack
builtin/repack.c: use named flags for existing_packs
builtin/repack.c: allow configuring cruft pack generation
builtin/repack.c: support generating a cruft pack
builtin/pack-objects.c: --cruft with expiration
reachable: report precise timestamps from objects in cruft packs
reachable: add options to add_unseen_recent_objects_to_traversal
builtin/pack-objects.c: --cruft without expiration
builtin/pack-objects.c: return from create_object_entry()
t/helper: add 'pack-mtimes' test-tool
pack-mtimes: support writing pack .mtimes files
chunk-format.h: extract oid_version()
pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles'
pack-mtimes: support reading .mtimes files
Documentation/technical: add cruft-packs.txt
Disable the "do not remove the directory the user started Git in"
logic when Git cannot tell where that directory is. Earlier we
refused to run in such a case.
* kl/setup-in-unreadable-worktree:
setup: don't die if realpath(3) fails on getcwd(3)
A workflow change for translators are being proposed.
* jx/l10n-workflow-change:
l10n: Document the new l10n workflow
Makefile: add "po-init" rule to initialize po/XX.po
Makefile: add "po-update" rule to update po/XX.po
po/git.pot: don't check in result of "make pot"
po/git.pot: this is now a generated file
Makefile: remove duplicate and unwanted files in FOUND_SOURCE_FILES
i18n CI: stop allowing non-ASCII source messages in po/git.pot
Makefile: have "make pot" not "reset --hard"
Makefile: generate "po/git.pot" from stable LOCALIZED_C
Makefile: sort source files before feeding to xgettext
Teach "git repack --geometric" work better with "--keep-pack" and
avoid corrupting the repository when packsize limit is used.
* tb/geom-repack-with-keep-and-max:
builtin/repack.c: ensure that `names` is sorted
t7703: demonstrate object corruption with pack.packSizeLimit
repack: respect --keep-pack with geometric repack
"sparse-checkout" learns to work well with the sparse-index
feature.
* ds/sparse-sparse-checkout:
sparse-checkout: integrate with sparse index
p2000: add test for 'git sparse-checkout [add|set]'
sparse-index: complete partial expansion
sparse-index: partially expand directories
sparse-checkout: --no-sparse-index needs a full index
cache-tree: implement cache_tree_find_path()
sparse-index: introduce partially-sparse indexes
sparse-index: create expand_index()
t1092: stress test 'git sparse-checkout set'
t1092: refactor 'sparse-index contents' test
The multi-pack-index code did not protect the packfile it is going
to depend on from getting removed while in use, which has been
corrected.
* tb/midx-race-in-pack-objects:
builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects
builtin/pack-objects.c: ensure included `--stdin-packs` exist
builtin/pack-objects.c: avoid redundant NULL check
pack-bitmap.c: check preferred pack validity when opening MIDX bitmap
Preliminary code refactoring around transport and bundle code.
* ds/bundle-uri:
bundle.h: make "fd" version of read_bundle_header() public
remote: allow relative_url() to return an absolute url
remote: move relative_url()
http: make http_get_file() external
fetch-pack: move --keep=* option filling to a function
fetch-pack: add a deref_without_lazy_fetch_extended()
dir API: add a generalized path_match_flags() function
connect.c: refactor sending of agent & object-format
Introduce a filesystem-dependent mechanism to optimize the way the
bits for many loose object files are ensured to hit the disk
platter.
* ns/batch-fsync:
core.fsyncmethod: performance tests for batch mode
t/perf: add iteration setup mechanism to perf-lib
core.fsyncmethod: tests for batch mode
test-lib-functions: add parsing helpers for ls-files and ls-tree
core.fsync: use batch mode and sync loose objects by default on Windows
unpack-objects: use the bulk-checkin infrastructure
update-index: use the bulk-checkin infrastructure
builtin/add: add ODB transaction around add_files_to_cache
cache-tree: use ODB transaction around writing a tree
core.fsyncmethod: batched disk flushes for loose-objects
bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
bulk-checkin: rename 'state' variable and separate 'plugged' boolean
Deprecate non-cone mode of the sparse-checkout feature.
* en/sparse-cone-becomes-default:
Documentation: some sparsity wording clarifications
git-sparse-checkout.txt: mark non-cone mode as deprecated
git-sparse-checkout.txt: flesh out pattern set sections a bit
git-sparse-checkout.txt: add a new EXAMPLES section
git-sparse-checkout.txt: shuffle some sections and mark as internal
git-sparse-checkout.txt: update docs for deprecation of 'init'
git-sparse-checkout.txt: wording updates for the cone mode default
sparse-checkout: make --cone the default
tests: stop assuming --no-cone is the default mode for sparse-checkout
Add a test for the regression introduced in my 9c4d58ff2c (ls-tree:
split up "fast path" callbacks, 2022-03-23) and fixed in
350296cc78 (ls-tree: `-l` should not imply recursive listing,
2022-04-04), and test for the test of ls-tree option/mode combinations
to make sure we don't have other blind spots.
The setup for these tests can be shared with those added in the
1041d58b4d (Merge branch 'tl/ls-tree-oid-only', 2022-04-04) topic, so
let's create a new t/lib-t3100.sh to help them share data.
The existing tests in "t3104-ls-tree-format.sh" didn't deal with a
submodule, which they'll now encounter with as the
setup_basic_ls_tree_data() sets one up.
This extensive testing should give us confidence that there were no
further regressions in this area. The lack of testing was noted back
in [1], but unfortunately we didn't cover that blind-spot before
9c4d58ff2c.
1. https://lore.kernel.org/git/211115.86o86lqe3c.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow-up on a preceding commit which changed all references to the
"env_array" when referring to the "struct child_process" member. These
changes are all unnecessary for the compiler, but help the code's
human readers.
All the comments that referred to "env_array" have now been updated,
as well as function names and variables that had "env_array" in their
name, they now refer to "env".
In addition the "out" name for the submodule.h prototype was
inconsistent with the function definition's use of "env_array" in
submodule.c. Both of them use "env" now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start following-up on the rename mentioned in c7c4bdeccf (run-command
API: remove "env" member, always use "env_array", 2021-11-25) of
"env_array" to "env".
The "env_array" name was picked in 19a583dc39 (run-command: add
env_array, an optional argv_array for env, 2014-10-19) because "env"
was taken. Let's not forever keep the oddity of "*_array" for this
"struct strvec", but not for its "args" sibling.
This commit is almost entirely made with a coccinelle rule[1]. The
only manual change here is in run-command.h to rename the struct
member itself and to change "env_array" to "env" in the
CHILD_PROCESS_INIT initializer.
The rest of this is all a result of applying [1]:
* make contrib/coccinelle/run_command.cocci.patch
* patch -p1 <contrib/coccinelle/run_command.cocci.patch
* git add -u
1. cat contrib/coccinelle/run_command.pending.cocci
@@
struct child_process E;
@@
- E.env_array
+ E.env
@@
struct child_process *E;
@@
- E->env_array
+ E->env
I've avoided changing any comments and derived variable names here,
that will all be done in the next commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change "BUG" output originally added in a97e4075a1 (Keep
rename/rename conflicts of intermediate merges while doing recursive
merge, 2007-03-31), and later made to say it was a "BUG" in
19c6a4f836 (merge-recursive: do not return NULL only to cause
segfault, 2010-01-21) to use the new bug() function.
This gets the same job done with slightly less code, as we won't need
to prefix lines with "BUG: ". More importantly we'll now log the full
set of messages via trace2, before this we'd only log the one BUG()
invocation.
We don't replace the last "BUG()" invocation with "BUG_if_bug()", as
in this case we're sure that we called bug() earlier, so there's no
need to make it a conditional.
While we're at it let's replace "There" with "there" in the message,
i.e. not start a message with a capital letter, per the
CodingGuidelines.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend code added in a6a8431968 (receive-pack.c: shorten the
execute_commands loop over all commands, 2015-01-07) and amended to
hard die in b6a4788586 (receive-pack.c: die instead of error in case
of possible future bug, 2015-01-07) to use the new bug() function
instead.
Let's also rename the warn_if_*() function that code is in to
BUG_if_*(), its name became outdated in b6a4788586.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the assertions added in bf3ff338a2 (parse-options: stop
abusing 'callback' for lowlevel callbacks, 2019-01-27) to use optbug()
instead of BUG(). At this point we're looping over individual options,
so if we encounter any issues we'd like to report the offending option.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we run into bugs in parse-options.c usage it's good to be able to
note all the issues we ran into before dying. This use-case is why we
have the optbug() function introduced in 1e5ce570ca (parse-options:
clearer reporting of API misuse, 2010-12-02)
Let's change this code to use the new bug() API introduced in the
preceding commit, which cuts down on the verbosity of
parse_options_check().
There are existing uses of BUG() in adjacent code that should have
been using optbug() that aren't being changed here. That'll be done in
a subsequent commit. This only changes the optbug() callers.
Since this will invoke BUG() the previous exit(128) code will be
changed, but in this case that's what we want, i.e. to have
encountering a BUG() return the specific "BUG" exit code.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a bug() function to use in cases where we'd like to indicate a
runtime BUG(), but would like to defer the BUG() call because we're
possibly accumulating more bug() callers to exhaustively indicate what
went wrong.
We already have this sort of facility in various parts of the
codebase, just in the form of ad-hoc re-inventions of the
functionality that this new API provides. E.g. this will be used to
replace optbug() in parse-options.c, and the 'error("BUG:[...]' we do
in a loop in builtin/receive-pack.c.
Unlike the code this replaces we'll log to trace2 with this new bug()
function (as with other usage.c functions, including BUG()), we'll
also be able to avoid calls to xstrfmt() in some cases, as the bug()
function itself accepts variadic sprintf()-like arguments.
Any caller to bug() can follow up such calls with BUG_if_bug(),
which will BUG() out (i.e. abort()) if there were any preceding calls
to bug(), callers can also decide not to call BUG_if_bug() and leave
the resulting BUG() invocation until exit() time. There are currently
no bug() API users that don't call BUG_if_bug() themselves after a
for-loop, but allowing for not calling BUG_if_bug() keeps the API
flexible. As the tests and documentation here show we'll catch missing
BUG_if_bug() invocations in our exit() wrapper.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the exit() wrapper added in ee4512ed48 (trace2: create new
combined trace facility, 2019-02-22) so that we'll split up the trace2
logging concerns from wanting to wrap the "exit()" function itself for
other purposes.
This makes more sense structurally, as we won't seem to conflate
non-trace2 behavior with the trace2 code. I'd previously added an
explanation for this in 368b584315 (common-main.c: call exit(), don't
return, 2021-12-07), that comment is being adjusted here.
Now the only thing we'll do if we're not using trace2 is to truncate
the "code" argument to the lowest 8 bits.
We only need to do that truncation on non-POSIX systems, but in
ee4512ed48 that "if defined(__MINGW32__)" code added in
47e3de0e79 (MinGW: truncate exit()'s argument to lowest 8 bits,
2009-07-05) was made to run everywhere. It might be good for clarify
to narrow that down by an "ifdef" again, but I'm not certain that in
the interim we haven't had some other non-POSIX systems rely the
behavior. On a POSIX system taking the lowest 8 bits is implicit, see
exit(3)[1] and wait(2)[2]. Let's leave a comment about that instead.
1. https://man7.org/linux/man-pages/man3/exit.3.html
2. https://man7.org/linux/man-pages/man2/wait.2.html
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to the HTML Standard FAQ:
“What is the DOCTYPE for modern HTML documents?
In text/html documents:
<!DOCTYPE html>
In documents delivered with an XML media type: no DOCTYPE is required
and its use is generally unnecessary. However, you may use one if you
want (see the following question). Note that the above is well-formed
XML.”
Source: [1]
Gitweb uses an XHTML 1.0 DOCTYPE:
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
While that DOCTYPE is still valid [2], it has several disadvantages:
1. It’s misleading. If an XML parser uses the DTD at the given link,
then the entities and ⋅ won’t get declared. Instead, the
parser has to use a DTD from the HTML Standard that has nothing to do
with XHTML 1.0 [2].
2. It’s obsolete. XHTML 1.0 was last revised in 2002 and was superseded in
2018 [3].
3. It’s unreliable. Gitweb uses and ⋅ but lets an external file
define them. “[…U]using entity references for characters in XML documents
is unsafe if they are defined in an external file (except for <, >,
&, ", and ').” [4]
[1]: <https://github.com/whatwg/html/blob/main/FAQ.md#what-is-the-doctype-for-modern-html-documents>
[2]: <https://html.spec.whatwg.org/multipage/xhtml.html#parsing-xhtml-documents>
[3]: <https://www.w3.org/TR/xhtml1/#xhtml>
[4]: <https://html.spec.whatwg.org/multipage/xhtml.html#writing-xhtml-documents>
Signed-off-by: Jason Yundt <jason@jasonyundt.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Branch names can't be empty, so config keys with an empty branch name,
e.g. "branch..remote", are silently ignored.
Since these config keys will never be useful, make it a fatal error when
remote.c finds a key that starts with "branch." and has an empty
subsection.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
4a2dcb1a08 (remote: die if branch is not found in repository,
2021-11-17) introduced a regression where multiple config entries with
an empty branch name, e.g.
[branch ""]
remote = foo
merge = bar
could cause Git to fail when it tries to look up branch tracking
information.
We parse the config key to get (branch name, branch name length), but
when the branch name subsection is empty, we get a bogus branch name,
e.g. "branch..remote" gives (".remote", 0). We continue to use the bogus
branch name as if it were valid, and prior to 4a2dcb1a08, this wasn't an
issue because length = 0 caused the branch name to effectively be ""
everywhere.
However, that commit handles length = 0 inconsistently when we create
the branch:
- When find_branch() is called to check if the branch exists in the
branch hash map, it interprets a length of 0 to mean that it should
call strlen on the char pointer.
- But the code path that inserts into the branch hash map interprets a
length of 0 to mean that the string is 0-length.
This results in the bug described above:
- "branch..remote" looks for ".remote" in the branch hash map. Since we
do not find it, we insert the "" entry into the hash map.
- "branch..merge" looks for ".merge" in the branch hash map. Since we
do not find it, we again try to insert the "" entry into the hash map.
However, the entries in the branch hash map are supposed to be
appended to, not overwritten.
- Since overwriting an entry is a BUG(), Git fails instead of silently
ignoring the empty branch name.
Fix the bug by removing the convenience strlen functionality, so that
0 means that the string is 0-length. We still insert a bogus branch name
into the hash map, but this will be fixed in a later commit.
Reported-by: "Ing. Martin Prantl Ph.D." <perry@ntis.zcu.cz>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 9c539d1027 (config.mak.dev: alternative
workaround to gcc 12 warning in http.c, 2022-04-15).
Let's give GCC12's "dangling-pointer" warning a second chance, as we
have a more focused workaround for this particular compiler glitch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fixes real problems noticed by gcc 12 and works around false
positives.
* js/ci-gcc-12-fixes:
dir.c: avoid "exceeds maximum object size" error with GCC v12.x
nedmalloc: avoid new compile error
compat/win32/syslog: fix use-after-realloc
As 'revert' and 'cherry-pick' share a lot of code, it is easy to
modify the behaviour of one command and inadvertently affect the
other. An earlier change to teach the '--reference' option and the
'revert.reference' configuration variable to the former was not
careful enough and 'cherry-pick --reference' wasn't rejected as an
error.
It is possible to think 'cherry-pick -x' might benefit from the
'--reference' option, but it is fundamentally different from
'revert' in at least two ways to make it questionable:
- 'revert' names a commit that is ancestor of the resulting commit,
so an abbreviated object name with human readable title is
sufficient to identify the named commit uniquely without using
the full object name. On the other hand, 'cherry-pick'
usually [*] picks a commit that is not an ancestor. It might be
even picking a private commit that never becomes part of the
public history.
- The whole commit message of 'cherry-pick' is a copy of the
original commit, and there is nothing gained to repeat only the
title part on 'cherry-picked from' message.
[*] well, you could revert and then you can pick the original that
was reverted to get back to where you were, but then you can
revert the revert to do the same thing.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git add -i" was rewritten in C some time ago and has been in
testing; the reimplementation is now exposed to general public by
default.
* js/use-builtin-add-i:
add -i: default to the built-in implementation
t2016: require the PERL prereq only when necessary
The tests that ensured merges stop when interfering local changes
are present did not make sure that local changes are preserved; now
they do.
* jc/t6424-failing-merge-preserve-local-changes:
t6424: make sure a failed merge preserves local changes
With the new http.curloptResolve configuration, the CURLOPT_RESOLVE
mechanism that allows cURL based applications to use pre-resolved
IP addresses for the requests is exposed to the scripts.
* cc/http-curlopt-resolve:
http: add custom hostname to IP address resolutions
When operating at the scale that Scalar wants to support, certain data
shapes are more likely to cause undesirable performance issues, such as
large numbers of loose objects.
By including statistics about this, `scalar diagnose` now makes it
easier to identify such scenarios.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's helpful to see if there are other crud files in the pack
directory. Let's teach the `scalar diagnose` command to gather
file size information about pack files.
While at it, also enumerate the pack files in the alternate
object directories, if any are registered.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When analyzing problems with large worktrees/repositories, it is useful
to know how close to a "full disk" situation Scalar/Git operates. Let's
include this information.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Over the course of Scalar's development, it became obvious that there is
a need for a command that can gather all kinds of useful information
that can help identify the most typical problems with large
worktrees/repositories.
The `diagnose` command is the culmination of this hard-won knowledge: it
gathers the installed hooks, the config, a couple statistics describing
the data shape, among other pieces of information, and then wraps
everything up in a tidy, neat `.zip` archive.
Note: originally, Scalar was implemented in C# using the .NET API, where
we had the luxury of a comprehensive standard library that includes
basic functionality such as writing a `.zip` file. In the C version, we
lack such a commodity. Rather than introducing a dependency on, say,
libzip, we slightly abuse Git's `archive` machinery: we write out a
`.zip` of the empty try, augmented by a couple files that are added via
the `--add-file*` options. We are careful trying not to modify the
current repository in any way lest the very circumstances that required
`scalar diagnose` to be run are changed by the `diagnose` run itself.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `scalar` command needs a Scalar enlistment for many subcommands, and
looks in the current directory for such an enlistment (traversing the
parent directories until it finds one).
These is subcommands can also be called with an optional argument
specifying the enlistment. Here, too, we traverse parent directories as
needed, until we find an enlistment.
However, if the specified directory does not even exist, or is not a
directory, we should stop right there, with an error message.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By allowing the path to be enclosed in double-quotes, we can avoid
the limitation that paths cannot contain colons.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the `--add-virtual-file=<path>:<content>` option, `git archive` now
supports use cases where relatively trivial files need to be added that
do not exist on disk.
This will allow us to generate `.zip` files with generated content,
without having to add said content to the object database and without
having to write it out to disk.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[jc: tweaked <path> handling]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pathspecs with only negative elements did not work with some
commands that pass the pathspec along to a subprocess. For
instance,
$ git add -p -- ':!*.txt'
should add everything except for paths ending in ".txt", but it gets
complaint from underlying "diff-index" and aborts.
We used to error out when a pathspec with only negative elements in
it, like the one in the above example. Later, 859b7f1d (pathspec:
don't error out on all-exclusionary pathspec patterns, 2017-02-07)
updated the logic to add an empty string as an extra element. The
intention was to let the extra element to match everything and let
the negative ones given by the user to subtract from it.
At around the same time, we were migrating from "an empty string is
a valid pathspec element that matches everything" to "either a dot
or ":/" is used to match all, and an empty string is rejected",
between d426430e (pathspec: warn on empty strings as pathspec,
2016-06-22) and 9e4e8a64 (pathspec: die on empty strings as
pathspec, 2017-06-06). I think 9e4e8a64, which happened long after
859b7f1d happened, was not careful enough to turn the empty string
859b7f1d added to either a dot or ":/".
A care should be taken as the definition of "everything" depends on
subcommand. For the purpose of "add -p", adding a "." to add
everything in the current directory is the right thing to do. But
for some other commands, ":/" (i.e. really really everything, even
things outside the current subdirectory) is the right choice.
We would break commands in a big way if we get this wrong, so add a
handful of test pieces to make sure the resulting code still
excludes the paths that are expected and includes "everything" else.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document the interaction between --add-file and --prefix by giving an
example.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In http.c, the run_active_slot() function allows the given "slot" to
make progress by calling step_active_slots() in a loop repeatedly,
and the loop is not left until the request held in the slot
completes.
Ages ago, we used to use the slot->in_use member to get out of the
loop, which misbehaved when the request in "slot" completes (at
which time, the result of the request is copied away from the slot,
and the in_use member is cleared, making the slot ready to be
reused), and the "slot" gets reused to service a different request
(at which time, the "slot" becomes in_use again, even though it is
for a different request). The loop terminating condition mistakenly
thought that the original request has yet to be completed.
Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10)
fixed this issue, uses a separate "slot->finished" member that is
set in run_active_slot() to point to an on-stack variable, and the
code that completes the request in finish_active_slot() clears the
on-stack variable via the pointer to signal that the particular
request held by the slot has completed. It also clears the in_use
member (as before that fix), so that the slot itself can safely be
reused for an unrelated request.
One thing that is not quite clean in this arrangement is that,
unless the slot gets reused, at which point the finished member is
reset to NULL, the member keeps the value of &finished, which
becomes a dangling pointer into the stack when run_active_slot()
returns. Clear the finished member before the control leaves the
function, which has a side effect of unconfusing compilers like
recent GCC 12 that is over-eager to warn against such an assignment.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix memory leaks in transport_push(), where remote_refs and local_refs
are never freed.
116 bytes in 1 blocks are definitely lost in loss record 56 of 103
at 0x484486F: malloc (vg_replace_malloc.c:381)
by 0x4938D7E: strdup (strdup.c:42)
by 0x628418: xstrdup (wrapper.c:39)
by 0x4FD454: process_capabilities (connect.c:232)
by 0x4FD454: get_remote_heads (connect.c:354)
by 0x610A38: handshake (transport.c:333)
by 0x612B02: transport_push (transport.c:1302)
by 0x4803D6: push_with_options (push.c:357)
by 0x4811D6: do_push (push.c:414)
by 0x4811D6: cmd_push (push.c:650)
by 0x405210: run_builtin (git.c:465)
by 0x405210: handle_builtin (git.c:719)
by 0x406363: run_argv (git.c:786)
by 0x406363: cmd_main (git.c:917)
by 0x404F17: main (common-main.c:56)
5,912 (388 direct, 5,524 indirect) bytes in 2 blocks are definitely lost in loss record 98 of 103
at 0x4849464: calloc (vg_replace_malloc.c:1328)
by 0x628705: xcalloc (wrapper.c:150)
by 0x5C216D: alloc_ref_with_prefix (remote.c:975)
by 0x5C232A: alloc_ref (remote.c:983)
by 0x5C232A: one_local_ref (remote.c:2299)
by 0x5C232A: one_local_ref (remote.c:2289)
by 0x5BDB03: do_for_each_repo_ref_iterator (iterator.c:418)
by 0x5B4C4F: do_for_each_ref (refs.c:1486)
by 0x5B4C4F: refs_for_each_ref (refs.c:1492)
by 0x5B4C4F: for_each_ref (refs.c:1497)
by 0x5C6ADF: get_local_heads (remote.c:2310)
by 0x612A85: transport_push (transport.c:1286)
by 0x4803D6: push_with_options (push.c:357)
by 0x4811D6: do_push (push.c:414)
by 0x4811D6: cmd_push (push.c:650)
by 0x405210: run_builtin (git.c:465)
by 0x405210: handle_builtin (git.c:719)
by 0x406363: run_argv (git.c:786)
by 0x406363: cmd_main (git.c:917)
Signed-off-by: Frantisek Hrbata <frantisek@hrbata.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It seems there is no reason to return 1 instead of -1 when push_refs()
is not set in transport vtable. Let's unify the error return values and
use the done label as a single exit point from transport_push().
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Frantisek Hrbata <frantisek@hrbata.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the big indented block for transport_push() check in transport vtable
and let's just return error immediately. Hopefully this makes the code
more readable.
Signed-off-by: Frantisek Hrbata <frantisek@hrbata.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A typical "git revert" commit uses the full title of the original
commit in its title, and starts its body of the message with:
This reverts commit 8fa7f667cf61386257c00d6e954855cc3215ae91.
This does not encourage the best practice of describing not just
"what" (i.e. "Revert X" on the title says what we did) but "why"
(i.e. and it does not say why X was undesirable).
We can instead phrase this first line of the body to be more like
This reverts commit 8fa7f667 (do this and that, 2022-04-25)
so that the title does not have to be
Revert "do this and that"
We can instead use the title to describe "why" we are reverting the
original commit.
Introduce the "--reference" option to "git revert", and also the
revert.reference configuration variable, which defaults to false, to
tweak the title and the first line of the draft commit message for
when creating a "revert" commit.
When this option is in use, the first line of the pre-filled editor
buffer becomes a comment line that tells the user to say _why_. If
the user exits the editor without touching this line by mistake,
what we prepare to become the first line of the body, i.e. "This
reverts commit 8fa7f667 (do this and that, 2022-04-25)", ends up to
be the title of the resulting commit. This behaviour is designed to
help such a user to identify such a revert in "git log --oneline"
easily so that it can be further reworded with "git rebase -i" later.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the tests that exercise implicit shutdown cases
to make them more robust and less racy.
The fsmonitor--daemon will implicitly shutdown in a variety
of situations, such as when the ".git" directory is deleted
or renamed.
The existing tests would delete or rename the directory, sleep
for one second, and then check the status of the daemon. This
is racy, since the client/status command has no way to sync
with the daemon. This was noticed occasionally on very slow
CI build machines where it would cause a random test to fail.
Replace the simple sleep with a sleep-and-retry loop.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a test in t7527 to verify that we get a stray warning from
`git fsmonitor--daemon start` when indirectly called from
`git submodule absorbgitdirs`.
Update `git fsmonitor--daemon` to take (and ignore) the `--super-prefix`
argument to suppress the warning.
When we have:
1. a submodule with a `sub/.git/` directory (rather than a `sub/.git`
file).
2. `core.fsmonitor` is turned on in the submodule, but the daemon is
not yet started in the submodule.
3. and someone does a `git submodule absorbgitdirs` in the super.
Git will recursively invoke `git submodule--helper absorb-git-dirs`
in the submodule. This will read the index and may attempt to start
the fsmonitor--daemon with the `--super-prefix` argument.
`git fsmonitor--daemon start` does not accept the `--super-prefix`
argument and causes a warning to be issued.
This does not cause a problem because the `refresh_index()` code
assumes a trivial response if the daemon does not start.
The net-net is a harmelss, but stray warning. Lets eliminate the
warning.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Confirm that the daemon reports events using the on-disk
spelling for Unicode NFC/NFD characters. On APFS we still
have Unicode aliasing, so we cannot create two files that
only differ by NFC/NFD, but the on-disk format preserves
the spelling used to create the file. On HFS+ we also
have aliasing, but the path is always stored on disk in
NFD.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a set of prereqs to help understand how file names
are handled by the filesystem when they contain NFC and NFD
Unicode characters.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Emit NFC or NFC and NFD spellings of pathnames on macOS.
MacOS is Unicode composition insensitive, so NFC and NFD spellings are
treated as aliases and collide. While the spelling of pathnames in
filesystem events depends upon the underlying filesystem, such as
APFS, HFS+ or FAT32, the OS enforces such collisions regardless of
filesystem.
Teach the daemon to always report the NFC spelling and to report
the NFD spelling when stored in that format on the disk.
This is slightly more general than "core.precomposeUnicode".
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test that FS events from the OS are received using the preserved,
on-disk spelling of files/directories rather than spelling used
to make the change.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Never set CE_FSMONITOR_VALID on the cache-entry of submodule
directories.
During a client command like 'git status', we may need to recurse
into each submodule to compute a status summary for the submodule.
Since the purpose of the ce_flag is to let Git avoid scanning a
cache-entry, setting the flag causes the recursive call to be
avoided and we report incorrect (no status) for the submodule.
We created an OS watch on the root directory of our working
directory and we receive events for everything in the cone
under it. When submodules are present inside our working
directory, we receive events for both our repo (the super) and
any subs within it. Since our index doesn't have any information
for items within the submodules, we can't use those events.
We could try to truncate the paths of those events back to the
submodule boundary and mark the GITLINK as dirty, but that
feels expensive since we would have to prefix compare every FS
event that we receive against a list of submodule roots. And
it still wouldn't be sufficient to correctly report status on
the submodule, since we don't have any space in the cache-entry
to cache the submodule's status (the 'SCMU' bits in porcelain
V2 speak). That is, the CE_FSMONITOR_VALID bit just says that
we don't need to scan/inspect it because we already know the
answer -- it doesn't say that the item is clean -- and we
don't have space in the cache-entry to store those answers.
So we should always do the recursive scan.
Therefore, we should never set the flag on GITLINK cache-entries.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create unit tests to move a directory. Verify that `git status`
gives the same result with and without FSMonitor enabled.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach Git to perform binary search over the cache-entries for a directory
notification and then linearly scan forward to find the immediate children.
Previously, when the FSMonitor reported a modified directory Git would
perform a linear search on the entire cache-entry array for all
entries matching that directory prefix and invalidate them. Since the
cache-entry array is already sorted, we can use a binary search to
find the first matching entry and then only linearly walk forward and
invalidate entries until the prefix changes.
Also, the original code would invalidate anything having the same
directory prefix. Since a directory event should only be received for
items that are immediately within the directory (and not within
sub-directories of it), only invalidate those entries and not the
whole subtree.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the listener thread to shutdown the daemon if the spelling of the
worktree root directory changes.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Force shutdown fsmonitor daemon if the worktree root directory
is moved, renamed, or deleted.
Use Windows low-level GetFileInformationByHandle() to get and
compare the Windows system unique ID for the directory with a
cached version when we started up. This lets us detect the
case where someone renames the directory that we are watching
and then creates a new directory with the original pathname.
This is important because we are listening to a named pipe for
requests and they are stored in the Named Pipe File System (NPFS)
which a kernel-resident pseudo filesystem not associated with
the actual NTFS directory.
For example, if the daemon was watching "~/foo/", it would have
a directory-watch handle on that directory and a named-pipe
handle for "//./pipe/...foo". Moving the directory to "~/bar/"
does not invalidate the directory handle. (So the daemon would
actually be watching "~/bar" but listening on "//./pipe/...foo".
If the user then does "git init ~/foo" and causes another daemon
to start, the first daemon will still have ownership of the pipe
and the second daemon instance will fail to start. "git status"
clients in "~/foo" will ask "//./pipe/...foo" about changes and
the first daemon instance will tell them about "~/bar".
This commit causes the first daemon to shutdown if the system unique
ID for "~/foo" changes (changes from what it was when the daemon
started). Shutdown occurs after a periodic poll. After the
first daemon exits and releases the lock on the named pipe,
subsequent Git commands may cause another daemon to be started
on "~/foo". Similarly, a subsequent Git command may cause another
daemon to be started on "~/bar".
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the Windows version of the "health" thread to periodically
inspect the system and shutdown if warranted.
This commit updates the thread's wait loop to use a timeout and
defines a (currently empty) table of functions to poll the system.
A later commit will add functions to the table to actually
inspect the system.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create another thread to watch over the daemon process and
automatically shut it down if necessary.
This commit creates the basic framework for a "health" thread
to monitor the daemon and/or the file system. Later commits
will add platform-specific code to do the actual work.
The "health" thread is intended to monitor conditions that
would be difficult to track inside the IPC thread pool and/or
the file system listener threads. For example, when there are
file system events outside of the watched worktree root or if
we want to have an idle-timeout auto-shutdown feature.
This commit creates the health thread itself, defines the thread-proc
and sets up the thread's event loop. It integrates this new thread
into the existing IPC and Listener thread models.
This commit defines the API to the platform-specific code where all of
the monitoring will actually happen.
The platform-specific code for MacOS is just stubs. Meaning that the
health thread will immediately exit on MacOS, but that is OK and
expected. Future work can define MacOS-specific monitoring.
The platform-specific code for Windows sets up enough of the
WaitForMultipleObjects() machinery to watch for system and/or custom
events. Currently, the set of wait handles only includes our custom
shutdown event (sent from our other theads). Later commits in this
series will extend the set of wait handles to monitor other
conditions.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename platform-specific listener thread related variables
and data types as we prepare to add another backend thread
type.
[] `struct fsmonitor_daemon_backend_data` becomes `struct fsm_listen_data`
[] `state->backend_data` becomes `state->listen_data`
[] `state->error_code` becomes `state->listen_error_code`
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor daemon thread startup to make it easier to start
a third thread class to monitor the health of the daemon.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the fsmonitor--daemon to CD outside of the worktree
before starting up.
The common Git startup mechanism causes the CWD of the daemon process
to be in the root of the worktree. On Windows, this causes the daemon
process to hold a locked handle on the CWD and prevents other
processes from moving or deleting the worktree while the daemon is
running.
CD to HOME before entering main event loops.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ignore FSEvents resulting from `xattr` changes. Git does not care about
xattr's or changes to xattr's, so don't waste time collecting these
events in the daemon nor transmitting them to clients.
Various security tools add xattrs to files and/or directories, such as
to mark them as having been downloaded. We should ignore these events
since it doesn't affect the content of the file/directory or the normal
meta-data that Git cares about.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Initialize `o->result.fsmonitor_has_run_once` based upon value
in `o->src_index->fsmonitor_has_run_once` to prevent a second
fsmonitor query during the tree traversal and possibly getting
a skewed view of the working directory.
The checkout code has already talked to the fsmonitor and the
traversal is updating the index as it traverses, so there is
no need to query the fsmonitor.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On MacOS mark repos on NTFS or FAT32 volumes as incompatible.
The builtin FSMonitor used Unix domain sockets on MacOS for IPC
with clients. These sockets are kept in the .git directory.
Unix sockets are not supported by NTFS and FAT32, so the daemon
cannot start up.
Test for this during our compatibility checking so that client
commands do not keep trying to start the daemon.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach Git to detect remote working directories on Windows and mark them as
incompatible with FSMonitor.
With this `git fsmonitor--daemon run` will error out with a message like it
does for bare repos.
Client commands, such as `git status`, will not attempt to start the daemon.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach Git to detect remote working directories on macOS and mark them as
incompatible with FSMonitor.
With this, `git fsmonitor--daemon run` will error out with a message
like it does for bare repos.
Client commands, like `git status`, will not attempt to start the daemon.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
VFS for Git virtual repositories are incompatible with FSMonitor.
VFS for Git is a downstream fork of Git. It contains its own custom
file system watcher that is aware of the virtualization. If a working
directory is being managed by VFS for Git, we should not try to watch
it because we may get incomplete results.
We do not know anything about how VFS for Git works, but we do
know that VFS for Git working directories contain a well-defined
config setting. If it is set, mark the working directory as
incompatible.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend generic incompatibility checkout with platform-specific
mechanism. Stub in Win32 version.
In the existing fsmonitor-settings code we have a way to mark
types of repos as incompatible with fsmonitor (whether via the
hook and IPC APIs). For example, we do this for bare repos,
since there are no files to watch.
Extend this exclusion mechanism for platform-specific reasons.
This commit just creates the framework and adds a stub for Win32.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bare repos do not have a worktree, so there is nothing for the
daemon watch.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a stress test to hammer on the fsmonitor daemon.
Create a client-side thread pool of n threads and have
each of them make m requests as fast as they can.
We do not currently inspect the contents of the response.
We're only interested in placing a heavy request load on
the daemon.
This test is useful for interactive testing and various
experimentation. For example, to place additional load
on the daemon while another test is running. We currently
do not have a test script that actually uses this helper.
We might add such a test in the future.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create some test repos with UTF8 characters in the pathname of the
root directory and verify that the builtin FSMonitor can watch them.
This test is mainly for Windows where we need to avoid `*A()`
routines.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach FSMonitor daemon on Windows to recognize shortname paths as
aliases of normal longname paths. FSMonitor clients, such as `git
status`, should receive the longname spelling of changed files (when
possible).
Sometimes we receive FS events using the shortname, such as when a CMD
shell runs "RENAME GIT~1 FOO" or "RMDIR GIT~1". The FS notification
arrives using whatever combination of long and shortnames were used by
the other process. (Shortnames do seem to be case normalized,
however.)
Use Windows GetLongPathNameW() to try to map the pathname spelling in
the notification event into the normalized longname spelling. (This
can fail if the file/directory is deleted, moved, or renamed, because
we are asking the FS for the mapping in response to the event and
after it has already happened, but we try.)
Special case the shortname spelling of ".git" to avoid under-reporting
these events.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't bother to freshen objects stored in a cruft pack individually
by updating the `.mtimes` file. This is because we can't portably `mmap`
and write into the middle of a file (i.e., to update the mtime of just
one object). Instead, we would have to rewrite the entire `.mtimes` file
which may incur some wasted effort especially if there a lot of cruft
objects and they are freshened infrequently.
Instead, force the freshening code to avoid an optimizing write by
writing out the object loose and letting it pick up a current mtime.
This works because we prefer the mtime of the loose copy of an object
when both a loose and packed one exist (whether or not the packed copy
comes from a cruft pack or not).
This could certainly do with a test and/or be included earlier in this
series/PR, but I want to wait until after I have a chance to clean up
the overly-repetitive nature of the cruft pack tests in general.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose the new `git repack --cruft` mode from `git gc` via a new opt-in
flag. When invoked like `git gc --cruft`, `git gc` will avoid exploding
unreachable objects as loose ones, and instead create a cruft pack and
`.mtimes` file.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using cruft packs, the following race can occur when a geometric
repack that writes a MIDX bitmap takes place afterwords:
- First, create an unreachable object and do an all-into-one cruft
repack which stores that object in the repository's cruft pack.
- Then make that object reachable.
- Finally, do a geometric repack and write a MIDX bitmap.
Assuming that we are sufficiently unlucky as to select a commit from the
MIDX which reaches that object for bitmapping, then the `git
multi-pack-index` process will complain that that object is missing.
The reason is because we don't include cruft packs in the MIDX when
doing a geometric repack. Since the "make that object reachable" doesn't
necessarily mean that we'll create a new copy of that object in one of
the packs that will get rolled up as part of a geometric repack, it's
possible that the MIDX won't see any copies of that now-reachable
object.
Of course, it's desirable to avoid including cruft packs in the MIDX
because it causes the MIDX to store a bunch of objects which are likely
to get thrown away. But excluding that pack does open us up to the above
race.
This patch demonstrates the bug, and resolves it by including cruft
packs in the MIDX even when doing a geometric repack.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use the `util` pointer for items in the `existing_packs` string list
to indicate which packs are going to be deleted. Since that has so far
been the only use of that `util` pointer, we just set it to 0 or 1.
But we're going to add an additional state to this field in the next
patch, so prepare for that by adding a #define for the first bit so we
can more expressively inspect the flags state.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In servers which set the pack.window configuration to a large value, we
can wind up spending quite a lot of time finding new bases when breaking
delta chains between reachable and unreachable objects while generating
a cruft pack.
Introduce a handful of `repack.cruft*` configuration variables to
control the parameters used by pack-objects when generating a cruft
pack.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose a way to split the contents of a repository into a main and cruft
pack when doing an all-into-one repack with `git repack --cruft -d`, and
a complementary configuration variable.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a previous patch, pack-objects learned how to generate a cruft pack
so long as no objects are dropped.
This patch teaches pack-objects to handle the case where a non-never
`--cruft-expiration` value is passed. This case is slightly more
complicated than before, because we want pack-objects to save
unreachable objects which would have been pruned when there is another
recent (i.e., non-prunable) unreachable object which reaches the other.
We'll call these objects "unreachable but reachable-from-recent".
Here is how pack-objects handles `--cruft-expiration`:
- Instead of adding all objects outside of the kept pack(s) into the
packing list, only handle the ones whose mtime is within the grace
period.
- Construct a reachability traversal whose tips are the
unreachable-but-recent objects.
- Then, walk along that traversal, stopping if we reach an object in
the kept pack. At each step along the traversal, we add the object
we are visiting to the packing list.
In the majority of these cases, any object we visit in this traversal
will already be in our packing list. But we will sometimes encounter
reachable-from-recent cruft objects, which we want to retain even if
they aged out of the grace period.
The most subtle point of this process is that we actually don't need to
bother to update the rescued object's mtime. Even though we will write
an .mtimes file with a value that is older than the expiration window,
it will continue to survive cruft repacks so long as any objects which
reach it haven't aged out.
That is, a future repack will also exclude that object from the initial
packing list, only to discover it later on when doing the reachability
traversal.
Finally, stopping early once an object is found in a kept pack is safe
to do because the kept packs ordinarily represent which packs will
survive after repacking. Assuming that it _isn't_ safe to halt a
traversal early would mean that there is some ancestor object which is
missing, which implies repository corruption (i.e., the complete set of
reachable objects isn't present).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When generating a cruft pack, the caller within pack-objects will want
to know the precise timestamps of cruft objects (i.e., their
corresponding values in the .mtimes table) rather than the mtime of the
cruft pack itself.
Teach add_recent_packed() to lookup each object's precise mtime from the
.mtimes file if one exists (indicated by the is_cruft bit on the
packed_git structure).
A couple of small things worth noting here:
- load_pack_mtimes() needs to be called before asking for
nth_packed_mtime(), and that call is done lazily here. That function
exits early if the .mtimes file has already been opened and parsed,
so only the first call is slow.
- Checking the is_cruft bit can be done without any extra work on the
caller's behalf, since it is set up for us automatically as a
side-effect of calling add_packed_git() (just like the 'pack_keep'
and 'pack_promisor' bits).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function behaves very similarly to what we will need in
pack-objects in order to implement cruft packs with expiration. But it
is lacking a couple of things. Namely, it needs:
- a mechanism to communicate the timestamps of individual recent
objects to some external caller
- and, in the case of packed objects, our future caller will also want
to know the originating pack, as well as the offset within that pack
at which the object can be found
- finally, it needs a way to skip over packs which are marked as kept
in-core.
To address the first two, add a callback interface in this patch which
reports the time of each recent object, as well as a (packed_git,
off_t) pair for packed objects.
Likewise, add a new option to the packed object iterators to skip over
packs which are marked as kept in core. This option will become
implicitly tested in a future patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach `pack-objects` how to generate a cruft pack when no objects are
dropped (i.e., `--cruft-expiration=never`). Later patches will teach
`pack-objects` how to generate a cruft pack that prunes objects.
When generating a cruft pack which does not prune objects, we want to
collect all unreachable objects into a single pack (noting and updating
their mtimes as we accumulate them). Ordinary use will pass the result
of a `git repack -A` as a kept pack, so when this patch says "kept
pack", readers should think "reachable objects".
Generating a non-expiring cruft packs works as follows:
- Callers provide a list of every pack they know about, and indicate
which packs are about to be removed.
- All packs which are going to be removed (we'll call these the
redundant ones) are marked as kept in-core.
Any packs the caller did not mention (but are known to the
`pack-objects` process) are also marked as kept in-core. Packs not
mentioned by the caller are assumed to be unknown to them, i.e.,
they entered the repository after the caller decided which packs
should be kept and which should be discarded.
Since we do not want to include objects in these "unknown" packs
(because we don't know which of their objects are or aren't
reachable), these are also marked as kept in-core.
- Then, we enumerate all objects in the repository, and add them to
our packing list if they do not appear in an in-core kept pack.
This results in a new cruft pack which contains all known objects that
aren't included in the kept packs. When the kept pack is the result of
`git repack -A`, the resulting pack contains all unreachable objects.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new caller in the next commit will want to immediately modify the
object_entry structure created by create_object_entry(). Instead of
forcing that caller to wastefully look-up the entry we just created,
return it from create_object_entry() instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch, we will implement and test support for writing a
cruft pack via a special mode of `git pack-objects`. To make sure that
objects are written with the correct timestamps, and a new test-tool
that can dump the object names and corresponding timestamps from a given
`.mtimes` file.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the `.mtimes` format is defined, supplement the pack-write API
to be able to conditionally write an `.mtimes` file along with a pack by
setting an additional flag and passing an oidmap that contains the
timestamps corresponding to each object in the pack.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are three definitions of an identical function which converts
`the_hash_algo` into either 1 (for SHA-1) or 2 (for SHA-256). There is a
copy of this function for writing both the commit-graph and
multi-pack-index file, and another inline definition used to write the
.rev header.
Consolidate these into a single definition in chunk-format.h. It's not
clear that this is the best header to define this function in, but it
should do for now.
(Worth noting, the .rev caller expects a 4-byte unsigned, but the other
two callers work with a single unsigned byte. The consolidated version
uses the latter type, and lets the compiler widen it when required).
Another caller will be added in a subsequent patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This structure will be used to communicate the per-object mtimes when
writing a cruft pack. Here, we need the full packing_data structure
because the mtime information is stored in an array there, not on the
individual object_entry's themselves (to avoid paying the overhead in
structure width for operations which do not generate a cruft pack).
We haven't passed this information down before because one of the two
callers (in bulk-checkin.c) does not have a packing_data structure at
all. In that case (where no cruft pack will be generated), NULL is
passed instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To store the individual mtimes of objects in a cruft pack, introduce a
new `.mtimes` format that can optionally accompany a single pack in the
repository.
The format is defined in Documentation/technical/pack-format.txt, and
stores a 4-byte network order timestamp for each object in name (index)
order.
This patch prepares for cruft packs by defining the `.mtimes` format,
and introducing a basic API that callers can use to read out individual
mtimes.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git remote -v" now shows the list-objects-filter used during
fetching from the remote, if available.
* ac/remote-v-with-object-list-filters:
builtin/remote.c: teach `-v` to list filters for promisor remotes
With a recent update to refuse access to repositories of other
people by default, "sudo make install" and "sudo git describe"
stopped working. This series intends to loosen it while keeping
the safety.
* cb/path-owner-check-with-sudo:
t0034: add negative tests and allow git init to mostly work under sudo
git-compat-util: avoid failing dir ownership checks if running privileged
t: regression git needs safe.directory when using sudo
"git -c branch.autosetupmerge=simple branch $A $B" will set the $B
as $A's upstream only when $A and $B shares the same name, and "git
-c push.default=simple" on branch $A would push to update the
branch $A at the remote $B came from. Also more places use the
sole remote, if exists, before defaulting to 'origin'.
* tk/simple-autosetupmerge:
push: new config option "push.autoSetupRemote" supports "simple" push
push: default to single remote even when not named origin
branch: new autosetupmerge option 'simple' for matching branches
Change the "flow" of how translators interact with the l10n repository
at [1] to adjust it for a new workflow of not having a po/git.pot file
in-tree at all, and to not commit line numbers to the po/*.po files
that we do track in tree.
The current workflow was added in a combination of dce37b66fb (l10n:
initial git.pot for 1.7.10 upcoming release, 2012-02-13) and
271ce198cd (Update l10n guide, 2012-02-29).
As noted in preceding commits I think that it came about due to
technical debt I'd left behind in how the "po/git.pot" file was
created, and a mis-impression that the file:line comments were needed
as anything more than a transitory translation aid.
As the updated po/README.md shows the new workflow is substantially
the same, the difference is that translators no longer need to
initially pull from the l10n coordinator for a new po/git.pot, they
can simply use git.git's canonical source repository.
The l10n coordinator is still expected to announce a release to
translate, which presumably would always be Junio's latest release
tag. I'm not certain if this part of the process is actually
important. I.e. the delta translation-wise between that tag and
"master" is usually pretty small, so perhaps translators can just work
on "master" instead.
1. https://github.com/git-l10n/git-po/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core translation is the minimum set of work that must be done for a
new language translation.
There are over 5000 messages in the template message file "po/git.pot"
that need to be translated. It is not a piece of cake for such a huge
workload. So we used to define a small set of messages called "core
translation" that a new l10n contributor must complete before sending
pull request to the l10n coordinator.
By pulling in some parts of the git-po-helper[^1] logic, we add a new
rule to create this core translation message "po/git-core.pot":
make po/git-core.pot
To help new l10n contributors to initialized their "po/XX.pot" from
"po/git-core.pot", we also add new rules "po-init":
make po-init PO_FILE=po/XX.po
[^1]: https://github.com/git-l10n/git-po-helper/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since there is no longer a "po/git.pot" file in tree, a l10n team leader
has to run several commands to update their "po/XX.po" file:
$ make pot
$ msgmerge --add-location --backup=off -U po/XX.po po/git.pot
To make this process easier, add a new rule so that l10n team leaders
can update their "po/XX.po" with one command. E.g.:
$ make po-update PO_FILE=po/zh_CN.po
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the "po/git.pot" file from being tracked, which started with
dce37b66fb (l10n: initial git.pot for 1.7.10 upcoming release,
2012-02-13).
The reason the po/git.pot started being checked in was because the
po/*.po files were changed a schema where we'd generate them from a
known-good snapshot of po/git.pot, instead of each translator running
"make pot" themselves.
This makes sense, but we don't need to carry this file in-tree just to
achieve that aim, and doing so has resulted in a significant amount of
"diff churn" since this method of doing it was introduced:
$ git log -p --oneline -- po/git.pot|wc -l
553743
We can instead let l10n contributors to generate "po/git.pot" in runtime
to update their own "po/XX.po", and the l10n coordinator can check
pull requests using CI pipeline.
This reverts to the schema introduced initially in cd5513a716 (i18n:
Makefile: "pot" target to extract messages marked for translation,
2011-02-22).
The actual "git rm" of po/git.pot was in preceding commit to make this
change easier to review, and to preempt the mailing list from blocking
it due to it being too large.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We get source files saved in "$(FOUND_SOURCE_FILES)" by running the
command "git ls-files" or the command "find". We tried to have the
both commands return the same list of files, but apparently the "find"
command will return more files, such as the generated headers. We can
filter out these generated headers to get closer results.
In addition to this, "$(FOUND_SOURCE_FILES)" may contain duplicate
files. E.g. "git-ls-files" may have duplicate entries for the same file
in different staging areas if there are unresolved conflicts in the
working tree. For this case, we can reduce duplicate entries by passing
the option "--deduplicate" to git-ls-files.
Junio reported that when running "make" in a working tree with
unresolved conflicts, "make" may report warnings like below:
Makefile:xxxx: target '.build/pot/po/FOO.c.po' given more than once
in the same rule
The duplicate targets are introduced by the following pattern rule we
added in the preceding commit for incremental build of "po/git.pot".
$(LOCALIZED_C_GEN_PO): .build/pot/po/%.po: %
Although we have resolved this issue by sorting to create a unique
$(LOCALIZED_C), other targets may benefit from this. Such as: tags,
cscope.out, etc.
Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commit we moved away from using xgettext(1) to both
generate the po/git.pot, and to merge the incrementally generated
po/git.pot+ file as we sourced translations from C, shell and Perl.
Doing it this way, which dates back to my initial
implementation[1][2][3] was conflating two things: With xgettext(1)
the --from-code both controls what encoding is specified in the
po/git.pot's header, and what encoding we allow in source messages.
We don't ever want to allow non-ASCII in *source messages*, and doing
so has hid e.g. a buggy message introduced in
a6226fd772 (submodule--helper: convert the bulk of cmd_add() to C,
2021-08-10) from us, we'd warn about it before, but only when running
"make pot", but the operation would still succeed. Now we'll error out
on it when running "make pot".
Since the preceding Makefile changes made this easy: let's add a "make
check-pot" target with the same prerequisites as the "po/git.pot"
target, but without changing the file "po/git.pot". Running it as part
of the "static-analysis" CI target will ensure that we catch any such
issues in the future. E.g.:
$ make check-pot
XGETTEXT .build/pot/po/builtin/submodule--helper.c.po
xgettext: Non-ASCII string at builtin/submodule--helper.c:3381.
Please specify the source encoding through --from-code.
make: *** [.build/pot/po/builtin/submodule--helper.c.po] Error 1
1. cd5513a716 (i18n: Makefile: "pot" target to extract messages
marked for translation, 2011-02-22)
2. adc3b2b276 (Makefile: add xgettext target for *.sh files,
2011-05-14)
3. 5e9637c629 (i18n: add infrastructure for translating Git with
gettext, 2011-11-18)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before commit fc0fd5b23b (Makefile: help gettext tools to cope with our
custom PRItime format, 2017-07-20), we'd consider source files as-is
with gettext, but because we need to understand PRItime in the same way
that gettext itself understands PRIuMAX, we'd first check if we had a
clean checkout, then munge all of the processed files in-place with
"sed", generate "po/git.pot", and then finally "reset --hard" to undo
our changes.
By generating "pot" snippets in ".build/pot/po" for each source file
and rewriting certain source files with PRItime macros to temporary
files in ".build/pot/po", we can avoid running "make pot" by altering
files in place and doing a "reset --hard" afterwards.
This speed of "make pot" is slower than before on an initial run,
because we run "xgettext" many times (once per source file), but it
can be boosted by parallelization. It is *much* faster for incremental
runs, and will allow us to implement related targets in subsequent
commits.
When the "pot" target was originally added in cd5513a716 (i18n:
Makefile: "pot" target to extract messages marked for translation,
2011-02-22) it behaved like a "normal" target. I.e. we'd skip the
re-generation of the po/git.pot if nothing had to be done.
Then after po/git.pot was checked in in dce37b66fb (l10n: initial
git.pot for 1.7.10 upcoming release, 2012-02-13) the target was broken
until 1f31963e92 (i18n: treat "make pot" as an explicitly-invoked
target, 2014-08-22) when it was made to depend on "FORCE". I.e. the
Makefile's dependency resolution inherently can't handle incremental
building when the target file may be updated by git (or something else
external to "make"). But this case no longer applies, so FORCE is no
longer needed.
That out of the way, the main logic change here is getting rid of the
"reset --hard":
We'll generate intermediate ".build/pot/po/%.po" files from "%", which
is handy to see at a glance what strings (if any) in a given file are
marked for translation:
$ make .build/pot/po/pretty.c.po
[...]
$ cat .build/pot/po/pretty.c.po
#: pretty.c:1051
msgid "unable to parse --pretty format"
msgstr ""
$
For these C source files which contain the PRItime macros, we will
create temporary munged "*.c" files in a tree in ".build/pot/po"
corresponding to our source tree, and have "xgettext" consider those.
The rule needs to be careful to "(cd .build/pot/po && ...)", because
otherwise the comments in the po/git.pot file wouldn't refer to the
correct source locations (they'd be prefixed with ".build/pot/po").
These temporary munged "*.c” files will be removed immediately after
the corresponding po files are generated, because some development tools
cannot ignore the duplicate source files in the ".build" directory
according to the ".gitignore" file, and that may cause trouble.
The output of the generated po/git.pot file is changed in one minor
way: Because we're using msgcat(1) instead of xgettext(1) to
concatenate the output we'll now disambiguate where "TRANSLATORS"
comments come from, in cases where a message is the same in N files,
and either only one has a "TRANSLATORS" comment, or they're
different. E.g. for the "Your edited hunk[...]" message we'll now
apply this change (comment content elided):
+#. #-#-#-#-# add-patch.c.po #-#-#-#-#
#. TRANSLATORS: do not translate [y/n]
[...]
+#. #-#-#-#-# git-add--interactive.perl.po #-#-#-#-#
#. TRANSLATORS: do not translate [y/n]
[...]
#: add-patch.c:1253 git-add--interactive.perl:1244
msgid ""
"Your edited hunk does not apply. Edit again (saying \"no\" discards!) [y/n]? "
msgstr ""
There are six such changes, and they all make the context more
understandable, as msgcat(1) is better at handling these edge cases
than xgettext(1)'s previously used "--join-existing" flag.
But filenames in the above disambiguation lines of extracted-comments
have an extra ".po" extension compared to the filenames at the file
locations. While we could rename the intermediate ".build/pot/po/%.po"
files without the ".po" extension to use more intuitive filenames in
the disambiguation lines of extracted-comments, but that will confuse
developer tools with lots of invalid C or other source files in
".build/pot/po" directory.
The addition of "--omit-header" option for xgettext makes the "pot"
snippets in ".build/pot/po/*.po" smaller. But as we'll see in a
subsequent commit this header behavior has been hiding an
encoding-related bug from us, so let's carry it forward instead of
re-generating it with xgettext(1).
The "po/git.pot" file should have a header entry, because a proper
header entry will increase the speed of creating a new po file using
msginit and set a proper "POT-Creation-Date:" field in the header
entry of a "po/XX.po" file. We use xgettext to generate a separate
header file at ".build/pot/git.header" from "/dev/null", and use this
header to assemble "po/git.pot".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Different users may generate a different message template file
"po/git.pot". This is because the POT file is generated from
"$(LOCALIZED_C)", which is supposed to list all the sources that we
extract the strings to be translated from. But "$(LOCALIZED_C)"
includes "$(C_OBJ)", which only lists the source files used in the
current build for a specific platform and specific compiler
conditions.
Instead of using "$(C_OBJ)", we use "$(FOUND_C_SOURCES)", which lists
all source files we keep track of (or ship in a tarball extract), to
form a stable "LOCALIZED_C". We also add "$(SCALAR_SOURCES)", which
is part of "$(C_OBJ)" but not included in "$(FOUND_C_SOURCES)".
With this update, the newly generated "po/git.pot" will have 30 new
entries coming from the following C source files:
* compat/fsmonitor/fsm-listen-win32.c
* compat/mingw.c
* compat/regex/regcomp.c
* compat/simple-ipc/ipc-win32.c
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We will feed xgettext with more C source files and in different order
in subsequent commit. To generate a stable "po/git.pot" regardless of
the number and order of input source files, we sort the c, perl, and
shell source files in groups before feeding them to xgettext.
Ævar suggested that we should not pass the option "--sort-by-file" to
xgettext to sort the translatable strings, as it will mix the three
groups of source files (c, perl and shell) in the file "po/git.pot",
and change the order of translatable strings in the same line of a file.
With this update, the newly generated "po/git.pot" will have the same
entries while in a different order.
With the help of a custom diff driver as shown below,
git config --global diff.gettext-fmt.textconv \
"msgcat --no-location --sort-by-file"
and appending a new entry "*.pot diff=gettext-fmt" to git attributes,
we can see that there are no substantial changes in "po/git.pot".
We won't checkin the newly generated "po/git.pot", because we will
remove it from tree in a later commit.
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch --recurse-submodules" from multiple remotes (either from
a remote group, or "--all") used to make one extra "git fetch" in
the submodules, which has been corrected.
* jc/avoid-redundant-submodule-fetch:
fetch: do not run a redundant fetch from submodule
The way "git fetch" without "--update-head-ok" ensures that HEAD in
no worktree points at any ref being updated was too wasteful, which
has been optimized a bit.
* os/fetch-check-not-current-branch:
fetch: limit shared symref check only for local branches
Documentation update.
* pb/ggg-in-mfc-doc:
MyFirstContribution: drop PR description for GGG single-patch contributions
MyFirstContribution: reference "The cover letter" in GitGitGadget section
MyFirstContribution: reference "The cover letter" in "Preparing Email"
MyFirstContribution: add standalone section on cover letter
MyFirstContribution: add "Anatomy of a Patch Series" section
"git fetch" unnecessarily failed when an unexpected optional
section appeared in the output, which has been corrected.
* jt/fetch-peek-optional-section:
fetch-pack: make unexpected peek result non-fatal
The "--current" option of "git show-branch" should have been made
incompatible with the "--reflog" mode, but this was not enforced,
which has been corrected.
* jc/show-branch-g-current:
show-branch: -g and --current are incompatible
"make coverage-report" without first running "make coverage" did
not produce any meaningful result, which has been corrected.
* ep/coverage-report-wants-test-to-have-run:
Makefile: add a prerequisite to the coverage-report target
The FreeBSD CI build (on Cirrus-CI) has been failing in
't9001-send-email.sh' for quite some time, with an error from the
runtime linker relating to the Perl installation:
$ GIT_SEND_EMAIL_NOTTY=1 git send-email \
'--from=Example <from@example.com>' '--to=nobody@example.com' \
'--smtp-server=/tmp/cirrus-ci-build/t/trash directory.t9001-send-email/fake.sendmail' \
--compose '--subject=foo' 0001-Second.patch
ld-elf.so.1: /usr/local/lib/perl5/5.32/mach/CORE/libperl.so.5.32: Undefined symbol "strerror_l@FBSD_1.6"
This first instance is in t9001.6 but it fails similarly in several tests
in this file.
The FreeBSD image we use is FreeBSD 12.2, which is unsupported since
March 31st, 2022 [1]. Switching to a supported version, 12.3,
makes this error disappear [2].
Change the image we use to FreeBSD 12.3.
[1] https://www.freebsd.org/security/unsupported/
[2] https://lore.kernel.org/git/9cc31276-ab78-fa8a-9fb4-b19266911211@gmail.com/
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to Git 2.35.0, git could be run from an inaccessible working
directory so long as the git repository specified by options and/or
environment variables was accessible. For example:
git init repo
mkdir -p a/b
cd a/b
chmod u-x ..
git -C "${PWD%/a/b}/repo" status
If this example seems a bit contrived, consider running with the
repository owner as a substitute UID (e.g. with runuser(1) or sudo(8))
without ensuring the working directory is accessible by that user.
The code added by e6f8861bd4 ("setup: introduce
startup_info->original_cwd") to preserve the working directory attempts
to normalize the path using strbuf_realpath(). If that fails, as in the
case above, it is treated as a fatal error.
This commit treats strbuf_realpath() errors as non-fatal. If an error
occurs, setup_original_cwd() will continue without applying removal
prevention for cwd, resulting in the pre-2.35.0 behavior. The risk
should be minimal, since git will not operate on a repository with
inaccessible ancestors, this behavior is only known to occur when cwd is
a descendant of the repository, an ancestor of cwd is inaccessible, and
no ancestors of the repository are inaccessible.
Signed-off-by: Kevin Locke <kevin@kevinlocke.name>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`UNICODE` and `_UNICODE` are not required when building git on Windows.
Actually, they should not be predefined at all.
There're 2 evidences that `(_)UNICODE` is supposed to be nonexist:
compat/win32/trace2_win32_process_info.c:83: It uses jw_array_string
which accepts pe32.szExeFile as const char*.
t/helper/test-drop-caches.c:16: Calling to GetCurrentDirectory with
Buffer as char*.
The autotools build system never defines `UNICODE` and `_UNICODE` and
builds on Windows well.
Signed-off-by: Yuyi Wang <Strawberry_Str@hotmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix one of the TODOs listed in the CMakeLists.txt by adding support
for building with pcre2.
As pcre2 doesn't provide cmake find module, we find it with pkgconf.
This patch also works with vcpkg on Windows, with pkgconf and pcre2
installed.
Pkgconf and pcre2 is detected automatically just like curl, expat
and iconv. The output of CMake indicates whether pcre2 is found.
Signed-off-by: Yuyi Wang <Strawberry_Str@hotmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CMakeLists.txt didn't follow the grammar of `set`, and it will fail when
setting `USE_VCPKG` off on non-Windows platforms.
When the platform is Linux, the Makefile adds `compat/linux/procinfo.o`
to `COMPAT_OBJS`, but the CMakeLists.txt didn't add
`compat/linux/procinfo.c` to `compat_SOURCES`. It would cause linkage
error.
Signed-off-by: Yuyi Wang <Strawberry_Str@hotmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Technically, the pointer difference `end - start` _could_ be negative,
and when cast to an (unsigned) `size_t` that would cause problems. In
this instance, the symptom is:
dir.c: In function 'git_url_basename':
dir.c:3087:13: error: 'memchr' specified bound [9223372036854775808, 0]
exceeds maximum object size 9223372036854775807
[-Werror=stringop-overread]
CC ewah/bitmap.o
3087 | if (memchr(start, '/', end - start) == NULL
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
While it is a bit far-fetched to think that `end` (which is defined as
`repo + strlen(repo)`) and `start` (which starts at `repo` and never
steps beyond the NUL terminator) could result in such a negative
difference, GCC has no way of knowing that.
See also https://gcc.gnu.org/bugzilla//show_bug.cgi?id=85783.
Let's just add a safety check, primarily for GCC's benefit.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GCC v12.x complains thusly:
compat/nedmalloc/nedmalloc.c: In function 'DestroyCaches':
compat/nedmalloc/nedmalloc.c:326:12: error: the comparison will always
evaluate as 'true' for the address of 'caches'
will never be NULL [-Werror=address]
326 | if(p->caches)
| ^
compat/nedmalloc/nedmalloc.c:196:22: note: 'caches' declared here
196 | threadcache *caches[THREADCACHEMAXCACHES];
| ^~~~~~
... and it is correct, of course.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows' SDK recently upgraded to GCC v12.x which points out
that the `pos` variable might be used even after the corresponding
memory was `realloc()`ed and therefore potentially no longer valid.
Since a subset of this SDK is used in Git's CI/PR builds, we need to fix
this to continue to be able to benefit from the CI/PR runs.
Note: This bug has been with us since 2a6b149c64 (mingw: avoid using
strbuf in syslog, 2011-10-06), and while it looks tempting to replace
the hand-rolled string manipulation with a `strbuf`-based one, that
commit's message explains why we cannot do that: The `syslog()` function
is called as part of the function in `daemon.c` which is set as the
`die()` routine, and since `strbuf_grow()` can call that function if it
runs out of memory, this would cause a nasty infinite loop that we do
not want to re-introduce.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using a multi-pack bitmap, pack-objects will try to perform its
traversal using a call to `traverse_bitmap_commit_list()`, which calls
`add_object_entry_from_bitmap()` to add each object it finds to its
packing list.
This path can cause pack-objects to add objects from packs that don't
have open pack_fds on them, by avoiding a call to `is_pack_valid()`.
This is because we only call `is_pack_valid()` on the preferred pack (in
order to do verbatim reuse via `reuse_partial_packfile_from_bitmap()`)
and not others when loading a MIDX bitmap.
In this case, `add_object_entry_from_bitmap()` will check whether it
wants each object entry by calling `want_object_in_pack()`, which will
call `want_found_object` (since its caller already supplied a
`found_pack`). In most cases (particularly without `--local`, and when
`ignored_packed_keep_on_disk` and `ignored_packed_keep_in_core` are
both "0"), we'll take the entry from the pack contained in the MIDX
bitmap, all without an open pack_fd.
When we then try to use that entry later to assemble the actual pack,
we'll be susceptible to any simultaneous writers moving that pack out of
the way (e.g., due to a concurrent repack) without having an open file
descriptor, causing races that result in errors like:
remote: Enumerating objects: 1498802, done.
remote: fatal: packfile ./objects/pack/pack-e57d433b5a588daa37fbe946e2b28dfaec03a93e.pack cannot be accessed
remote: aborting due to possible repository corruption on the remote side.
This race can happen even with multi-pack bitmaps, since we may open a
MIDX bitmap that is being rewritten long before its packs are actually
unlinked.
Work around this by calling `is_pack_valid()` from within
`want_found_object()`, matching the behavior in
`want_object_in_pack_one()` (which has an analogous call). Most calls to
`is_pack_valid()` should be basically no-ops, since only the first call
requires us to open a file (subsequent calls realize the file is already
open, and return immediately).
Importantly, when `want_object_in_pack()` is given a non-NULL
`*found_pack`, but `want_found_object()` rejects the copy of the object
in that pack, we must reset `*found_pack` and `*found_offset` to NULL
and 0, respectively. Failing to do so could lead to other checks in
`want_object_in_pack()` (such as `want_object_in_pack_one()`) using the
same (invalid) pack as `*found_pack`, meaning that we don't call
`is_pack_valid()` because `p == *found_pack`. This can lead the caller
to believe it can use a copy of an object from an invalid pack.
An alternative approach to closing this race would have been to call
`is_pack_valid()` on _all_ packs in a multi-pack bitmap on load. This
has a couple of problems:
- it is unnecessarily expensive in the cases where we don't actually
need to open any packs (e.g., in `git rev-list --use-bitmap-index
--count`)
- more importantly, it means any time we would have hit this race,
we'll avoid using bitmaps altogether, leading to significant
slowdowns by forcing a full object traversal
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A subsequent patch will teach `want_object_in_pack()` to set its
`*found_pack` and `*found_offset` poitners to NULL when the provided
pack does not pass the `is_pack_valid()` check.
The `--stdin-packs` mode of `pack-objects` is not quite prepared to
handle this. To prepare it for this change, do the following two things:
- Ensure provided packs pass the `is_pack_valid()` check when
collecting the caller-provided packs into the "included" and
"excluded" lists.
- Gracefully handle any _invalid_ packs being passed to
`want_object_in_pack()`.
Calling `is_pack_valid()` early on makes it substantially less likely
that we will have to deal with a pack going away, since we'll have an
open file descriptor on its contents much earlier.
But even packs with open descriptors can become invalid in the future if
we (a) hit our open descriptor limit, forcing us to close some open
packs, and (b) one of those just-closed packs has gone away in the
meantime.
`add_object_entry_from_pack()` depends on having a non-NULL
`*found_pack`, since it passes that pointer to `packed_object_info()`,
meaning that we would SEGV if the pointer became NULL (like we propose
to do in `want_object_in_pack()` in the following patch).
But avoiding calling `packed_object_info()` entirely is OK, too, since
its only purpose is to identify which objects in the included packs are
commits, so that they can form the tips of the advisory traversal used
to discover the object namehashes.
Failing to do this means that at worst we will produce lower-quality
deltas, but it does not prevent us from generating the pack as long as
we can find a copy of each object from the disappearing pack in some
other part of the repository.
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before calling `for_each_object_in_pack()`, the caller
`read_packs_list_from_stdin()` loops through each of the `include_packs`
and checks that its `->util` pointer (which is used to store the `struct
packed_git *` itself) is non-NULL.
This check is redundant, because `read_packs_list_from_stdin()` already
checks that the included packs are non-NULL earlier on in the same
function (and it does not add any new entries in between).
Remove this check, since it is not doing anything in the meantime.
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pack-objects adds an entry to its packing list, it marks the
packfile and offset containing the object, which we may later use during
verbatim reuse (c.f., `write_reused_pack_verbatim()`).
If the packfile in question is deleted in the background (e.g., due to a
concurrent `git repack`), we'll die() as a result of calling use_pack(),
unless we have an open file descriptor on the pack itself. 4c08018204
(pack-objects: protect against disappearing packs, 2011-10-14) worked
around this by opening the pack ahead of time before recording it as a
valid source for reuse.
4c08018204's treatment meant that we could tolerate disappearing packs,
since it ensures we always have an open file descriptor on any pack that
we mark as a valid source for reuse. This tightens the race to only
happen when we need to close an open pack's file descriptor (c.f., the
caller of `packfile.c::get_max_fd_limit()`) _and_ that pack was deleted,
in which case we'll complain that a pack could not be accessed and
die().
The pack bitmap code does this, too, since prior to dc1daacdcc
(pack-bitmap: check pack validity when opening bitmap, 2021-07-23) it
was vulnerable to the same race.
The MIDX bitmap code does not do this, and is vulnerable to the same
race. Apply the same treatment as dc1daacdcc to the routine responsible
for opening the multi-pack bitmap's preferred pack to close this race.
This patch handles the "preferred" pack (c.f., the section
"multi-pack-index reverse indexes" in
Documentation/technical/pack-format.txt) specially, since pack-objects
depends on reusing exact chunks of that pack verbatim in
reuse_partial_packfile_from_bitmap(). So if that pack cannot be loaded,
the utility of a bitmap is significantly diminished.
Similar to dc1daacdcc, we could technically just add this check in
reuse_partial_packfile_from_bitmap(), since it's possible to use a MIDX
.bitmap without needing to open any of its packs. But it's simpler to do
the check as early as possible, covering all direct uses of the
preferred pack. Note that doing this check early requires us to call
prepare_midx_pack() early, too, so move the relevant part of that loop
from load_reverse_index() into open_midx_bitmap_1().
Subsequent patches handle the non-preferred packs in a slightly
different fashion.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git archive --add-file=<path>" picked up the raw permission bits
from the path and propagated to zip output in some cases, without
normalization, which has been corrected (tar output did not have
this issue).
* jc/archive-add-file-normalize-mode:
archive: do not let on-disk mode leak to zip archives
A bit of test framework fixes with a few fixes to issues found by
valgrind.
* ab/valgrind-fixes:
commit-graph.c: don't assume that stat() succeeds
object-file: fix a unpack_loose_header() regression in 3b6a8db3b0
log test: skip a failing mkstemp() test under valgrind
tests: using custom GIT_EXEC_PATH breaks --valgrind tests
When modifying the sparse-checkout definition, the sparse-checkout
builtin calls update_sparsity() to modify the SKIP_WORKTREE bits of all
cache entries in the index. Before, we needed the index to be fully
expanded in order to ensure we had the full list of files necessary that
match the new patterns.
Insert a call to reset_sparse_directories() that expands sparse
directories that are within the new pattern list, but only far enough
that every necessary file path now exists as a cache entry. The
remaining logic within update_sparsity() will modify the SKIP_WORKTREE
bits appropriately.
This allows us to disable command_requires_full_index within the
sparse-checkout builtin. Add tests that demonstrate that we are not
expanding to a full index unnecessarily.
We can see the improved performance in the p2000 test script:
Test HEAD~1 HEAD
------------------------------------------------------------------------
2000.24: git ... (sparse-v3) 2.14(1.55+0.58) 1.57(1.03+0.53) -26.6%
2000.25: git ... (sparse-v4) 2.20(1.62+0.57) 1.58(0.98+0.59) -28.2%
These reductions of 26-28% are small compared to most examples, but the
time is dominated by writing a new copy of the base repository to the
worktree and then deleting it again. The fact that the previous index
expansion was such a large portion of the time is telling how important
it is to complete this sparse index integration.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout builtin is almost completely integrated with the
sparse index, allowing the sparse-checkout boundary to be modified
without expanding a sparse index to a full one. Add a test to
p2000-sparse-operations.sh that adds a directory to the sparse-checkout
definition, then removes it. Using both operations is important to
ensure that the operation is doing the same work in each repetition as
well as leaving the test repo in a good state for later tests.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To complete the implementation of expand_to_pattern_list(), we need to
detect when a sparse directory entry should remain sparse. This avoids a
full expansion, so we now need to use the PARTIALLY_SPARSE mode to
indicate this state.
There still are no callers to this method, but we will add one in the
next change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The expand_to_pattern_list() method expands sparse directory entries
to their list of contained files when either the pattern list is NULL or
the directory is contained in the new pattern list's cone mode patterns.
It is possible that the pattern list has a recursive match with a
directory 'A/B/C/' and so an existing sparse directory 'A/B/' would need
to be expanded. If there exists a directory 'A/B/D/', then that
directory should not be expanded and instead we can create a sparse
directory.
To implement this, we plug into the add_path_to_index() callback for the
call to read_tree_at(). Since we now need access to both the index we
are writing and the pattern list we are comparing, create a 'struct
modify_index_context' to use as a data transfer object. It is important
that we use the given pattern list since we will use this pattern list
to change the sparse-checkout patterns and cannot use
istate->sparse_checkout_patterns.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the --no-sparse-index option is supplied, the sparse-checkout
builtin should explicitly ask to expand a sparse index to a full one.
This is currently done implicitly due to the command_requires_full_index
protection, but that will be removed in an upcoming change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Given a 'struct cache_tree', it may be beneficial to navigate directly
to a node within that corresponds to a given path name. Create
cache_tree_find_path() for this function. It returns NULL when no such
path exists.
The implementation is adapted from do_invalidate_path() which does a
similar search but also modifies the nodes it finds along the way. The
method could be implemented simply using tail-recursion, but this while
loop does the same thing.
This new method is not currently used, but will be in an upcoming
change.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future change will present a temporary, in-memory mode where the index
can both contain sparse directory entries but also not be completely
collapsed to the smallest possible sparse directories. This will be
necessary for modifying the sparse-checkout definition while using a
sparse index.
For now, convert the single-bit member 'sparse_index' in 'struct
index_state' to be a an 'enum sparse_index_mode' with three modes:
* INDEX_EXPANDED (0): No sparse directories exist. This is always the
case for repositories that do not use cone-mode sparse-checkout.
* INDEX_COLLAPSED: Sparse directories may exist. Files outside the
sparse-checkout cone are reduced to sparse directory entries whenever
possible.
* INDEX_PARTIALLY_SPARSE: Sparse directories may exist. Some file
entries outside the sparse-checkout cone may exist. Running
convert_to_sparse() may further reduce those files to sparse directory
entries.
The main reason to store this extra information is to allow
convert_to_sparse() to short-circuit when the index is already in
INDEX_EXPANDED mode but to actually do the necessary work when in
INDEX_PARTIALLY_SPARSE mode.
The INDEX_PARTIALLY_SPARSE mode will be used in an upcoming change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the first change in a series to allow modifying the
sparse-checkout pattern set without expanding a sparse index to a full
one in the process. Here, we focus on the problem of expanding the
pattern set through a command like 'git sparse-checkout add <path>'
which needs to create new index entries for the paths now being written
to the worktree.
To achieve this, we need to be able to replace sparse directory entries
with their contained files and subdirectories. Once this is complete,
other code paths can discover those cache entries and write the
corresponding files to disk before committing the index.
We already have logic in ensure_full_index() that expands the index
entries, so we will use that as our base. Create a new method,
expand_index(), which takes a pattern list, but for now mostly ignores
it. The current implementation is only correct when the pattern list is
NULL as that does the same as ensure_full_index(). In fact,
ensure_full_index() is converted to a shim over expand_index().
A future update will actually implement expand_index() to its full
capabilities. For now, it is created and documented.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'sparse-index contents' test checks that the sparse index has the
correct set of sparse directories in the index after modifying the cone
mode patterns using 'git sparse-checkout set'. Add to the coverage here
by adding more complicated scenarios that were not previously tested.
In order to check paths that do not exist at HEAD, we need to modify the
test_sparse_checkout_set helper slightly:
1. Add the --skip-checks argument to the 'set' command to avoid failures
when passing paths that do not exist at HEAD.
2. When looking for the non-existence of sparse directories for the
paths in $CONE_DIRS, allow the rev-list command to fail because the
path does not exist at HEAD.
This allows us to add some interesting test cases.
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before expanding this test with more involved cases, first extract the
repeated logic into a new test_sparse_checkout_set helper. This helper
checks that 'git sparse-checkout set ...' succeeds and then verifies
that certain directories have sparse directory entries in the sparse
index. It also verifies that the in-cone directories are _not_ sparse
directory entries in the sparse index.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to call that function already before printing the final verdict.
However, now that we added grouping to the GitHub workflow output, we
will want to include even that part in the collapsible group for that
test case.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The full logs are contained in the `failed-tests-*.zip` artifacts that
are attached to the failed CI run. Since this is not immediately
obvious to the well-disposed reader, let's mention it explicitly.
Suggested-by: Victoria Dye <vdye@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes the output easier to digest.
Note: since workflow output currently cannot contain any nested groups
(see https://github.com/actions/runner/issues/802 for details), we need
to remove the explicit grouping that would span the entirety of each
failed test script.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to mark up the test case preamble when presenting test output in
Git's GitHub workflow. Let's suppress the non-marked-up version in that
case. Any information it would contain is included in the marked-up
variant already.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most instances, looking at the log of failed test cases is enough to
identify the problem.
In some (rare?) instances, a previous test case that was marked as
successful actually has information pertaining to a later test case that
fails.
To allow the page to load relatively quickly, let's only show the logs
of the failed test cases to be shown. The full logs are available for
download as artifacts, should a deeper investigation become necessary.
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A couple of commands exist to spruce up the output in GitHub workflows:
https://docs.github.com/en/actions/learn-github-actions/workflow-commands-for-github-actions
In addition to the `::group::<label>`/`::endgroup::` commands (which we
already use to structure the output of the build step better), we also
use `::error::`/`::notice::` to draw the attention to test failures and
to test cases that were expected to fail but didn't.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current output of Git's GitHub workflow can be quite confusing,
especially for contributors new to the project.
To make it more helpful, let's introduce some collapsible grouping.
Initially, readers will see the high-level view of what actually
happened (did the build fail, or the test suite?). To drill down, the
respective group can be expanded.
Note: sadly, workflow output currently cannot contain any nested groups
(see https://github.com/actions/runner/issues/802 for details),
therefore we take pains to ensure to end any previous group before
starting a new one.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When investigating a test failure, the time that matters most is the
time it takes from getting aware of the failure to displaying the output
of the failing test case.
You currently have to know a lot of implementation details when
investigating test failures in the CI runs. The first step is easy: the
failed job is marked quite clearly, but when opening it, the failed step
is expanded, which in our case is the one running
`ci/run-build-and-tests.sh`. This step, most notably, only offers a
high-level view of what went wrong: it prints the output of `prove`
which merely tells the reader which test script failed.
The actually interesting part is in the detailed log of said failed
test script. But that log is shown in the CI run's step that runs
`ci/print-test-failures.sh`. And that step is _not_ expanded in the web
UI by default. It is even marked as "successful", which makes it very
easy to miss that there is useful information hidden in there.
Let's help the reader by showing the failed tests' detailed logs in the
step that is expanded automatically, i.e. directly after the test suite
failed.
This also helps the situation where the _build_ failed and the
`print-test-failures` step was executed under the assumption that the
_test suite_ failed, and consequently failed to find any failed tests.
An alternative way to implement this patch would be to source
`ci/print-test-failures.sh` in the `handle_test_failures` function to
show these logs. However, over the course of the next few commits, we
want to introduce some grouping which would be harder to achieve that
way (for example, we do want a leaner, and colored, preamble for each
failed test script, and it would be trickier to accommodate the lack of
nested groupings in GitHub workflows' output).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the web UI of GitHub workflows, failed runs are presented with the
job step that failed auto-expanded. In the current setup, this is not
helpful at all because that shows only the output of `prove`, which says
which test failed, but not in what way.
What would help understand the reader what went wrong is the verbose
test output of the failed test.
The logs of the failed runs do contain that verbose test output, but it
is shown in the _next_ step (which is marked as succeeding, and is
therefore _not_ auto-expanded). Anyone not intimately familiar with this
would completely miss the verbose test output, being left mostly
puzzled with the test failures.
We are about to show the failed test cases' output in the _same_ step,
so that the user has a much easier time to figure out what was going
wrong.
But first, we must partially revert the change that tried to improve the
CI runs by combining the `Makefile` targets to build into a single
`make` invocation. That might have sounded like a good idea at the time,
but it does make it rather impossible for the CI script to determine
whether the _build_ failed, or the _tests_. If the tests were run at
all, that is.
So let's go back to calling `make` for the build, and call `make test`
separately so that we can easily detect that _that_ invocation failed,
and react appropriately.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the test case's output, we do want newline characters, but in the XML
attributes we do not want them.
However, the `xml_attr_encode` function always adds a Line Feed at the
end (which are then encoded as `
`, even for XML attributes.
This seems not to faze Azure Pipelines' XML parser, but it still is
incorrect, so let's fix it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code writing JUnit XML is interspersed directly with all the code in
`t/test-lib.sh`, and it is therefore not only ill-separated, but
introducing yet another output format would make the situation even
worse.
Let's introduce an abstraction layer by hiding the JUnit XML code behind
four new functions that are supposed to be called before and after each
test and test case.
This is not just an academic exercise, refactoring for refactoring's
sake. We _actually_ want to introduce such a new output format, to
make it substantially easier to diagnose test failures in our GitHub
workflow, therefore we do need this refactoring.
This commit is best viewed with `git show --color-moved
--color-moved-ws=allow-indentation-change <commit>`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In b92cb86ea1 (travis-ci: check that all build artifacts are
.gitignore-d, 2017-12-31), a function was introduced with a code style
that is different from the surrounding code: it added the opening curly
brace on its own line, when all the existing functions in the same file
cuddle that brace on the same line as the function name.
Let's make the code style consistent again.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a technical document to explain cruft packs. It contains a brief
overview of the problem, some background, details on the implementation,
and a couple of alternative approaches not considered here.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
macOS CI jobs have been occasionally flaky due to tentative version
skew between perforce and the homebrew packager. Instead of
failing the whole CI job, just let it skip the p4 tests when this
happens.
* cb/ci-make-p4-optional:
ci: use https, not http to download binaries from perforce.com
ci: reintroduce prevention from perforce being quarantined in macOS
ci: avoid brew for installing perforce
ci: make failure to find perforce more user friendly
Introduce and apply coccinelle rule to discourage an explicit
comparison between a pointer and NULL, and applies the clean-up to
the maintenance track.
* ep/maint-equals-null-cocci:
tree-wide: apply equals-null.cocci
tree-wide: apply equals-null.cocci
contrib/coccinnelle: add equals-null.cocci
"git show :<path>" learned to work better with the sparse-index
feature.
* ds/sparse-colon-path:
rev-parse: integrate with sparse index
object-name: diagnose trees in index properly
object-name: reject trees found in the index
show: integrate with the sparse index
t1092: add compatibility tests for 'git show'
Teach "git stash" to work better with sparse index entries.
* vd/sparse-stash:
unpack-trees: preserve index sparsity
stash: apply stash using 'merge_ort_nonrecursive()'
read-cache: set sparsity when index is new
sparse-index: expose 'is_sparse_index_allowed()'
stash: integrate with sparse index
stash: expand sparse-checkout compatibility testing
"git bisect" was too silent before it is ready to start computing
the actual bisection, which has been corrected.
* cd/bisect-messages-from-pre-flight-states:
bisect: output bisect setup status in bisect log
bisect: output state before we are ready to compute bisection
"git pull" without "--recurse-submodules=<arg>" made
submodule.recurse take precedence over fetch.recurseSubmodules by
mistake, which has been corrected.
* gc/pull-recurse-submodules:
pull: do not let submodule.recurse override fetch.recurseSubmodules
Trace2 documentation updates.
* js/trace2-doc-fixes:
trace2 docs: add missing full stop
trace2 docs: clarify what `varargs` is all about
trace2 docs: fix a JSON formatted example
trace2 docs: surround more terms in backticks
trace2 docs: "printf" is not an English word
trace2 docs: a couple of grammar fixes
"git log --since=X" will stop traversal upon seeing a commit that
is older than X, but there may be commits behind it that is younger
than X when the commit was created with a faulty clock. A new
option is added to keep digging without stopping, and instead
filter out commits with timestamp older than X.
* mv/log-since-as-filter:
log: "--since-as-filter" option is a non-terminating "--since" variant
The temporary files fed to external diff command are now generated
inside a new temporary directory under the same basename.
* rs/external-diff-tempfile:
diff: use mks_tempfile_dt()
tempfile: add mks_tempfile_dt()
New tests for the safe.directory mechanism.
* sg/safe-directory-tests-and-docs:
safe.directory: document and check that it's ignored in the environment
t0033-safe-directory: check when 'safe.directory' is ignored
t0033-safe-directory: check the error message without matching the trash dir
The previous patch demonstrates a scenario where the list of packs
written by `pack-objects` (and stored in the `names` string_list) is
out-of-order, and can thus cause us to delete packs we shouldn't.
This patch resolves that bug by ensuring that `names` is sorted in all
cases, not just when
delete_redundant && pack_everything & ALL_INTO_ONE
is true.
Because we did sort `names` in that case (which, prior to `--geometric`
repacks, was the only time we would actually delete packs, this is only
a bug for `--geometric` repacks.
It would be sufficient to only sort `names` when `delete_redundant` is
set to a non-zero value. But sorting a small list of strings is cheap,
and it is defensive against future calls to `string_list_has_string()`
on this list.
Co-discovered-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When doing a `--geometric=<d>` repack, `git repack` determines a
splitting point among packs ordered by their object count such that:
- each pack above the split has at least `<d>` times as many objects
as the next-largest pack by object count, and
- the first pack above the split has at least `<d>` times as many
object as the sum of all packs below the split line combined
`git repack` then creates a pack containing all of the objects contained
in packs below the split line by running `git pack-objects
--stdin-packs` underneath. Once packs are moved into place, then any
packs below the split line are removed, since their objects were just
combined into a new pack.
But `git repack` tries to be careful to avoid removing a pack that it
just wrote, by checking:
struct packed_git *p = geometry->pack[i];
if (string_list_has_string(&names, hash_to_hex(p->hash)))
continue;
in the `delete_redundant` and `geometric` conditional towards the end of
`cmd_repack`.
But it's possible to trick `git repack` into not recognizing a pack that
it just wrote when `names` is out-of-order (which violates
`string_list_has_string()`'s assumption that the list is sorted and thus
binary search-able).
When this happens in just the right circumstances, it is possible to
remove a pack that we just wrote, leading to object corruption.
Luckily, this is quite difficult to provoke in practice (for a couple of
reasons):
- we ordinarily write just one pack, so `names` usually contains just
one entry, and is thus sorted
- when we do write more than one pack (e.g., due to `--max-pack-size`)
we have to: (a) write a pack identical to one that already
exists, (b) have that pack be below the split line, and (c) have
the set of packs written by `pack-objects` occur in an order which
tricks `string_list_has_string()`.
Demonstrate the above scenario in a failing test, which causes `git
repack --geometric` to write a pack which occurs below the split line,
_and_ fail to recognize that it wrote that pack.
The following patch will fix this bug.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'repack' to ignore packs named on the command line with the
'--keep-pack' option. Specifically, modify 'init_pack_geometry()' to treat
command line-kept packs the same way it treats packs with an on-disk '.keep'
file (that is, skip the pack and do not include it in the 'geometry'
structure).
Without this handling, a '--keep-pack' pack would be included in the
'geometry' structure. If the pack is *before* the geometry split line (with
at least one other pack and/or loose objects present), 'repack' assumes the
pack's contents are "rolled up" into another pack via 'pack-objects'.
However, because the internally-invoked 'pack-objects' properly excludes
'--keep-pack' objects, any new pack it creates will not contain the kept
objects. Finally, 'repack' deletes the '--keep-pack' as "redundant" (since
it assumes 'pack-objects' created a new pack with its contents), resulting
in possible object loss and repository corruption.
Add a test ensuring that '--keep-pack' packs are now appropriately handled.
Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do make sure that an attempt to merge with various forms of local
changes will "fail", but the point of stopping the merge is so that
we refrain from discarding uncommitted local changes that could be
precious. Add a few more checks for each case to make sure the
local changes are left intact.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c7c4bdeccf (run-command API: remove "env" member, always use
"env_array", 2021-11-25), there was a push to replace
cld.env = env->v;
with
strvec_pushv(&cld.env_array, env->v);
The conversion in c7c4bdeccf was mostly plug-and-play, with the snag
that some instances of strvec_pushv() became guarded with a NULL check
to ensure that the second argument was non-NULL.
This conversion was slightly over-eager to add a conditional in
builtin/receive-pack.c::unpack(), since we know at the point that we
add the result of `tmp_objdir_env()` into the child process's
environment, that `tmp_objdir` is non-NULL.
This follows from the conditional just before our strvec_pushv() call
(which returns from the function if `tmp_objdir` was NULL), as well as
the call to tmp_objdir_add_as_alternate() just below, which relies on
its argument (`tmp_objdir`) being non-NULL.
In the meantime, this extra conditional isn't hurting anything. But it
is redundant and thus unnecessarily confusing. So let's remove it.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 7dce19d3 (fetch/pull: Add the --recurse-submodules option,
2010-11-12) introduced the "--recurse-submodule" option, the
approach taken was to perform fetches in submodules only once, after
all the main fetching (it may usually be a fetch from a single
remote, but it could be fetching from a group of remotes using
fetch_multiple()) succeeded. Later we added "--all" to fetch from
all defined remotes, which complicated things even more.
If your project has a submodule, and you try to run "git fetch
--recurse-submodule --all", you'd see a fetch for the top-level,
which invokes another fetch for the submodule, followed by another
fetch for the same submodule. All but the last fetch for the
submodule come from a "git fetch --recurse-submodules" subprocess
that is spawned via the fetch_multiple() interface for the remotes,
and the last fetch comes from the code at the end.
Because recursive fetching from submodules is done in each fetch for
the top-level in fetch_multiple(), the last fetch in the submodule
is redundant. It only matters when fetch_one() interacts with a
single remote at the top-level.
While we are at it, there is one optimization that exists in dealing
with a group of remote, but is missing when "--all" is used. In the
former, when the group turns out to be a group of one, instead of
spawning "git fetch" as a subprocess via the fetch_multiple()
interface, we use the normal fetch_one() code path. Do the same
when handing "--all", if it turns out that we have only one remote
defined.
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This switch statement was recently added to make it clear that
unpack_loose_header() returns an enum value, not an int. This adds
complications for future developers if that enum gains new values, since
that developer would need to add a case statement to this switch for
little real value.
Instead, we can revert back to an 'if' statement, but make the enum
explicit by using "!= ULHR_OK" instead of assuming it has the numerical
value zero.
Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the parse_bundle_header() function to be non-static, and rename
it to parse_bundle_header_fd(). The parse_bundle_header() function is
already public, and it's a thin wrapper around this function. This
will be used by code that wants to pass a fd to the bundle API.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the 'url' parameter was absolute, the previous implementation would
concatenate 'remote_url' with 'url'. Instead, we want to return 'url' in
this case.
The documentation now discusses what happens when supplying two
absolute URLs.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This method was initially written in 63e95beb0 (submodule: port
resolve_relative_url from shell to C, 2016-05-15). As we will need
similar functionality in the bundle URI feature, extract this to be
available in remote.h.
The code is almost exactly the same, except for the following trivial
differences:
* Fix whitespace and wrapping issues with the prototype and argument
lists.
* Let's call starts_with_dot_{,dot_}slash_native() instead of the
functionally identical "starts_with_dot_{,dot_}slash()" wrappers
"builtin/submodule--helper.c".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This method will be used in an upcoming extension of git-remote-curl to
download a single file over HTTP(S) by request.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the populating of the --keep=* option argument to "index-pack" to
a static function, a subsequent commit will make use of it in another
function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a version of the deref_without_lazy_fetch function which can be
called with custom oi_flags and to grab information about the
"object_type". This will be used for the bundle-uri client in a
subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a path_match_flags() function and have the two sets of
starts_with_dot_{,dot_}slash() functions added in
63e95beb08 (submodule: port resolve_relative_url from shell to C,
2016-04-15) and a2b26ffb1a (fsck: convert gitmodules url to URL
passed to curl, 2020-04-18) be thin wrappers for it.
As the latter of those notes the fsck version was copied from the
initial builtin/submodule--helper.c version.
Since the code added in a2b26ffb1a was doing really doing the same as
win32_is_dir_sep() added in 1cadad6f65 (git clone <url>
C:\cygwin\home\USER\repo' is working (again), 2018-12-15) let's move
the latter to git-compat-util.h is a is_xplatform_dir_sep(). We can
then call either it or the platform-specific is_dir_sep() from this
new function.
Let's likewise change code in various other places that was hardcoding
checks for "'/' || '\\'" with the new is_xplatform_dir_sep(). As can
be seen in those callers some of them still concern themselves with
':' (Mac OS classic?), but let's leave the question of whether that
should be consolidated for some other time.
As we expect to make wider use of the "native" case in the future,
define and use two starts_with_dot_{,dot_}slash_native() convenience
wrappers. This makes the diff in builtin/submodule--helper.c much
smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the sending of the "agent" and "object-format" capabilities
into a function.
This was added in its current form in ab67235bc4 (connect: parse v2
refs with correct hash algorithm, 2020-05-25). When we connect to a v2
server we need to know about its object-format, and it needs to know
about ours. Since most things in connect.c and transport.c piggy-back
on the eager getting of remote refs via the handshake() those commands
can make use of the just-sent-over object-format by ls-refs.
But I'm about to add a command that may come after ls-refs, and may
not, but we need the server to know about our user-agent and
object-format. So let's split this into a function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This check was introduced in 8ee5d73137 (Fix fetch/pull when run without
--update-head-ok, 2008-10-13) in order to protect against replacing the ref
of the active branch by mistake, for example by running git fetch origin
master:master.
It was later extended in 8bc1f39f41 (fetch: protect branches checked out
in all worktrees, 2021-12-01) to scan all worktrees.
This operation is very expensive (takes about 30s in my repository) when
there are many tags or branches, and it is executed on every fetch, even if
no local heads are updated at all.
Limit it to protect only refs/heads/* to improve fetch performance.
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Libcurl has a CURLOPT_RESOLVE easy option that allows
the result of hostname resolution in the following
format to be passed:
[+]HOST:PORT:ADDRESS[,ADDRESS]
This way, redirects and everything operating against the
HOST+PORT will use the provided ADDRESS(s).
The following format is also allowed to stop using
hostname resolutions that have already been passed:
-HOST:PORT
See https://curl.se/libcurl/c/CURLOPT_RESOLVE.html for
more details.
Let's add a corresponding "http.curloptResolve" config
option that takes advantage of CURLOPT_RESOLVE.
Each value configured for the "http.curloptResolve" key
is passed "as is" to libcurl through CURLOPT_RESOLVE, so
it should be in one of the above 2 formats. This keeps
the implementation simple and makes us consistent with
libcurl's CURLOPT_RESOLVE, and with curl's corresponding
`--resolve` command line option.
The implementation uses CURLOPT_RESOLVE only in
get_active_slot() which is called by all the HTTP
request sending functions.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a Git server responds to a fetch request, it may send optional
sections before the packfile section. To handle this, the Git client
calls packet_reader_peek() (see process_section_header()) in order to
see what's next without consuming the line.
However, as implemented, Git errors out whenever what's peeked is not an
ordinary line. This is not only unexpected (here, we only need to know
whether the upcoming line is the section header we want) but causes
errors to include the name of a section header that is irrelevant to the
cause of the error. For example, at $DAYJOB, we have seen "fatal: error
reading section header 'shallow-info'" error messages when none of the
repositories involved are shallow.
Therefore, fix this so that the peek returns 1 if the upcoming line is
the wanted section header and nothing else. Because of this change,
reader->line may now be NULL later in the function, so update the error
message printing code accordingly.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a support library that provides one function that can be used
to run a "scriplet" of commands through sudo and that helps invoking
sudo in the slightly awkward way that is required to ensure it doesn't
block the call (if shell was allowed as tested in the prerequisite)
and it doesn't run the command through a different shell than the one
we intended.
Add additional negative tests as suggested by Junio and that use a
new workspace that is owned by root.
Document a regression that was introduced by previous commits where
root won't be able anymore to access directories they own unless
SUDO_UID is removed from their environment.
The tests document additional ways that this new restriction could
be worked around and the documentation explains why it might be instead
considered a feature, but a "fix" is planned for a future change.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
bdc77d1d68 (Add a function to determine whether a path is owned by the
current user, 2022-03-02) checks for the effective uid of the running
process using geteuid() but didn't account for cases where that user was
root (because git was invoked through sudo or a compatible tool) and the
original uid that repository trusted for its config was no longer known,
therefore failing the following otherwise safe call:
guy@renard ~/Software/uncrustify $ sudo git describe --always --dirty
[sudo] password for guy:
fatal: unsafe repository ('/home/guy/Software/uncrustify' is owned by someone else)
Attempt to detect those cases by using the environment variables that
those tools create to keep track of the original user id, and do the
ownership check using that instead.
This assumes the environment the user is running on after going
privileged can't be tampered with, and also adds code to restrict that
the new behavior only applies if running as root, therefore keeping the
most common case, which runs unprivileged, from changing, but because of
that, it will miss cases where sudo (or an equivalent) was used to change
to another unprivileged user or where the equivalent tool used to raise
privileges didn't track the original id in a sudo compatible way.
Because of compatibility with sudo, the code assumes that uid_t is an
unsigned integer type (which is not required by the standard) but is used
that way in their codebase to generate SUDO_UID. In systems where uid_t
is signed, sudo might be also patched to NOT be unsigned and that might
be able to trigger an edge case and a bug (as described in the code), but
it is considered unlikely to happen and even if it does, the code would
just mostly fail safely, so there was no attempt either to detect it or
prevent it by the code, which is something that might change in the future,
based on expected user feedback.
Reported-by: Guy Maurel <guy.j@maurel.de>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Randall Becker <rsbecker@nexbridge.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally reported after release of v2.35.2 (and other maint branches)
for CVE-2022-24765 and blocking otherwise harmless commands that were
done using sudo in a repository that was owned by the user.
Add a new test script with very basic support to allow running git
commands through sudo, so a reproduction could be implemented and that
uses only `git status` as a proxy of the issue reported.
Note that because of the way sudo interacts with the system, a much
more complete integration with the test framework will require a lot
more work and that was therefore intentionally punted for now.
The current implementation requires the execution of a special cleanup
function which should always be kept as the last "test" or otherwise
the standard cleanup functions will fail because they can't remove
the root owned directories that are used. This also means that if
failures are found while running, the specifics of the failure might
not be kept for further debugging and if the test was interrupted, it
will be necessary to clean the working directory manually before
restarting by running:
$ sudo rm -rf trash\ directory.t0034-root-safe-directory/
The test file also uses at least one initial "setup" test that creates
a parallel execution directory under the "root" sub directory, which
should be used as top level directory for all repositories that are
used in this test file. Unlike all other tests the repository provided
by the test framework should go unused.
Special care should be taken when invoking commands through sudo, since
the environment is otherwise independent from what the test framework
setup and might have changed the values for HOME, SHELL and dropped
several relevant environment variables for your test. Indeed `git status`
was used as a proxy because it doesn't even require commits in the
repository to work and usually doesn't require much from the environment
to run, but a future patch will add calls to `git init` and that will
fail to honor the default branch name, unless that setting is NOT
provided through an environment variable (which means even a CI run
could fail that test if enabled incorrectly).
A new SUDO prerequisite is provided that does some sanity checking
to make sure the sudo command that will be used allows for passwordless
execution as root without restrictions and doesn't mess with git's
execution path. This matches what is provided by the macOS agents that
are used as part of GitHub actions and probably nowhere else.
Most of those characteristics make this test mostly only suitable for
CI, but it might be executed locally if special care is taken to provide
for all of them in the local configuration and maybe making use of the
sudo credential cache by first invoking sudo, entering your password if
needed, and then invoking the test with:
$ GIT_TEST_ALLOW_SUDO=YES ./t0034-root-safe-directory.sh
If it fails to run, then it means your local setup wouldn't work for the
test because of the configuration sudo has or other system settings, and
things that might help are to comment out sudo's secure_path config, and
make sure that the account you are using has no restrictions on the
commands it can run through sudo, just like is provided for the user in
the CI.
For example (assuming a username of marta for you) something probably
similar to the following entry in your /etc/sudoers (or equivalent) file:
marta ALL=(ALL:ALL) NOPASSWD: ALL
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, GitHub prefills the PR description using the commit message
for single-commit PRs. This results in a duplicate commit message below
the three-dash line if the contributor does not empty out the PR
description before submitting, which adds noise for reviewers.
Add a note to that effect in MyFirstContribution.txt.
This partly addresses:
https://github.com/gitgitgadget/gitgitgadget/issues/340
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "Sending Patches via GitGitGadget" section mentions that the PR
title and description will be used as the cover letter, but does not
explain what is a cover letter or what should be included in it.
Refer readers to the new "The cover letter" section added in a previous
commit.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit added a standalone section on the purpose of the
cover letter, drawing inspiration from the existing content of the
"Preparing Email" section.
Adjust "Preparing Email" to reference "The cover letter", to avoid
content duplication.
Also, use the imperative mode for the cover letter subject, as is done
in "The cover letter".
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An explanation of the purpose of the cover letter is included in the
"Sending Patches with git send-email" / "Preparing Email" section but is
missing from the "Sending Patches via GitGitGadget" section.
Add a standalone section "The cover letter" under the "Getting Started:
Anatomy of a Patch Series" header to explain what the cover letter is
used for and to draft the cover letter of the 'psuh' topic used in the
tutorial.
For now we mostly copy content from the "Sending Patches with git
send-email" section but do not adjust that section, nor the GGG section,
to reference the new section. This is done in following commits.
Also, adjust the "Preparing Email" Asciidoc anchor to avoid conflicts.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before describing how to send patches to the mailing list either with
GitGitGadget or 'git send-email', the MyFirstContribution tutorial
includes a small "Getting Ready to Share" section where the two
different methods are briefly introduced.
Use this section to also describe what a patch series looks like once
submitted, so that readers get an understanding of the end result before
diving into how to accomplish that end result.
Start by copying the "thread overview" section of a recent contribution
from the public-inbox web UI and explaining how each commit is a
separate mail, and point out the cover letter.
Subsequent commits will move the existing description of the purpose of
the cover letter from the 'git send-email' section to this "anatomy"
section.
Also, change the wording in the introductory paragraph to use
"contributions" instead of "patches", since this makes more sense when
talking about GitHub pull requests.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 4c28e4ada0 (commit: die before asking to edit the log
message, 2010-12-20), we have been "leaking" the "author_ident" when
prepare_to_commit() fails. Instead of returning from right there,
introduce an exit status variable and jump to the clean-up label
at the end.
Instead of explicitly releasing the resource with strbuf_release(),
mark the variable with UNLEAK() at the end, together with two other
variables that are already marked as such. If this were in a
utility function that is called number of times, but these are
different, we should explicitly release resources that grow
proportionally to the size of the problem being solved, but
cmd_commit() is like main() and there is no point in spending extra
cycles to release individual pieces of resource at the end, just
before process exit will clean everything for us for free anyway.
This fixes a leak demonstrated by e.g. "t3505-cherry-pick-empty.sh",
but unfortunately we cannot mark it or other affected tests as passing
now with "TEST_PASSES_SANITIZE_LEAK=true" as we'll need to fix many
other memory leaks before doing so.
Incidentally there are two tests that always passes the leak checker
with or without this change. Mark them as such.
This is based on an earlier patch by Ævar, but takes a different
approach that is more maintainable.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 522354d70f (Add Travis CI support, 2015-11-27) the CI has used
http://filehost.perforce.com/perforce/ to download binaries from
filehost.perforce.com, they were then moved to this script in
657343a602 (travis-ci: move Travis CI code into dedicated scripts,
2017-09-10).
Let's use https instead for good measure. I don't think we need to
worry about the DNS or network between the GitHub CI and perforce.com
being MitM'd, but using https gives us extra validation of the payload
at least, and is one less thing to worry about when checking where
else we rely on non-TLS'd http connections.
Also, use the same download site at perforce.com for Linux and macOS
tarballs for consistency.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
5ed9fc3fc8 (ci: prevent `perforce` from being quarantined, 2020-02-27)
introduces this prevention for brew, but brew has been removed in a
previous commit, so reintroduce an equivalent option to avoid a possible
regression.
This doesn't affect github actions (as configure now) and is therefore
done silently to avoid any possible scary irrelevant messages.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perfoce's cask in brew is meant[1] to be used only by humans, so replace
its use from the CI with a scripted binary download which is less likely
to fail, as it is done in Linux.
Kept the logic together so it will be less likely to break when moved
around as on the fly code changes in this area are settled, at which
point it will also feasable to ammend it to avoid some of the hardcoded
values by using similar variables to the ones Linux does.
In that same line, a POSIX sh syntax is used instead of the similar one
used in Linux in preparation for an unrelated future change that might
change the shell currently configured for it.
This change reintroduces the risk that the installed binaries might not
work because of being quarantined that was fixed with 5ed9fc3fc8 (ci:
prevent `perforce` from being quarantined, 2020-02-27) but fixing that
now was also punted for simplicity and since the affected cloud provider
is scheduled to be retired with an on the fly change, but should be
addressed if that other change is not integrated further.
The discussion on the need to keep 2 radically different versions of
the binaries to be tested with Linux vs macOS or how to upgrade to
newer versions now that brew won't do that automatically for us has
been punted for now as well. On that line the now obsolete comment
about it in lib.sh was originally being updated by this change but
created conflicts as it is moved around by other on the fly changes,
so will be addressed independently as well.
[1] https://github.com/Homebrew/homebrew-cask/pull/122347#discussion_r856026584
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for a future change that will make perforce installation
optional in macOS, make sure that the check for it is done without
triggering scary looking errors and add a user friendly message instead.
All other existing uses of 'type <cmd>' in our shell scripts that
check the availability of a command <cmd> send both standard output
and error stream to /dev/null to squelch "<cmd> not found" diagnostic
output, but this script left the standard error stream shown.
Redirect it just like everybody else to squelch this error message that
we fully expect to see.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix code added in 8d84097f96 (commit-graph: expire commit-graph
files, 2019-06-18) to check the return value of the stat() system
call. Not doing so caused us to use uninitialized memory in the "Bloom
generation is limited by --max-new-filters" test in
t4216-log-bloom.sh:
+ rm -f trace.event
+ pwd
+ GIT_TRACE2_EVENT=[...]/t/trash directory.t4216-log-bloom/limits/trace.event git commit-graph write --reachable --split=replace --changed-paths --max-new-filters=2
==24835== Syscall param utimensat(times[0].tv_sec) points to uninitialised byte(s)
==24835== at 0x499E65A: __utimensat64_helper (utimensat.c:34)
==24835== by 0x4999142: utime (utime.c:36)
==24835== by 0x552BE0: mark_commit_graphs (commit-graph.c:2213)
==24835== by 0x550822: write_commit_graph (commit-graph.c:2424)
==24835== by 0x54E3A0: write_commit_graph_reachable (commit-graph.c:1681)
==24835== by 0x4374BB: graph_write (commit-graph.c:269)
==24835== by 0x436F7D: cmd_commit_graph (commit-graph.c:326)
==24835== by 0x407B9A: run_builtin (git.c:465)
==24835== by 0x406651: handle_builtin (git.c:719)
==24835== by 0x407575: run_argv (git.c:786)
==24835== by 0x406410: cmd_main (git.c:917)
==24835== by 0x511F09: main (common-main.c:56)
==24835== Address 0x1ffeffde70 is on thread 1's stack
==24835== in frame #1, created by utime (utime.c:25)
==24835== Uninitialised value was created by a stack allocation
==24835== at 0x552B50: mark_commit_graphs (commit-graph.c:2201)
==24835==
[...]
error: last command exited with $?=126
not ok 137 - Bloom generation is limited by --max-new-filters
This would happen as we stat'd the non-existing
".git/objects/info/commit-graph" file. Let's fix mark_commit_graphs()
to check the stat()'s return value, and while we're at it fix another
case added in the same commit to do the same.
The caller in expire_commit_graphs() would have been less likely to
run into this, as it's operating on files it just got from readdir(),
but it could still happen due to a race with e.g. a concurrent "rm
-rf" of the commit-graph files.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in my 3b6a8db3b0 (object-file.c: use "enum" return
type for unpack_loose_header(), 2021-10-01) revealed both by running
the test suite with --valgrind, and with the amended "git fsck" test.
In practice this regression in v2.34.0 caused us to claim that we
couldn't parse the header, as opposed to not being able to unpack
it. Before the change in the C code the test_cmp added here would emit:
-error: unable to unpack header of ./objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
+error: unable to parse header of ./objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
I.e. we'd proceed to call parse_loose_header() on the uninitialized
"hdr" value, and it would have been very unlikely for that
uninitialized memory to be a valid git object.
The other callers of unpack_loose_header() were already checking the
enum values exhaustively. See 3b6a8db3b0 and
5848fb11ac (object-file.c: return ULHR_TOO_LONG on "header too long",
2021-10-01).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Skip a test added in f1e3df3169 (t: increase test coverage of
signature verification output, 2020-03-04) when running under
valgrind. Due to valgrind's interception of mkstemp() this test will
fail with:
+ pwd
+ TMPDIR=[...]/t/trash directory.t4202-log/bogus git log --show-signature -n1 plain-fail
==7696== VG_(mkstemp): failed to create temp file: [...]/t/trash directory.t4202-log/bogus/valgrind_proc_7696_cmdline_d545ddcf
[... 10 more similar lines omitted ..]
valgrind: Startup or configuration error:
valgrind: Can't create client cmdline file in [...]/t/trash directory.t4202-log/bogus/valgrind_proc_7696_cmdline_6e542d1d
valgrind: Unable to start up properly. Giving up.
error: last command exited with $?=1
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in b7d11a0f5d (tests: exercise the RUNTIME_PREFIX
feature, 2021-07-24) where tests that want to set up and test a "git"
wrapper in $PATH conflicted with the t/bin/valgrind wrapper(s) doing
the same.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the "--add-file" option is used to add the contents from an
untracked file to the archive, the permission mode bits for these
files are sent to the archive-backend specific "write_entry()"
method as-is. We normalize the mode bits for tracked files way
before we pass them to the write_entry() method; we should do the
same here.
This is not strictly needed for "tar" archive-backend, as it has its
own code to further clean them up, but "zip" archive-backend is not
so well prepared.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug in "git pull" where `submodule.recurse` is preferred over
`fetch.recurseSubmodules` when performing a fetch
(Documentation/config/fetch.txt says that `fetch.recurseSubmodules`
should be preferred.). Do this by passing the value of the
"--recurse-submodules" CLI option to the underlying fetch, instead of
passing a value that combines the CLI option and config variables.
In other words, this bug occurred because builtin/pull.c is conflating
two similar-sounding, but different concepts:
- Whether "git pull" itself should care about submodules e.g. whether it
should update the submodule worktrees after performing a merge.
- The value of "--recurse-submodules" to pass to the underlying "git
fetch".
Thus, when `submodule.recurse` is set, the underlying "git fetch" gets
invoked with "--recurse-submodules[=value]", overriding the value of
`fetch.recurseSubmodules`.
An alternative (and more obvious) approach to fix the bug would be to
teach "git pull" to understand `fetch.recurseSubmodules`, but the
proposed solution works better because:
- We don't maintain two identical config-parsing implementions in "git
pull" and "git fetch".
- It works better with other commands invoked by "git pull" e.g. "git
merge" won't accidentally respect `fetch.recurseSubmodules`.
Reported-by: Huang Zou <huang.zou@schrodinger.com>
Helped-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit summary shown after making a commit is matched to what
is given in "git status" not to use the break-rewrite heuristics.
* rs/commit-summary-wo-break-rewrite:
commit, sequencer: turn off break_opt for commit summary
Avoid problems from interaction between malloc_check and address
sanitizer.
* pw/test-malloc-with-sanitize-address:
tests: make SANITIZE=address imply TEST_NO_MALLOC_CHECK
"git rebase --keep-base <upstream> <branch-to-rebase>" computed the
commit to rebase onto incorrectly, which has been corrected.
* ah/rebase-keep-base-fix:
rebase: use correct base for --keep-base when a branch is given
This allows seeing the current intermediate status without adding a new
good or bad commit:
$ git bisect log | tail -1
# status: waiting for bad commit, 1 good commit known
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 73c6de06af ("bisect: don't use invalid oid as rev when
starting") changes the behaviour of `git bisect` to consider invalid
oids as pathspecs again, as in the old shell implementation.
While that behaviour may be desirable, it can also cause confusion. For
example, while bisecting in a particular repo I encountered this:
$ git bisect start d93ff48803f0 v6.3
$
...which led to me sitting for a few moments, wondering why there's no
printout stating the first rev to check.
It turns out that the tag was actually "6.3", not "v6.3", and thus the
bisect was still silently started with only a bad rev, because
d93ff48803f0 was a valid oid and "v6.3" was silently considered to be a
pathspec.
While this behaviour may be desirable, it can be confusing, especially
with different repo conventions either using or not using "v" before
release names, or when a branch name or tag is simply misspelled on the
command line.
In order to avoid situations like this, make it more clear what we're
waiting for:
$ git bisect start d93ff48803f0 v6.3
status: waiting for good commit(s), bad commit known
We already have good output once the bisect process has begun in
earnest, so we don't need to do anything more there.
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
6cd80496e9 ("gitk: Resize panes correctly when reducing window size",
2020-10-03) introduces a mechanism to record previously-set sash
positions to make sure that correct values are used while computing
resize proportions. However, if we are not using ttk, then sash
represents only the x coordinate and the recorded sash (`oldsash`) only
includes the x coordinate. When we need to access the y coordinate via
the recorded sash position, we generate the following Application Error
popup:
Error: expected integer but got ""
expected integer but got ""
expected integer but got ""
while executing
"$win sash place 0 $sash0 [lindex $s0 1]"
(procedure "resizeclistpanes" line 38)
invoked from within
"resizeclistpanes .tf.histframe.pwclist 2818"
(command bound to event)
To fix this, if we are not using ttk, we append the sash positions with
the y coordinates before recording them to match the use_ttk case.
Signed-off-by: Halil Sen <halil.sen@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
The progress meter of "git blame" was showing incorrect numbers
when processing only parts of the file.
* ea/progress-partial-blame:
blame: report correct number of lines in progress when using ranges
Reimplement "vimdiff[123]" mergetool drivers with a more generic
layout mechanism.
* fr/vimdiff-layout:
mergetools: add description to all diff/merge tools
vimdiff: add tool documentation
vimdiff: integrate layout tests in the unit tests framework ('t' folder)
vimdiff: new implementation with layout support
Various cleanups to "git p4".
* jh/p4-various-fixups: (22 commits)
git-p4: sort imports
git-p4: seperate multiple statements onto seperate lines
git-p4: move inline comments to line above
git-p4: only seperate code blocks by a single empty line
git-p4: compare to singletons with "is" and "is not"
git-p4: normalize indentation of lines in conditionals
git-p4: ensure there is a single space around all operators
git-p4: ensure every comment has a single #
git-p4: remove spaces between dictionary keys and colons
git-p4: remove redundant backslash-continuations inside brackets
git-p4: remove extraneous spaces before function arguments
git-p4: place a single space after every comma
git-p4: removed brackets when assigning multiple return values
git-p4: remove spaces around default arguments
git-p4: remove padding from lists, tuples and function arguments
git-p4: sort and de-duplcate pylint disable list
git-p4: remove commented code
git-p4: convert descriptive class and function comments into docstrings
git-p4: improve consistency of docstring formatting
git-p4: indent with 4-spaces
...
The performance of the "untracked cache" feature has been improved
when "--untracked-files=<mode>" and "status.showUntrackedFiles"
are combined.
* tk/untracked-cache-with-uall:
untracked-cache: support '--untracked-files=all' if configured
untracked-cache: test untracked-cache-bypassing behavior with -uall
When unpacking trees, set the default sparsity of the resultant index based
on repo settings and 'is_sparse_index_allowed()'.
Normally, when executing 'unpack_trees', the output index is marked sparse
when (and only when) it unpacks a sparse directory. However, an index may be
"sparse" even if it contains no sparse directories - when all files fall
inside the sparse-checkout definition or otherwise have SKIP_WORKTREE
disabled. Therefore, the output index may be marked "full" even when it is
"sparse", resulting in unnecessary 'ensure_full_index' calls when writing to
disk. Avoid this by setting the "default" index sparsity to match what is
expected for the repository.
As a consequence of this fix, the (non-merge) 'read-tree' performed when
applying a stash with untracked files no longer expands the index. Update
the corresponding test in 't1092'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'stash' to use 'merge_ort_nonrecursive()' to apply a stash to the
current working tree. When 'git stash apply' was converted from its shell
script implementation to a builtin in 8a0fc8d19d (stash: convert apply to
builtin, 2019-02-25), 'merge_recursive_generic()' was used to merge a stash
into the working tree as part of 'git stash (apply|pop)'. However, with the
single merge base used in 'do_apply_stash()', the commit wrapping done by
'merge_recursive_generic()' is not only unnecessary, but misleading (the
*real* merge base is labeled "constructed merge base"). Therefore, a
non-recursive merge of the working tree, stashed tree, and stash base tree
is more appropriate.
There are two options for a non-recursive merge-then-update-worktree
function: 'merge_trees()' and 'merge_ort_nonrecursive()'. Use
'merge_ort_nonrecursive()' to align with the default merge strategy used by
'git merge' (6a5fb96672 (Change default merge backend from recursive to ort,
2021-08-04)) and, because merge-ort does not operate in-place on the index,
avoid unnecessary index expansion. Update tests in 't1092' verifying index
expansion for 'git stash' accordingly.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the index read in 'do_read_index()' does not exist on-disk, mark the
index "sparse" if the executing command does not require a full index and
sparse index is otherwise enabled.
Some commands (such as 'git stash -u') implicitly create a new index (when
the 'GIT_INDEX_FILE' variable points to a non-existent file) and perform
some operation on it. However, when this index is created, it isn't created
with the same sparsity settings as the repo index. As a result, while these
indexes may be sparse during the operation, they are always expanded before
being written to disk. We can avoid that expansion by defaulting the index
to "sparse", in which case it will only be expanded if the full index is
needed.
Note that the function 'set_new_index_sparsity()' is created despite having
only a single caller because additional callers will be added in a
subsequent patch.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose 'is_sparse_index_allowed()' publicly so that it may be used by
callers outside of 'sparse-index.c'. While no such callers exist yet, it
will be used in a subsequent commit.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.
With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.
Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests verifying expected 'git stash' behavior in
't1092-sparse-checkout-compatibility'. These cases establish the expected
behavior of 'git stash' in a sparse-checkout and verify consistency both
with and without a sparse index. Although no sparse index compatibility has
been integrated into 'git stash' yet, the tests are all 'expect_success' -
we don't want the cone-mode sparse-checkout behavior to change depending on
whether it is using a sparse index or not. Therefore, we expect these tests
to continue passing once sparse index is integrated with 'git stash'.
Additionally, add performance test cases for 'git stash' both with and
without untracked files. Note that, unlike the other tests in
'p2000-sparse-operations.sh', the tests added for 'stash' are combination
operations. This is done to ensure the stash/unstash is not blocked by the
modification of '$SPARSE_CONE/a' performed as part of 'test_perf_on_all'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git remote -v` (`--verbose`) lists down the names of remotes along with
their URLs. It would be beneficial for users to also specify the filter
types for promisor remotes. Something like this -
origin remote-url (fetch) [blob:none]
origin remote-url (push)
Teach `git remote -v` to also specify the filters for promisor remotes.
Closes: https://github.com/gitgitgadget/git/issues/1211
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`detect-compiler` has accumulated a few compiler dependent workarounds
lately for the more and more ubiquitious gcc12. This is intended to make
CI set-ups work across tool-chain updates, but also help those
developers who build with `DEVELOPER=1`.
Alas, `detect-compiler` uses the locale dependent output of `$(CC) -v`
to parse for the version string, which fails unless it literally
contains ` version`.
Use `LANG=C $(CC) -v` instead to grep for stable output.
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of a bogus and over-eager coccinelle rule.
source: <xmqq1qxd6e4x.fsf@gitster.g>
* jc/cocci-xstrdup-or-null-fix:
cocci: drop bogus xstrdup_or_null() rule
"git format-patch <args> -- <pathspec>" lost the pathspec when
showing the second and subsequent commits, which has been
corrected.
source: <c36896a1-6247-123b-4fa3-b7eb24af1897@web.de>
* rs/format-patch-pathspec-fix:
2.36 format-patch regression fix
"git fast-export -- <pathspec>" lost the pathspec when showing the
second and subsequent commits, which has been corrected.
source: <2c988c7b-0efe-4222-4a43-8124fe1a9da6@web.de>
* rs/fast-export-pathspec-fix:
2.36 fast-export regression fix
"git show <commit1> <commit2>... -- <pathspec>" lost the pathspec
when showing the second and subsequent commits, which has been
corrected.
source: <xmqqo80j87g0.fsf_-_@gitster.g>
* jc/show-pathspec-fix:
2.36 show regression fix
Regression fix for 2.36 where "git name-rev" started to sometimes
reference strings after they are freed.
This fixes a regression in 2.36 and is slate to go to 2.36.1
source: <340c8810-d912-7b18-d46e-a9d43f20216a@web.de>
* rs/name-rev-fix-free-after-use:
Revert "name-rev: release unused name strings"
"diff-tree --stdin" has been broken for about a year, but 2.36
release broke it even worse by breaking running the command with
<pathspec>, which in turn broke "gitk" and got noticed. This has
been corrected by aligning its behaviour to that of "log".
This fixes a regression in 2.36 and is slate to go to 2.36.1
source: <xmqq7d7bsu2n.fsf@gitster.g>
* jc/diff-tree-stdin-fix:
2.36 gitk/diff-tree --stdin regression fix
"git submodule update" without pathspec should silently skip an
uninitialized submodule, but it started to become noisy by mistake.
This fixes a regression in 2.36 and is slate to go to 2.36.1
source: <pull.1258.v2.git.git.1650890741430.gitgitgadget@gmail.com>
* gc/submodule-update-part2:
submodule--helper: fix initialization of warn_if_uninitialized
The example was not in valid JSON format due to a duplicate key "sid".
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We append an ellipsis and enclose it in backticks to indicate that it is
a function elsewhere, let's also use that here.
While at it, ensure the same for `waitpid()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-p4 is designed to run correctly under python2.7 and python3, but
its functional behavior wrt importing user-entered text differs across
these environments:
Under python2, git-p4 "naively" writes the Perforce bytestream into git
metadata (and does not set an "encoding" header on the commits); this
means that any non-utf-8 byte sequences end up creating invalidly-encoded
commit metadata in git.
Under python3, git-p4 attempts to decode the Perforce bytestream as utf-8
data, and fails badly (with an unhelpful error) when non-utf-8 data is
encountered.
Perforce clients (especially p4v) encourage user entry of changelist
descriptions (and user full names) in OS-local encoding, and store the
resulting bytestream to the server unmodified - such that different
clients can end up creating mutually-unintelligible messages. The most
common inconsistency, in many Perforce environments, is likely to be utf-8
(typical in linux) vs cp-1252 (typical in windows).
Make the changelist-description- and user-fullname-handling code
python-runtime-agnostic, introducing three "strategies" selectable via
config:
- 'passthrough', behaving as previously under python2,
- 'strict', behaving as previously under python3, and
- 'fallback', favoring utf-8 but supporting a secondary encoding when
utf-8 decoding fails, and finally escaping high-range bytes if the
decoding with the secondary encoding also fails.
Keep the python2 default behavior as-is ('legacy' strategy), but switch
the python3 default strategy to 'fallback' with default fallback encoding
'cp1252'.
Also include tests exercising these encoding strategies, documentation for
the new config, and improve the user-facing error messages when decoding
does fail.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The path taken by "git multi-pack-index" command from the end user
was compared with path internally prepared by the tool withut first
normalizing, which lead to duplicated paths not being noticed,
which has been corrected.
* ds/midx-normalize-pathname-before-comparison:
cache: use const char * for get_object_directory()
multi-pack-index: use --object-dir real path
midx: use real paths in lookup_multi_pack_index()
"git clone --origin X" leaked piece of memory that held value read
from the clone.defaultRemoteName configuration variable, which has
been plugged.
* jc/clone-remote-name-leak-fix:
clone: plug a miniscule leak
"git format-patch <args> -- <pathspec>" lost the pathspec when
showing the second and subsequent commits, which has been
corrected.
* rs/format-patch-pathspec-fix:
2.36 format-patch regression fix
"git fast-export -- <pathspec>" lost the pathspec when showing the
second and subsequent commits, which has been corrected.
* rs/fast-export-pathspec-fix:
2.36 fast-export regression fix
"git show <commit1> <commit2>... -- <pathspec>" lost the pathspec
when showing the second and subsequent commits, which has been
corrected.
* jc/show-pathspec-fix:
2.36 show regression fix
Add a coccinelle semantic patch necessary to reinforce the git coding style
guideline:
"Do not explicitly compute an integral value with constant 0 or '\ 0', or a
pointer value with constant NULL."
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
13092a91 (cocci: refactor common patterns to use xstrdup_or_null(),
2016-10-12) introduced a rule to rewrite this conditional call to
xstrdup(E) and an assignment to variable V:
- if (E)
- V = xstrdup(E);
into an unconditional call to xstrdup_or_null(E) and an assignment
to variable V:
+ V = xstrdup_or_null(E);
which is utterly bogus. The original code may already have an
acceptable value in V and the conditional assignment may be to
improve the value already in V with a copy of a better value E when
(and only when) E is not NULL.
The rewritten construct unconditionally discards the existing value
of V and replaces it with a copy of E, even when E is NULL, which
changes the meaning of the program.
By the way, if it were
-if (E && !V)
- V = xstrdup(E);
+V = xstrdup_or_null(E);
it would probably have been correct. But there is no existing code
that would have been improved by such a rule, so let's just remove
the bogus one without replacing with the more specific one.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The remote_name variable is first assigned a copy of the value of
the "clone.defaultremotename" configuration variable and then by the
value of the "--origin" command line option. The former is prepared
to see multiple instances of the configuration variable by freeing
the current value of the variable before a copy of the newly
discovered value gets assigned to it. The latter however blindly
assigned a copy of the new value to the variable, thereby leaking
the value read from the configuration variable.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
e900d494dc (diff: add an API for deferred freeing, 2021-02-11) added a
way to allow reusing diffopts: the no_free bit. 244c27242f (diff.[ch]:
have diff_free() call clear_pathspec(opts.pathspec), 2022-02-16) made
that mechanism mandatory.
git fast-export doesn't set no_free, so path limiting stopped working
after the first commit. Set the flag and add a basic test to make sure
only changes to the specified files are exported.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
e900d494dc (diff: add an API for deferred freeing, 2021-02-11) added a
way to allow reusing diffopts: the no_free bit. 244c27242f (diff.[ch]:
have diff_free() call clear_pathspec(opts.pathspec), 2022-02-16) made
that mechanism mandatory.
git format-patch only sets no_free when --output is given, causing it to
forget pathspecs after the first commit. Set no_free unconditionally
instead.
The existing test was unable to detect this breakage because it checks
stderr for the absence of a certain string, but format-patch writes to
stdout. Also the test was not checking the case of one commit modifying
multiple files and a pathspec limiting the diff. Replace it with a more
thorough one.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This only surfaced as a regression after 2.36 release, but the
breakage was already there with us for at least a year.
e900d494 (diff: add an API for deferred freeing, 2021-02-11)
introduced a mechanism to delay freeing resources held in
diff_options struct that need to be kept as long as the struct will
be reused to compute diff. "git log -p" was taught to utilize the
mechanism but it was done with an incorrect assumption that the
underlying helper function, cmd_log_walk(), is called only once,
and it is OK to do the freeing at the end of it.
Alas, for "git show A B", the function is called once for each
commit given, so it is not OK to free the resources until we finish
calling it for all the commits given from the command line.
During 2.36 release cycle, we started clearing the <pathspec> as
part of this freeing, which made the bug a lot more visible.
Fix this breakage by tweaking how cmd_log_walk() frees the resources
at the end and using a variant of it that does not immediately free
the resources to show each commit object from the command line in
"git show".
Protect the fix with a few new tests.
Reported-by: Daniel Li <dan@danielyli.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some "simple" centralized workflows, users expect remote tracking
branch names to match local branch names. "git push" pushes to the
remote version/instance of the branch, and "git pull" pulls any changes
to the remote branch (changes made by the same user in another place, or
by other users).
This expectation is supported by the push.default default option "simple"
which refuses a default push for a mismatching tracking branch name, and
by the new branch.autosetupmerge option, "simple", which only sets up
remote tracking for same-name remote branches.
When a new branch has been created by the user and has not yet been
pushed (and push.default is not set to "current"), the user is prompted
with a "The current branch %s has no upstream branch" error, and
instructions on how to push and add tracking.
This error is helpful in that following the advice once per branch
"resolves" the issue for that branch forever, but inconvenient in that
for the "simple" centralized workflow, this is always the right thing to
do, so it would be better to just do it.
Support this workflow with a new config setting, push.autoSetupRemote,
which will cause a default push, when there is no remote tracking branch
configured, to push to the same-name on the remote and --set-upstream.
Also add a hint offering this new option when the "The current branch %s
has no upstream branch" error is encountered, and add corresponding tests.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With "push.default=current" configured, a simple "git push" will push to
the same-name branch on the current branch's branch.<name>.pushRemote, or
remote.pushDefault, or origin. If none of these are defined, the push will
fail with error "fatal: No configured push destination".
The same "default to origin if no config" behavior applies with
"push.default=matching".
Other commands use "origin" as a default when there are multiple options,
but default to the single remote when there is only one - for example,
"git checkout <something>". This "assume the single remote if there is
only one" behavior is more friendly/useful than a defaulting behavior
that only uses the name "origin" no matter what.
Update "git push" to also default to the single remote (and finally fall
back to "origin" as default if there are several), for
"push.default=current" and for other current and future remote-defaulting
push behaviors.
This change also modifies the behavior of ls-remote in a consistent way,
so defaulting not only supplies 'origin', but any single configured remote
also.
Document the change in behavior, correct incorrect assumptions in related
tests, and add test cases reflecting this new single-remote-defaulting
behavior.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the default push.default option, "simple", beginners are
protected from accidentally pushing to the "wrong" branch in
centralized workflows: if the remote tracking branch they would push
to does not have the same name as the local branch, and they try to do
a "default push", they get an error and explanation with options.
There is a particular centralized workflow where this often happens:
a user branches to a new local topic branch from an existing
remote branch, eg with "checkout -b feature1 origin/master". With
the default branch.autosetupmerge configuration (value "true"), git
will automatically add origin/master as the upstream tracking branch.
When the user pushes with a default "git push", with the intention of
pushing their (new) topic branch to the remote, they get an error, and
(amongst other things) a suggestion to run "git push origin HEAD".
If they follow this suggestion the push succeeds, but on subsequent
default pushes they continue to get an error - so eventually they
figure out to add "-u" to change the tracking branch, or they spelunk
the push.default config doc as proposed and set it to "current", or
some GUI tooling does one or the other of these things for them.
When one of their coworkers later works on the same topic branch,
they don't get any of that "weirdness". They just "git checkout
feature1" and everything works exactly as they expect, with the shared
remote branch set up as remote tracking branch, and push and pull
working out of the box.
The "stable state" for this way of working is that local branches have
the same-name remote tracking branch (origin/feature1 in this
example), and multiple people can work on that remote feature branch
at the same time, trusting "git pull" to merge or rebase as required
for them to be able to push their interim changes to that same feature
branch on that same remote.
(merging from the upstream "master" branch, and merging back to it,
are separate more involved processes in this flow).
There is a problem in this flow/way of working, however, which is that
the first user, when they first branched from origin/master, ended up
with the "wrong" remote tracking branch (different from the stable
state). For a while, before they pushed (and maybe longer, if they
don't use -u/--set-upstream), their "git pull" wasn't getting other
users' changes to the feature branch - it was getting any changes from
the remote "master" branch instead (a completely different class of
changes!)
An experienced git user might say "well yeah, that's what it means to
have the remote tracking branch set to origin/master!" - but the
original user above didn't *ask* to have the remote master branch
added as remote tracking branch - that just happened automatically
when they branched their feature branch. They didn't necessarily even
notice or understand the meaning of the "set up to track 'origin/master'"
message when they created the branch - especially if they are using a
GUI.
Looking at how to fix this, you might think "OK, so disable auto setup
of remote tracking - set branch.autosetupmerge to false" - but that
will inconvenience the *second* user in this story - the one who just
wanted to start working on the topic branch. The first and second
users swap roles at different points in time of course - they should
both have a sane configuration that does the right thing in both
situations.
Make this "branches have the same name locally as on the remote"
workflow less painful / more obvious by introducing a new
branch.autosetupmerge option called "simple", to match the same-name
"push.default" option that makes similar assumptions.
This new option automatically sets up tracking in a *subset* of the
current default situations: when the original ref is a remote tracking
branch *and* has the same branch name on the remote (as the new local
branch name).
Update the error displayed when the 'push.default=simple' configuration
rejects a mismatching-upstream-name default push, to offer this new
branch.autosetupmerge option that will prevent this class of error.
With this new configuration, in the example situation above, the first
user does *not* get origin/master set up as the tracking branch for
the new local branch. If they "git pull" in their new local-only
branch, they get an error explaining there is no upstream branch -
which makes sense and is helpful. If they "git push", they get an
error explaining how to push *and* suggesting they specify
--set-upstream - which is exactly the right thing to do for them.
This new option is likely not appropriate for users intentionally
implementing a "triangular workflow" with a shared upstream tracking
branch, that they "git pull" in and a "private" feature branch that
they push/force-push to just for remote safe-keeping until they are
ready to push up to the shared branch explicitly/separately. Such
users are likely to prefer keeping the current default
merge.autosetupmerge=true behavior, and change their push.default to
"current".
Also extend the existing branch tests with three new cases testing
this option - the obvious matching-name and non-matching-name cases,
and also a non-matching-ref-type case. The matching-name case needs to
temporarily create an independent repo to fetch from, as the general
strategy of using the local repo as the remote in these tests
precludes locally branching with the same name as in the "remote".
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Regression fix for 2.36 where "git name-rev" started to sometimes
reference strings after they are freed.
* rs/name-rev-fix-free-after-use:
Revert "name-rev: release unused name strings"
"diff-tree --stdin" has been broken for about a year, but 2.36
release broke it even worse by breaking running the command with
<pathspec>, which in turn broke "gitk" and got noticed. This has
been corrected by aligning its behaviour to that of "log".
* jc/diff-tree-stdin-fix:
2.36 gitk/diff-tree --stdin regression fix
"git submodule update" without pathspec should silently skip an
uninitialized submodule, but it started to become noisy by mistake.
* gc/submodule-update-part2:
submodule--helper: fix initialization of warn_if_uninitialized
The description of 'safe.directory' mentions that it's respected in
the system and global configs, and ignored in the repository config
and on the command line, but it doesn't mention whether it's respected
or ignored when specified via environment variables (nor does the
commit message adding 'safe.directory' [1]).
Clarify that 'safe.directory' is ignored when specified in the
environment, and add tests to make sure that it remains so.
[1] 8959555cee (setup_git_directory(): add an owner check for the
top-level directory, 2022-03-02)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to the documentation 'safe.directory' "is only respected
when specified in a system or global config, not when it is specified
in a repository config or via the command line option -c
safe.directory=<path>".
Add tests to check that 'safe.directory' in the repository config or
on the command line is indeed ignored.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 8959555cee (setup_git_directory(): add an owner check for the
top-level directory, 2022-03-02) when git finds itself in a repository
owned by someone else, it aborts with a "fatal: unsafe repository
(<repo path>)" error message and an advice about how to set the
'safe.directory' config variable to mark that repository as safe.
't0033-safe-directory.sh' contains tests that check that this feature
and handling said config work as intended. To ensure that git dies
for the right reason, several of those tests check that its standard
error contains the name of that config variable, but:
- it only appears in the advice part, not in the actual error
message.
- it is interpreted as a regexp by 'grep', so, because of the dot,
it matches the name of the test script and the path of the trash
directory as well. Consequently, these tests could be fooled by
any error message that would happen to include the path of the
test repository.
Tighten these checks to look for "unsafe repository" instead.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not obvious that the 'git rev-parse' builtin would use the sparse
index, but it is possible to parse paths out of the index using the
":<path>" syntax. The 'git rev-parse' output is only the OID of the
object found at that location, but otherwise behaves similarly to 'git
show :<path>'. This includes the failure conditions on directories and
the error messages depending on whether a path is in the worktree or
not.
The only code change required is to change the
command_requires_full_index setting in builtin/rev-parse.c, and we can
re-use many existing 'git show' tests for the rev-parse case.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git show :<path>' where '<path>' is a directory, then
there is a subtle difference between a full checkout and a sparse
checkout. The error message from diagnose_invalid_index_path() reports
whether the path is on disk or not. The full checkout will have the
directory on disk, but the path will not be in the index. The sparse
checkout could have the directory not exist, specifically when that
directory is outside of the sparse-checkout cone.
In the case of a sparse index, we have yet another state: the path can
be a sparse directory in the index. In this case, the error message from
diagnose_invalid_index_path() would erroneously say "path '<path>' is in
the index, but not at stage 0", which is false.
Add special casing around sparse directory entries so we get to the
correct error message. This requires two checks in order to get parity
with the normal sparse-checkout case.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_oid_with_context_1() method is used when parsing revision
arguments. One particular case is to take a ":<path>" string and search
the index for the given path.
In the case of a sparse index, this might find a sparse directory entry,
in which case the contained object is a tree. In the case of a full
index, this search within the index would fail.
In order to maintain identical return state as in a full index, inspect
the discovered cache entry to see if it is a sparse directory and reject
it. This requires being careful around the only_to_die option to be sure
we die only at the correct time.
This changes the behavior of 'git show :<sparse-dir>', but does not
bring it entirely into alignment with a full index case. It specifically
hits the wrong error message within diagnose_invalid_index_path(). That
error message will be corrected in a future change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git show' command can take an input to request the state of an
object in the index. This can lead to parsing the index in order to load
a specific file entry. Without the change presented here, a sparse index
would expand to a full one, taking much longer than usual to access a
simple file.
There is one behavioral change that happens here, though: we now can
find a sparse directory entry within the index! Commands that previously
failed because we could not find an entry in the worktree or index now
succeed because we _do_ find an entry in the index.
There might be more work to do to make other situations succeed when
looking for an indexed tree, perhaps by looking at or updating the
cache-tree extension as needed. These situations include having a full
index or asking for a directory that is within the sparse-checkout cone
(and hence is not a sparse directory entry in the index).
For now, we demonstrate how the sparse index integration is extremely
simple for files outside of the cone as well as directories within the
cone. A later change will resolve this behavior around sparse
directories.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The .warn_if_uninitialized member was introduced by 48308681
(git submodule update: have a dedicated helper for cloning,
2016-02-29) to submodule_update_clone struct and initialized to
false. When c9911c93 (submodule--helper: teach update_data more
options, 2022-03-15) moved it to update_data struct, it started
to initialize it to true but this change was not explained in
its log message.
The member is set to true only when pathspec was given, and is
used when a submodule that matched the pathspec is found
uninitialized to give diagnostic message. "submodule update"
without pathspec is supposed to iterate over all submodules
(i.e. without pathspec limitation) and update only the
initialized submodules, and finding uninitialized submodules
during the iteration is a totally expected and normal thing that
should not be warned.
[jc: added tests]
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This only surfaced as a regression after 2.36 release, but the
breakage was already there with us for at least a year.
The diff_free() call is to be used after we completely finished with
a diffopt structure. After "git diff A B" finishes producing
output, calling it before process exit is fine. But there are
commands that prepares diff_options struct once, compares two sets
of paths, releases resources that were used to do the comparison,
then reuses the same diff_option struct to go on to compare the next
two sets of paths, like "git log -p".
After "git log -p" finishes showing a single commit, calling it
before it goes on to the next commit is NOT fine. There is a
mechanism, the .no_free member in diff_options struct, to help "git
log" to avoid calling diff_free() after showing each commit and
instead call it just one. When the mechanism was introduced in
e900d494 (diff: add an API for deferred freeing, 2021-02-11),
however, we forgot to do the same to "diff-tree --stdin", which *is*
a moral equivalent to "git log".
During 2.36 release cycle, we started clearing the pathspec in
diff_free(), so programs like gitk that runs
git diff-tree --stdin -- <pathspec>
downstream of a pipe, processing one commit after another, started
showing irrelevant comparison outside the given <pathspec> from the
second commit. The same commit, by forgetting to teach the .no_free
mechanism, broke "diff-tree --stdin -I<regexp>" and nobody noticed
it for over a year, presumably because it is so seldom used an
option.
But <pathspec> is a different story. The breakage was very
prominently visible and was reported immediately after 2.36 was
released.
Fix this breakage by mimicking how "git log" utilizes the .no_free
member so that "diff-tree --stdin" behaves more similarly to "log".
Protect the fix with a few new tests.
Reported-by: Matthias Aßhauer <mha1993@live.de>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_object_directory() method returns the exact string stored at
the_repository->objects->odb->path. The return type of "char *" implies
that the caller must keep track of the buffer and free() it when
complete. This causes significant problems later when the ODB is
accessed.
Use "const char *" as the return type to avoid this confusion. There are
no current callers that care about the non-const definition.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --object-dir argument to 'git multi-pack-index' allows a user to
specify an alternate to use instead of the local $GITDIR. This is used
by third-party tools like VFS for Git to maintain the pack-files in a
"shared object cache" used by multiple clones.
On Windows, the user can specify a path using a Windows-style file path
with backslashes such as "C:\Path\To\ObjectDir". This same path style is
used in the .git/objects/info/alternates file, so it already matches the
path of that alternate. However, find_odb() converts these paths to
real-paths for the comparison, which use forward slashes. As of the
previous change, lookup_multi_pack_index() uses real-paths, so it
correctly finds the target multi-pack-index when given these paths.
Some commands such as 'git multi-pack-index repack' call child processes
using the object_dir value, so it can be helpful to convert the path to
the real-path before sending it to those locations.
Add a callback to convert the real path immediately upon parsing the
argument. We need to be careful that we don't store the exact value out
of get_object_directory() and free it, or we could corrupt a later use
of the_repository->objects->odb->path.
We don't use get_object_directory() for the initial instantiation in
cmd_multi_pack_index() because we need 'git multi-pack-index -h' to work
without a Git repository.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helper looks for a parsed multi-pack-index whose object directory
matches the given object_dir. Before going into the loop over the parsed
multi-pack-indexes, it calls find_odb() to ensure that the given
object_dir is actually a known object directory.
However, find_odb() uses real-path manipulations to compare the input to
the alternate directories. This same real-path comparison is not used in
the loop, leading to potential issues with the strcmp().
Update the method to use the real-path values instead.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning directly from a local repository, we load a list of refs
based on scanning the $GIT_DIR/refs/ directory of the "server"
repository. If files exist in that directory that do not parse as
hexadecimal hashes, then the ref array used by write_remote_refs()
ends up with some entries with null OIDs. This causes us to hit a BUG()
statement in ref_transaction_create():
BUG: create called without valid new_oid
This BUG() call used to be a die() until 033abf97f (Replace all
die("BUG: ...") calls by BUG() ones, 2018-05-02). Before that, the die()
was added by f04c5b552 (ref_transaction_create(): check that new_sha1 is
valid, 2015-02-17).
The original report for this bug [1] mentioned that this problem did not
exist in Git 2.27.0. The failure bisects unsurprisingly to 968f12fda
(refs: turn on GIT_REF_PARANOIA by default, 2021-09-24). When
GIT_REF_PARANOIA is enabled, this case always fails as far back as I am
able to successfully compile and test the Git codebase.
[1] https://github.com/git-for-windows/git/issues/3781
There are two approaches to consider here. One would be to remove this
BUG() statement in favor of returning with an error. There are only two
callers to ref_transaction_create(), so this would have a limited
impact.
The other approach would be to add special casing in 'git clone' to
avoid this faulty input to the method.
While I originally started with changing 'git clone', I decided that
modifying ref_transaction_create() was a more complete solution. This
prevents failing with a BUG() statement when we already have a good way
to report an error (including a reason for that error) within the
method. Both callers properly check the return value and die() with the
error message, so this is an appropriate direction.
The added test helps check against a regression, but does check that our
intended error message is handled correctly.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 2d53975488.
3656f84278 (name-rev: prefer shorter names over following merges,
2021-12-04) broke the assumption of 2d53975488 (name-rev: release unused
name strings, 2020-02-04) that a better name for a child is a better
name for all of its ancestors as well, because it added a penalty for
generation > 0. This leads to strings being free(3)'d that are still
needed.
079f970971 (name-rev: sort tip names before applying, 2020-02-05)
already reduced the number of free(3) calls for the use case that
motivated the original patch (name-rev --all in the Chromium repository)
from ca. 44000 to 5, and 3656f84278 eliminated even those few. So this
revert won't affect name-rev's performance on that particular repo.
Reported-by: Thomas Hurst <tom@hur.st>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--since=<time>" option of "git log" limits the commits displayed by
the command by stopping the traversal once it sees a commit whose
timestamp is older than the given time and not digging further into its
parents.
This is OK in a history where a commit always has a newer timestamp than
any of its parents'. Once you see a commit older than the given <time>,
all ancestor commits of it are even older than the time anyway. It
poses, however, a problem when there is a commit with a wrong timestamp
that makes it appear older than its parents. Stopping traversal at the
"incorrectly old" commit will hide its ancestors that are newer than
that wrong commit and are newer than the cut-off time given with the
--since option. --max-age and --after being the synonyms to --since,
they share the same issue.
Add a new "--since-as-filter" option that is a variant of
"--since=<time>". Instead of stopping the traversal to hide an old
enough commit and its all ancestors, exclude commits with an old
timestamp from the output but still keep digging the history.
Without other traversal stopping options, this will force the command in
"git log" family to dig down the history to the root. It may be an
acceptable cost for a small project with short history and many commits
with screwy timestamps.
It is quite unlikely for us to add traversal stopper other than since,
so have this as a --since-as-filter option, rather than a separate
--as-filter, that would be probably more confusing.
Signed-off-by: Miklos Vajna <vmiklos@vmiklos.hu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in 707d2f2fe8 (CI: use "$runs_on_pool", not
"$jobname" to select packages & config, 2021-11-23).
In that commit I changed CC=gcc from CC=gcc-9, but on OSX the "gcc" in
$PATH points to clang, we need to use gcc-9 instead. Likewise for the
linux-gcc job CC=gcc-8 was changed to the implicit CC=gcc, which would
select GCC 9.4.0 instead of GCC 8.4.0.
Furthermore in 25715419bf (CI: don't run "make test" twice in one
job, 2021-11-23) when the "linux-TEST-vars" job was split off from
"linux-gcc" the "cc_package: gcc-8" line was copied along with
it, so its "cc_package" line wasn't working as intended either.
As a table, this is what's changed by this commit, i.e. it only
affects the linux-gcc, linux-TEST-vars and osx-gcc jobs:
|-------------------+-----------+-------------------+-------+-------|
| jobname | vector.cc | vector.cc_package | old | new |
|-------------------+-----------+-------------------+-------+-------|
| linux-clang | clang | - | clang | clang |
| linux-sha256 | clang | - | clang | clang |
| linux-gcc | gcc | gcc-8 | gcc | gcc-8 |
| osx-clang | clang | - | clang | clang |
| osx-gcc | gcc | gcc-9 | clang | gcc-9 |
| linux-gcc-default | gcc | - | gcc | gcc |
| linux-TEST-vars | gcc | gcc-8 | gcc | gcc-8 |
|-------------------+-----------+-------------------+-------+-------|
Reported-by: Carlo Arenas <carenas@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the wording for a couple paragraphs in two different manuals
relating to sparse behavior.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we have no current plans to actually remove --no-cone mode, we
think users would be better off not using it. Update the documentation
accordingly, including explaining why we think non-cone mode is
problematic for users.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "Internals -- Cone Pattern Set" section starts off discussing
patterns, despite the fact that cone mode is about avoiding the
patterns. This made sense back when non-cone mode was the default and
we started by discussing the full pattern set, but now that we are
changing the default, it makes more sense to discuss cone-mode first and
avoid the full discussion of patterns. Split this section into two, the
first with details about how cone mode operates, and the second
following the full pattern set section and discussing how the cone mode
patterns used under the hood relate to the full pattern set.
While at it, flesh out the "Internals -- Full Pattern Set" section a bit
to include more examples as well.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since many users like to learn from examples, provide a section in the
manual with example commands that would be used and a brief explanation
of what each does.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With cone mode as the default, it makes sense to discuss it before
non-cone mode. Also, the new default means we can just use directories
in most cases and users do not need to understand patterns or their
meanings. Let's take advantage of this to mark several sections as
"INTERNALS", notifying the user that they do not need to know all those
details in order to make use of the sparse-checkout command.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'init' subcommand of sparse-checkout was deprecated in ba2f3f58ac
("git-sparse-checkout.txt: update to document init/set/reapply changes",
2021-12-14), but a couple places in the manual still assumed it was the
primary way to use sparse-checkout. Correct them.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that cone mode is the default, we'd like to focus on the arguments
to set/add being directories rather than patterns, and it probably makes
sense to provide an earlier heads up that files from leading directories
get included as well.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make cone mode the default, and update the documentation accordingly.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an explicit --no-cone to several sparse-checkout invocations in
preparation for changing the default to cone mode.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "--current" is given to "git show-branch" running in the
"--reflog" mode, the code tries to reference a "reflog" message
that does not even exist. This is because the --current is not
prepared to work in that mode.
The reason "--current" exists is to support this request:
I list branches on the command line. These are the branchesI
care about and I use as anchoring points. I may or may not be on
one of these main branches. Please make sure I can view the
commits on the current branch with respect to what is in these
other branches.
And to serve that request, the code checks if the current branch is
among the ones listed on the command line, and adds it only if it is
not to the end of one array, which essentially lists the objects.
The reflog mode additionally uses another array to list reflog
messages, which the "--current" code does not add to. This leaves
one uninitialized slot at the end of the array of reflog messages,
and causes the program to show garbage or segfault.
Catch the unsupported (and meaningless) combination and exit with a
usage error.
There are other combinations of options that are incompatible but
have not been tested. Add test to cover them while adding coverage
for this new combination.
Reported-by: Gregory David <gregory.david@p1sec.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This document gathers tips, scripts and configuration file to help
people working on Git’s codebase use their favorite tools while
following Git’s coding style.
Move the part about Emacs configuration from CodingGuidelines to
ToolsForGit.txt because it's the purpose of the new file centralize the
information about tools.
But, add a mention to Documentation/ToolsForGit.txt in CodingGuidelines
because there is also information about the coding style in it.
Helped-by: Matthieu Moy <Matthieu.Moy@univ-lyon1.fr>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: COGONI Guillaume <cogoni.guillaume@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--keep-base rebases onto the merge base of the given upstream and the
current HEAD regardless of whether a branch is given. This is contrary
to the documentation and to the option's intended purpose. Instead,
rebase onto the merge base of the given upstream and the given branch.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git uses temporary files to pass the contents of blobs to external diff
programs and textconv filters. It calls mks_tempfile_ts() to create
them, which puts them all in the same directory. This requires adding
a random name prefix.
Use mks_tempfile_dt() instead, which allows the files to have arbitrary
names, each in their own separate temporary directory. This way they
can have the same basename as the original blob, which looks nicer in
graphical diff programs.
The test in t4020 to check the prettiness of the temporary paths was
neutered by 5476bdf0e8 (diff tests: don't ignore "git diff" exit code in
"read" loop, 2022-03-07), which removed its grep check without replacing
it with an equivalent test_cmp check. Add one that only checks the
basename of the temporary file and nothing else.
And make the test more robust while at it, by using test_when_finished
to get rid of the added file even if the test fails.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function to create a temporary file with a certain name in a
temporary directory created using mkdtemp(3). Its result is more
sightly than the paths created by mks_tempfile_ts(), which include
a random prefix. That's useful for files passed to a program that
displays their name, e.g. an external diff tool.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two reasons that we could return NULL early within
load_commit_graph_chain():
1. The file does not exist, so the file pointer is NULL.
2. The file exists, but is too small to contain a single hash.
These were grouped together when the function was first written in
5c84b3396 (commit-graph: load commit-graph chains, 2019-06-18) in order
to simplify how the 'chain_name' string is freed. However, the current
code leaves a narrow window where the file pointer is not closed when
the file exists, but is rejected for being too small.
Split out these cases separately to ensure we close the file in this
case.
Signed-off-by: Kleber Tarcísio <klebertarcisio@yahoo.com.br>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is an if statement where both if and else have the same
assignment of options.type to REBASE_MERGE. Simplify
it by getting that assigmnent out of the if.
Signed-off-by: Edmundo Carmona Antoranz <eantoranz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A couple of work around for CI breaking warnings from gcc 12.
* cb/buggy-gcc-12-workaround:
config.mak.dev: alternative workaround to gcc 12 warning in http.c
config.mak.dev: workaround gcc 12 bug affecting "pedantic" CI job
This provides a "no code change needed" option to the "fix" currently
queued as part of ab/http-gcc-12-workaround and therefore should be
reverted once that gets merged.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally noticed by Peff[1], but yet to be corrected[2] and planned to
be released with Fedora 36 (scheduled for Apr 19).
dir.c: In function ‘git_url_basename’:
dir.c:3085:13: error: ‘memchr’ specified bound [9223372036854775808, 0] exceeds maximum object size 9223372036854775807 [-Werror=stringop-overread]
3085 | if (memchr(start, '/', end - start) == NULL
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fedora is used as part of the CI, and therefore that release will trigger
failures, unless the version of the image used is locked to an older
release, as an alternative.
Restricting the flag to the affected source file, as well as implementing
an independent facility to track these workarounds was specifically punted
to minimize the risk of introducing problems so close to a release.
This change should be reverted once the underlying gcc bug is solved and
which should be visible by NOT triggering a warning, otherwise.
[1] https://lore.kernel.org/git/YZQhLh2BU5Hquhpo@coredump.intra.peff.net/
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2075786
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
1214aa841b (reftable: add blocksource, an abstraction for random
access reads, 2021-10-07), makes the assumption that it is ok to
free a reftable_block pointing to NULL if the size is also set to
0, but implements that using a memset call that at least in glibc
based system will trigger a runtime exception if called with a
NULL pointer as its first parameter.
Avoid doing so by adding a conditional to check for the size in all
three identically looking functions that were affected, and therefore,
still allow memset to help catch callers that might incorrectly pass
a NULL pointer with a non zero size, but avoiding the exception for
the valid cases.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Revert the "deletion of a ref should not trigger transaction events
for loose and packed ref backends separately" that regresses the
behaviour when a ref is not modified since it was packed.
* jc/revert-ref-transaction-hook-changes:
RelNotes: revert the description on the reverted topics
Revert "fetch: increase test coverage of fetches"
Revert "Merge branch 'ps/avoid-unnecessary-hook-invocation-with-packed-refs'"
* update the following words translations: commit, untracked, stage,
cache, stash, work..., index, reset, label, check..., tags, graft,
alternate object, amend, ancestor, cherry-pick, bisect, blame, chain,
cache, bug, chunk, branch, bundle, clean, clone, commit-graph, commit
object, commit-ish, committer, cover letter, conflict, dangling,
detach, dir, dumb, fast-forward, file system, fixup, fork, fetch, Git
archive, gitdir, graft, replace ref
* correct some mispellings
* git-po-helper update
* remove some obsolete lines
* unfuzzy entries
* random translation updates
* update contact in pt_PT.po
* add the following words to the translation table: override, recurse,
print, offset, unbundle, mirror repository, multi-pack, bad,
whitespace, batch
* remove the following words of the translation table: core Git
* change the following words on the translation table: dry-run, apply,
patch, replay, blame, chain, gitdir, file system, fork, unset, handle
* some translation to the first person
* update copyright text
* word 'utilização:' to 'uso:'
* word 'pai' to 'parente'
Signed-off-by: Daniel Santos <dacs.git@brilhante.top>
Add a TODO comment indicating that we should release "diffopt" in
release_revisions(). In a preceding commit we started releasing the
"pruning" member of the same type, but handling "diffopt" will require
us to untangle the "no_free" conditions I added in e900d494dc (diff:
add an API for deferred freeing, 2021-02-11).
Let's leave a TODO comment to that effect, and so that we don't forget
refactor code that was changed to use release_revisions() in earlier
commits to stop using the "diffopt" member after a call to
release_revisions(). This works currently, but would become a logic
error as soon as we started freeing "diffopt". Doing that change now
doesn't harm anything, and future-proofs us against a later change to
release_revisions().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the existing reset_topo_walk() into a thin wrapper for a
release_revisions_topo_walk_info() + resetting the member to "NULL",
and call release_revisions_topo_walk_info() from release_revisions().
This fixes memory leaks that have been with us ever since
"topo_walk_info" was added to revision.[ch] in
f0d9cc4196 (revision.c: begin refactoring --topo-order logic,
2018-11-01).
Due to various other leaks this makes no tests pass in their entirety,
but e.g. before this running this on git.git:
./git -P log --pretty=tformat:"%P %H | %s" --parents --full-history --topo-order -3 -- README.md
Would report under SANITIZE=leak:
SUMMARY: LeakSanitizer: 531064 byte(s) leaked in 6 allocation(s).
Now we'll free all of that memory.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"date_mode" in the "struct ref_info".
This uses the date_mode_release() function added in 974c919d36 (date
API: add and use a date_mode_release(), 2022-02-16). As that commit
notes "t7004-tag.sh" tests for the leaks that are being fixed
here. That test now fails "only" 44 tests, instead of the 46 it failed
before this change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call diff_free() on the "pruning" member of "struct rev_info". Doing
so makes several tests pass under SANITIZE=leak.
This was also the last missing piece that allows us to remove the
UNLEAK() in "cmd_diff" and "cmd_diff_index", which allows us to use
those commands as a canary for general leaks in the revisions API. See
[1] for further rationale, and 886e1084d7 (builtin/: add UNLEAKs,
2017-10-01) for the commit that added the UNLEAK() there.
1. https://lore.kernel.org/git/220218.861r00ib86.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a missing reflog_walk_info_release() to "reflog-walk.c" and use it
in release_revisions().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clear the "boundary_commits" object_array in release_revisions(). This
makes a few more tests pass under SANITIZE=leak, including
"t/t4126-apply-empty.sh" which started failed as an UNLEAK() in
cmd_format_patch() was removed in a preceding commit.
This also re-marks the various tests relying on "git format-patch" as
passing under "SANITIZE=leak", in the preceding "revisions API users:
use release_revisions() in builtin/log.c" commit those were marked as
failing as we removed the UNLEAK(rev) from cmd_format_patch() in
"builtin/log.c".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"prune_data" in the "struct rev_info". This means that any code that
calls "release_revisions()" already can get rid of adjacent calls to
clear_pathspec().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"grep_filter" in the "struct rev_info".This allows us to mark a test
as passing under "TEST_PASSES_SANITIZE_LEAK=true".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"filter" in the "struct rev_info". This in combination with a
preceding change to free "cmdline" means that we can mark another set
of tests as passing under "TEST_PASSES_SANITIZE_LEAK=true".
The "filter" member was added recently in ffaa137f64 (revision: put
object filter into struct rev_info, 2022-03-09), and this fixes leaks
intruded in the subsequent leak 7940941de1 (pack-objects: use
rev.filter when possible, 2022-03-09) and 105c6f14ad (bundle: parse
filter capability, 2022-03-09).
The "builtin/pack-objects.c" leak in 7940941de1 was effectively with
us already, but the variable was referred to by a "static" file-scoped
variable. The "bundle.c " leak in 105c6f14ad was newly introduced
with the new "filter" feature for bundles.
The "t5600-clone-fail-cleanup.sh" change here to add
"TEST_PASSES_SANITIZE_LEAK=true" is one of the cases where
run-command.c in not carrying the abort() exit code upwards would have
had that test passing before, but now it *actually* passes[1]. We
should fix the lack of 1=1 mapping of SANITIZE=leak testing to actual
leaks some other time, but it's an existing edge case, let's just mark
the really-passing test as passing for now.
1. https://lore.kernel.org/git/220303.86fsnz5o9w.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"cmdline" in the "struct rev_info". This in combination with a
preceding change to free "commits" and "mailmap" means that we can
whitelist another test under "TEST_PASSES_SANITIZE_LEAK=true".
There was a proposal in [1] to do away with xstrdup()-ing this
add_rev_cmdline(), perhaps that would be worthwhile, but for now let's
just free() it.
We could also make that a "char *" in "struct rev_cmdline_entry"
itself, but since we own it let's expose it as a constant to outside
callers. I proposed that in [2] but have since changed my mind. See
14d30cdfc0 (ref-filter: fix memory leak in `free_array_item()`,
2019-07-10), c514c62a4f (checkout: fix leak of non-existent branch
names, 2020-08-14) and other log history hits for "free((char *)" for
prior art.
This includes the tests we had false-positive passes on before my
6798b08e84 (perl Git.pm: don't ignore signalled failure in
_cmd_close(), 2022-02-01), now they pass for real.
Since there are 66 tests matching t/t[0-9]*git-svn*.sh it's easier to
list those that don't pass than to touch most of those 66. So let's
introduce a "TEST_FAILS_SANITIZE_LEAK=true", which if set in the tests
won't cause lib-git-svn.sh to set "TEST_PASSES_SANITIZE_LEAK=true.
This change also marks all the tests that we removed
"TEST_FAILS_SANITIZE_LEAK=true" from in an earlier commit due to
removing the UNLEAK() from cmd_format_patch(), we can now assert that
its API use doesn't leak any "struct rev_info" memory.
This change also made commit "t5503-tagfollow.sh" pass on current
master, but that would regress when combined with
ps/fetch-atomic-fixup's de004e848a (t5503: simplify setup of test
which exercises failure of backfill, 2022-03-03) (through no fault of
that topic, that change started using "git clone" in the test, which
has an outstanding leak). Let's leave that test out for now to avoid
in-flight semantic conflicts.
1. https://lore.kernel.org/git/YUj%2FgFRh6pwrZalY@carlos-mbp.lan/
2. https://lore.kernel.org/git/87o88obkb1.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"mailmap" in the "struct rev_info".
The log family of functions now calls the clear_mailmap() function
added in fa8afd18e5a (revisions API: provide and use a
release_revisions(), 2021-09-19), allowing us to whitelist some tests
with "TEST_PASSES_SANITIZE_LEAK=true".
Unfortunately having a pointer to a mailmap in "struct rev_info"
instead of an embedded member that we "own" get a bit messy, as can be
seen in the change to builtin/commit.c.
When we free() this data we won't be able to tell apart a pointer to a
"mailmap" on the heap from one on the stack. As seen in
ea57bc0d41 (log: add --use-mailmap option, 2013-01-05) the "log"
family allocates it on the heap, but in the find_author_by_nickname()
code added in ea16794e43 (commit: search author pattern against
mailmap, 2013-08-23) we allocated it on the stack instead.
Ideally we'd simply change that member to a "struct string_list
mailmap" and never free() the "mailmap" itself, but that would be a
much larger change to the revisions API.
We have code that needs to hand an existing "mailmap" to a "struct
rev_info", while we could change all of that, let's not go there
now.
The complexity isn't in the ownership of the "mailmap" per-se, but
that various things assume a "rev_info.mailmap == NULL" means "doesn't
want mailmap", if we changed that to an init'd "struct string_list
we'd need to carefully refactor things to change those assumptions.
Let's instead always free() it, and simply declare that if you add
such a "mailmap" it must be allocated on the heap. Any modern libc
will correctly panic if we free() a stack variable, so this should be
safe going forward.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"commits" in the "struct rev_info".
We don't expect to use this "struct rev_info" again, so there's no
reason to NULL out revs->commits, as e.g. simplify_merges() and
create_boundary_commit_list() do.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use release_revisions() for users of "struct rev_list" that reach into
the "struct rev_info" and clear the "prune_data" already.
In a subsequent commit we'll teach release_revisions() to clear this
itself, but in the meantime let's invoke release_revisions() here to
clear anything else we may have missed, and for reasons of having
consistent boilerplate.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use a release_revisions() with those "struct rev_list" users which
already "UNLEAK" the struct. It may seem odd to simultaneously attempt
to free() memory, but also to explicitly ignore whether we have memory
leaks in the same.
As explained in preceding commits this is being done to use the
built-in commands as a guinea pig for whether the release_revisions()
function works as expected, we'd like to test e.g. whether we segfault
as we change it. In subsequent commits we'll then remove these
UNLEAK() as the function is made to free the memory that caused us to
add them in the first place.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for having the "log" family of functions make wider use
of release_revisions() let's have them call it just before
exiting. This changes the "log", "whatchanged", "show",
"format-patch", etc. commands, all of which live in this file.
The release_revisions() API still only frees the "pending" member, but
will learn to release more members of "struct rev_info" in subsequent
commits.
In the case of "format-patch" revert the addition of UNLEAK() in
dee839a263 (format-patch: mark rev_info with UNLEAK, 2021-12-16),
which will cause several tests that previously passed under
"TEST_PASSES_SANITIZE_LEAK=true" to start failing.
In subsequent commits we'll now be able to use those tests to check
whether that part of the API is really leaking memory, and will fix
all of those memory leaks. Removing the UNLEAK() allows us to make
incremental progress in that direction. See [1] for further details
about this approach.
Note that the release_revisions() will not be sufficient to deal with
the code in cmd_show() added in 5d7eeee2ac (git-show: grok blobs,
trees and tags, too, 2006-12-14) which clobbers the "pending" array in
the case of "OBJ_COMMIT". That will need to be dealt with by some
future follow-up work.
1. https://lore.kernel.org/git/220218.861r00ib86.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the case of cmd_main() in http-push.c we need to move the
deceleration of the "struct rev-list" into the loop over the
"remote_refs" when adding a release_revisions().
We'd previously set up the "revs" for each remote, but would
potentially leak memory on each one.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a release_revisions() to various users of "struct rev_info" which
requires a minor refactoring to a "goto cleanup" pattern to use that
function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the initialization of the "revision" member of "struct
stash_info" to be initialized vi a macro, and more importantly that
that initializing function be tasked to free it, usually via a "goto
cleanup" pattern.
Despite the "revision" name (and the topic of the series containing
this commit) the "stash info" has nothing to do with the "struct
rev_info". I'm making this change because in the subsequent commit
when we do want to free the "struct rev_info" via a "goto cleanup"
pattern we'd otherwise free() uninitialized memory in some cases, as
we only strbuf_init() the string in get_stash_info().
So while it's not the smallest possible change, let's convert all
users of this pattern in the file while we're at it.
A good follow-up to this change would be to change all the "ret = -1;
goto done;" in this file to instead use a "goto cleanup", and
initialize "int ret = -1" at the start of the relevant functions. That
would allow us to drop a lot of needless brace verbosity on two-line
"if" statements, but let's leave that alone for now.
To ensure that there's a 1=1 mapping between owners of the "struct
stash_info" and free_stash_info() change the assert_stash_ref()
function to be a trivial get_stash_info_assert() wrapper. The caller
will call free_stash_info(), and by returning -1 we'll eventually (via
!!ret) exit with status 1 anyway.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use release_revisions() to various users of "struct rev_list" which
need to have their "struct rev_info" zero-initialized before we can
start using it.
For the bundle.c code see the early exit case added in
3bbbe467f2 (bundle verify: error out if called without an object
database, 2019-05-27).
For the relevant bisect.c code see 45b6370812 (bisect: libify
`check_good_are_ancestors_of_bad` and its dependents, 2020-02-17).
For the submodule.c code see the "goto" on "(!left || !right || !sub)"
added in 8e6df65015 (submodule: refactor show_submodule_summary with
helper function, 2016-08-31).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A subsequent commit will add "REV_INFO_INIT" macro adjacent to
repo_init_revisions(), unfortunately between the "struct rev_info"
itself and that function we've added various miscellaneous code
between the two over the years.
Let's move that code either lower in revision.h, giving it API docs
while we're at it, or in cases where it wasn't public API at all move
it into revision.c No lines of code are changed here, only moved
around. The only changes are the addition of new API comments.
The "tree_difference" variable could also be declared like this, which
I think would be a lot clearer, but let's leave that for now to keep
this a move-only change:
static enum {
REV_TREE_SAME,
REV_TREE_NEW, /* Only new files */
REV_TREE_OLD, /* Only files removed */
REV_TREE_DIFFERENT, /* Mixed changes */
} tree_difference = REV_TREE_SAME;
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a release_revisions() to various users of "struct rev_list" in
those straightforward cases where we only need to add the
release_revisions() call to the end of a block, and don't need to
e.g. refactor anything to use a "goto cleanup" pattern.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The users of the revision.[ch] API's "struct rev_info" are a major
source of memory leaks in the test suite under SANITIZE=leak, which in
turn adds a lot of noise when trying to mark up tests with
"TEST_PASSES_SANITIZE_LEAK=true".
The users of that API are largely one-shot, e.g. "git rev-list" or
"git log", or the "git checkout" and "git stash" being modified here
For these callers freeing the memory is arguably a waste of time, but
in many cases they've actually been trying to free the memory, and
just doing that in a buggy manner.
Let's provide a release_revisions() function for these users, and
start migrating them over per the plan outlined in [1]. Right now this
only handles the "pending" member of the struct, but more will be
added in subsequent commits.
Even though we only clear the "pending" member now, let's not leave a
trap in code like the pre-image of index_differs_from(), where we'd
start doing the wrong thing as soon as the release_revisions() learned
to clear its "diffopt". I.e. we need to call release_revisions() after
we've inspected any state in "struct rev_info".
This leaves in place e.g. clear_pathspec(&rev.prune_data) in
stash_working_tree() in builtin/stash.c, subsequent commits will teach
release_revisions() to free "prune_data" and other members that in
some cases are individually cleared by users of "struct rev_info" by
reaching into its members. Those subsequent commits will remove the
relevant calls to e.g. clear_pathspec().
We avoid amending code in index_differs_from() in diff-lib.c as well
as wt_status_collect_changes_index(), has_unstaged_changes() and
has_uncommitted_changes() in wt-status.c in a way that assumes that we
are already clearing the "diffopt" member. That will be handled in a
subsequent commit.
1. https://lore.kernel.org/git/87a6k8daeu.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add and apply coccinelle rules to remove "if (E)" before
"free_commit_list(E)", the function can accept NULL, and further
change cases where "E = NULL" followed to also be unconditionally.
The code changes in this commit were entirely made by the coccinelle
rule being added here, and applied with:
make contrib/coccinelle/free.cocci.patch
patch -p1 <contrib/coccinelle/free.cocci.patch
The only manual intervention here is that the the relevant code in
commit.c has been manually re-indented.
Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix two memory leaks in "struct rev_info" by freeing that memory in
cmd_format_patch(). These two are unusual special-cases in being in
the "struct rev_info", but not being "owned" by the code in
revision.c. I.e. they're members of the struct so that this code in
"builtin/log.c" can conveniently pass information code in
"log-tree.c".
See e.g. the make_cover_letter() caller of log_write_email_headers()
here in "builtin/log.c", and [1] for a demonstration of where the
"extra_headers" and "ref_message_ids" struct members are used.
See 20ff06805c (format-patch: resurrect extra headers from config,
2006-06-02) and d1566f7883 (git-format-patch: Make the second and
subsequent mails replies to the first, 2006-07-14) for the initial
introduction of "extra_headers" and "ref_message_ids".
We can count on repo_init_revisions() memset()-ing this data to 0
however, so we can count on it being either NULL or something we
allocated. In the case of "extra_headers" let's add a local "char *"
variable to hold it, to avoid the eventual cast from "const char *"
when we free() it.
1. https://lore.kernel.org/git/220401.868rsoogxf.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow-up on the introduction of string_list_init_nodup() and
string_list_init_dup() in the series merged in bd4232fac3 (Merge
branch 'ab/struct-init', 2021-07-16) and convert code that implicitly
relied on xcalloc() being equivalent to the initializer to use
xmalloc() and string_list_init_{no,}dup() instead.
In the case of get_unmerged() in merge-recursive.c we used the
combination of xcalloc() and assigning "1" to "strdup_strings" to get
what we'd get via string_list_init_dup(), let's use that instead.
Adjacent code in cmd_format_patch() will be changed in a subsequent
commit, since we're changing that let's change the other in-tree
patterns that do the same. Let's also convert a "x == NULL" to "!x"
per our CodingGuidelines, as we need to change the "if" line anyway.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend a freeing pattern added in 0906ac2b54 (blame: use changed-path
Bloom filters, 2020-04-16) to use a "goto cleanup", so that we can be
sure that we call cleanup_scoreboard().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak that's been with us since f9500261e0 (fast-rebase:
write conflict state to working tree, index, and HEAD, 2021-05-20)
changed this code to move these strbuf_release() into an if/else
block.
We'll also add to "reflog_msg" in the "else" arm of the "if" block
being modified here, and we'll append to "branch_msg" in both
cases. But after f9500261e0 only the "if" block would free these two
"struct strbuf".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Directly invoking make coverage-report as a target results in an error because
its prerequisites are missing,
This patch adds the compile-test prerequisite, which is run only once each time
the compile-report target is invoked. In practice, the developer may decide to
review the coverage-report results without necessarily rerunning for this
coverage-test, if it has already been run.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not have to guess how common the mistake the change targets is
when describing it. Such an argument may be good while proposing a
change, but does not quite belong in the record of what has already
happened, i.e. a release note.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 2a0cafd464,
as it expects a working "a ref deletion must produce a single
transaction, not one for loose and another for packed" topic,
which we do not have.
* 'master' of github.com:git/git: (25 commits)
Git 2.36-rc2
i18n: fix some badly formatted i18n strings
Git 2.36-rc1
t9902: split test to run on appropriate systems
ls-tree doc: document interaction with submodules
Documentation: add --batch-command to cat-file synopsis
git-ls-tree.txt: fix the name of "%(objectsize:padded)"
submodule-helper: fix usage string
doc: replace "--" with {litdd} in credential-cache/fsmonitor
contrib/scalar: fix 'all' target in Makefile
Documentation/Makefile: fix "make info" regression in dad9cd7d51
configure.ac: fix HAVE_SYNC_FILE_RANGE definition
git-compat-util: really support openssl as a source of entropy
ls-tree: `-l` should not imply recursive listing
Git 2.35.2
Git 2.34.2
Git 2.33.2
Git 2.32.1
Git 2.31.2
Git 2.30.3
...
Use test_path_is_file() instead of 'test -f' for better debugging
information.
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
String in submodule--helper is not correctly formatting
placeholders. The string in git-send-email is partial.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the address sanitizer checks for a superset of the issues detected
by setting MALLOC_CHECK_ (which tries to detect things like double
frees and off-by-one errors) there is no need to set the latter when
compiling with -fsanitize=address.
This fixes a regression introduced by 131b94a10a ("test-lib.sh: Use
GLIBC_TUNABLES instead of MALLOC_CHECK_ on glibc >= 2.34", 2022-03-04)
which causes all the tests to fail with the message
ASan runtime does not come first in initial library list;
you should either link runtime to your application or
manually preload it with LD_PRELOAD.
when git is compiled with SANITIZE=address on systems with glibc >=
2.34. I have tested SANITIZE=leak and SANITIZE=undefined and they do
not suffer from this regression so the fix in this patch should be
sufficient.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check if git grep works around the PCRE2 big fixed by their e0c6029
(Fix inifinite loop when a single byte newline is searched in JIT.,
2020-05-29), which affects version 10.35 and earlier.
Searching for leading whitespace also triggers the endless loop.
Set a one-second alarm to abort in case we do get hit by the bug, to
avoid having to wait forever for the test result.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "FUNNYNAMES" test prerequisite passes on Cygwin, as the Cygwin
file system interface has a workaround for the underlying operating
system's lack of support for tabs, newlines or quotes. However, it does
not add support for backslash, which is treated as a directory
separator, meaning one of the tests added by 48803821b1 ("completion:
handle unusual characters for sparse-checkout", 2022-02-07) will fail on
Cygwin.
To avoid this failure while still getting maximal test coverage, split
that test into two: test handling of paths that include tabs on anything
that has the FUNNYNAMES prerequisite, but skip testing handling of paths
that include backslashes unless both FUNNYNAMES is set and the system is
not Cygwin.
It might be nice to have more granularity than "FUNNYNAMES" and its
sibling "FUNNIERNAMES" provide, so that tests could be run based on
specific individual characters supported by the file system being
tested, but that seems like it would make the prerequisite checks in
this area much more verbose for very little gain.
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The warning about converting line endings is extremely confusing. Its
two sentences each use the word "will" without specifying a timeframe,
which makes it sound like both sentences are referring to the same
timeframe. On top of that, it uses the term "original line endings"
without saying whether "original" means LF or CRLF.
Rephrase the warning to be clear about when the line endings will be
changed and what they will be changed to.
On a platform whose native line endings are not CRLF (e.g. Linux), the
"git add" step in the following sequence triggers the warning in
question:
$ git config core.autocrlf true
$ echo 'Hello world!' >hello.txt
$ git add hello.txt
warning: LF will be replaced by CRLF in hello.txt
The file will have its original line endings in your working directory
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ls-tree documentation had never been updated after it learned to
interact with submodules to explicitly mention them. The initial
support was added in f35a6d3bce (Teach core object handling functions
about gitlinks, 2007-04-09). E.g. the discussion of --long added in
f35a6d3bce (Teach core object handling functions about gitlinks,
2007-04-09) didn't explicitly mention them.
But this documentation added in 455923e0a1 (ls-tree: introduce
"--format" option, 2022-03-23) had no such excuse, and was actively
misleading by providing an exhaustive but incomplete list of object
types we'd emit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The externalConsole=true setting is broken for many users (launching the
debugger with such setting results in VS Code waiting forever without
actually starting the debugger). Also, this setting is a matter of user
preference, and is arguably better set in a "launch" section in the
user-wide settings.json than hardcoded in our script. Remove the line to
use VS Code's default, or the user's setting.
Add useful links in contrib/vscode/README.md to help the user to
configure VS Code and how to use the debugging feature.
Helped-by: Matthieu Moy <Matthieu.Moy@univ-lyon1.fr>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Co-authored-by: BRESSAT Jonathan <git.jonathan.bressat@gmail.com>
Signed-off-by: COGONI Guillaume <cogoni.guillaume@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
440c705ea6 (cat-file: add --batch-command mode, 2022-02-18) added
the new option and operating mode without listing it to the synopsis
section. Fix it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 455923e0a1 ("ls-tree: introduce "--format" option", 2022-03-23)
introduced `--format` and the various placeholders it can take, such as
%(objectname) and %(objectsize).
At some point when that patch was being developed, those placeholders
had shorter names, e.g., %(name) and %(size), which can be seen in the
commit message of 455923e0a1. One instance of "%(size:padded)" also
managed to enter the documentation in the final version of the patch.
Correct it to "%(objectsize:padded)".
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The missing space at the end of the line makes the closing square
bracket sticking to the dash in the next line
Found during localisation v2.36.0 round 1
Signed-off-by: Fangyi Zhou <me@fangyi.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Asciidoc renders `--` as em-dash. This is not appropriate for command
names. It also breaks linkgit links to these commands.
Fix git-credential-cache--daemon and git-fsmonitor--daemon. The latter
was added 3248486920 (fsmonitor: document builtin fsmonitor, 2022-03-25)
and included several links. A check for broken links in the HTML docs
turned this up.
Manually inspecting the other Documentation/git-*--*.txt files turned up
the issue in git-credential-cache--daemon.
While here, quote `git credential-cache--daemon` in the synopsis to
match the vast majority of our other documentation.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git ls-tree" learns "--oid-only" option, similar to "--name-only",
and more generalized "--format" option.
source: <cover.1648026472.git.dyroneteng@gmail.com>
* tl/ls-tree-oid-only:
ls-tree: `-l` should not imply recursive listing
Tests that affect the repo in stateful ways are easier to write if we
can run setup steps outside of the measured portion of perf iteration.
This change adds a "--setup 'setup-script'" parameter to test_perf. To
make invocations easier to understand, I also moved the prerequisites to
a new --prereq parameter.
The setup facility will be used in the upcoming perf tests for batch
mode, but it already helps in some existing tests, like t5302 and t7820.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add basic performance tests for git commands that can add data to the
object database. We cover:
* git add
* git stash
* git update-index (via git stash)
* git unpack-objects
* git commit --all
We cover all currently available fsync methods as well.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add test cases to exercise batch mode for:
* 'git add'
* 'git stash'
* 'git update-index'
* 'git unpack-objects'
These tests ensure that the added data winds up in the object database.
In this change we introduce a new test helper lib-unique-files.sh. The
goal of this library is to create a tree of files that have different
oids from any other files that may have been created in the current test
repo. This helps us avoid missing validation of an object being added
due to it already being in the repo.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several tests use awk to parse OIDs from the output of 'git ls-files
--stage' and 'git ls-tree'. Introduce helpers to centralize these uses
of awk.
Update t5317-pack-objects-filter-objects.sh to use the new ls-files
helper so that it has some usages to review. Other updates are left for
the future.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows has defaulted to core.fsyncObjectFiles=true since
September 2017. We turn on syncing of loose object files with batch mode
in upstream Git so that we can get broad coverage of the new code
upstream.
We don't actually do fsyncs in the most of the test suite, since
GIT_TEST_FSYNC is set to 0. However, we do exercise all of the
surrounding batch mode code since GIT_TEST_FSYNC merely makes the
maybe_fsync wrapper always appear to succeed.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unpack-objects functionality is used by fetch, push, and fast-import
to turn the transfered data into object database entries when there are
fewer objects than the 'unpacklimit' setting.
By enabling an odb-transaction when unpacking objects, we can take advantage
of batched fsyncs.
Here are some performance numbers to justify batch mode for
unpack-objects, collected on a WSL2 Ubuntu VM.
Fsync Mode | Time for 90 objects (ms)
-------------------------------------
Off | 170
On,fsync | 760
On,batch | 230
Note that the default unpackLimit is 100 objects, so there's a 3x
benefit in the worst case. The non-batch mode fsync scales linearly
with the number of objects, so there are significant benefits even with
smaller numbers of objects.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The update-index functionality is used internally by 'git stash push' to
setup the internal stashed commit.
This change enables odb-transactions for update-index infrastructure to
speed up adding new objects to the object database by leveraging the
batch fsync functionality.
There is some risk with this change, since under batch fsync, the object
files will be in a tmp-objdir until update-index is complete, so callers
using the --stdin option will not see them until update-index is done.
This risk is mitigated by flushing the ODB transaction prior to
reporting any verbose output so that objects will be visible to callers
that are synchronizing with update-index by snooping its output.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The add_files_to_cache function is invoked internally by
builtin/commit.c and builtin/checkout.c for their flags that stage
modified files before doing the larger operation. These commands
can benefit from batched fsyncing.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Take advantage of the odb transaction infrastructure around writing the
cached tree to the object database.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When adding many objects to a repo with `core.fsync=loose-object`,
the cost of fsync'ing each object file can become prohibitive.
One major source of the cost of fsync is the implied flush of the
hardware writeback cache within the disk drive. This commit introduces
a new `core.fsyncMethod=batch` option that batches up hardware flushes.
It hooks into the bulk-checkin odb-transaction functionality, takes
advantage of tmp-objdir, and uses the writeout-only support code.
When the new mode is enabled, we do the following for each new object:
1a. Create the object in a tmp-objdir.
2a. Issue a pagecache writeback request and wait for it to complete.
At the end of the entire transaction when unplugging bulk checkin:
1b. Issue an fsync against a dummy file to flush the log and hardware
writeback cache, which should by now have seen the tmp-objdir writes.
2b. Rename all of the tmp-objdir files to their final names.
3b. When updating the index and/or refs, we assume that Git will issue
another fsync internal to that operation. This is not the default
today, but the user now has the option of syncing the index and there
is a separate patch series to implement syncing of refs.
On a filesystem with a singular journal that is updated during name
operations (e.g. create, link, rename, etc), such as NTFS, HFS+, or XFS
we would expect the fsync to trigger a journal writeout so that this
sequence is enough to ensure that the user's data is durable by the time
the git command returns. This sequence also ensures that no object files
appear in the main object store unless they are fsync-durable.
Batch mode is only enabled if core.fsync includes loose-objects. If
the legacy core.fsyncObjectFiles setting is enabled, but core.fsync does
not include loose-objects, we will use file-by-file fsyncing.
In step (1a) of the sequence, the tmp-objdir is created lazily to avoid
work if no loose objects are ever added to the ODB. We use a tmp-objdir
to maintain the invariant that no loose-objects are visible in the main
ODB unless they are properly fsync-durable. This is important since
future ODB operations that try to create an object with specific
contents will silently drop the new data if an object with the target
hash exists without checking that the loose-object contents match the
hash. Only a full git-fsck would restore the ODB to a functional state
where dataloss doesn't occur.
In step (1b) of the sequence, we issue a fsync against a dummy file
created specifically for the purpose. This method has a little higher
cost than using one of the input object files, but makes adding new
callers of this mechanism easier, since we don't need to figure out
which object file is "last" or risk sharing violations by caching the fd
of the last object file.
_Performance numbers_:
Linux - Hyper-V VM running Kernel 5.11 (Ubuntu 20.04) on a fast SSD.
Mac - macOS 11.5.1 running on a Mac mini on a 1TB Apple SSD.
Windows - Same host as Linux, a preview version of Windows 11.
Adding 500 files to the repo with 'git add' Times reported in seconds.
object file syncing | Linux | Mac | Windows
--------------------|-------|-------|--------
disabled | 0.06 | 0.35 | 0.61
fsync | 1.88 | 11.18 | 2.47
batch | 0.15 | 0.41 | 1.53
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it clearer in the naming and documentation of the plug_bulk_checkin
and unplug_bulk_checkin APIs that they can be thought of as
a "transaction" to optimize operations on the object database. These
transactions may be nested so that subsystems like the cache-tree
writing code can optimize their operations without caring whether the
top-level code has a transaction active.
Add a flush_odb_transaction API that will be used in update-index to
make objects visible even if a transaction is active. The flush call may
also be useful in future cases if we hold a transaction active around
calling hooks.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit prepares for adding batch-fsync to the bulk-checkin
infrastructure.
The bulk-checkin infrastructure is currently used to batch up addition
of large blobs to a packfile. When a blob is larger than
big_file_threshold, we unconditionally add it to a pack. If bulk
checkins are 'plugged', we allow multiple large blobs to be added to a
single pack until we reach the packfile size limit; otherwise, we simply
make a new packfile for each large blob. The 'unplug' call tells us when
the series of blob additions is done so that we can finish the packfiles
and make their objects available to subsequent operations.
Stated another way, bulk-checkin allows callers to define a transaction
that adds multiple objects to the object database, where the object
database can optimize its internal operations within the transaction
boundary.
Batched fsync will fit into bulk-checkin by taking advantage of the
plug/unplug functionality to determine the appropriate time to fsync
and make newly-added objects available in the primary object database.
* Rename 'state' variable to 'bulk_checkin_packfile', since we will
later be adding 'bulk_fsync_objdir'. This also makes the variable
easier to find in the debugger, since the name is more unique.
* Rename finish_bulk_checkin to flush_bulk_checkin_packfile and call it
unconditionally from unplug_bulk_checkin. Internally it will
conditionally do a flush if there's any work to do.
* Move the 'plugged' data member of 'bulk_checkin_state' into a separate
static variable. Doing this avoids resetting the variable in
finish_bulk_checkin when zeroing the 'bulk_checkin_state'. As-is, we
seem to unintentionally disable the plugging functionality the first
time a new packfile must be created due to packfile size limits. While
disabling the plugging state only results in suboptimal behavior for
the current code, it would be fatal for the bulk-fsync functionality
later in this patch series.
The net effect of these changes is to make a clear separation between
the portion of the bulk-checkin infrastructure that is related to the
packfile (nearly all of it at present) and the part that is related to
other future optimizations of the ODB.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perforce has a file type "utf8" which represents a text file with
explicit BOM. utf8-encoded files *without* BOM are stored as
regular file type "text". The "utf8" file type behaves like text
in all but one important way: it is stored, internally, without
the leading 3 BOM bytes.
git-p4 has historically imported utf8-with-BOM files (files stored,
in Perforce, as type "utf8") the same way as regular text files -
losing the BOM in the process.
Under most circumstances this issue has little functional impact,
as most systems consider the BOM to be optional and redundant, but
this *is* a correctness failure, and can have lead to practical
issues for example when BOMs are explicitly included in test files,
for example in a file encoding test suite.
Fix the handling of utf8-with-BOM files when importing changes from
p4 to git, and introduce a test that checks it is working correctly.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the --branch argument of the "sync" subcommand, git-p4 enables
you to import a perforce branch/path to an arbitrary git ref, using
a full ref path, or to refs/remotes/p4/* or refs/heads/p4/*,
depending on --import-local, using a short ref name.
However, when you later want to explicitly sync such a given ref to
pick up subsequent p4 changes, it only works if the ref was placed
in the p4 path *and* has only one path component (no "/").
This limitation results from a bad assumption in the
existing-branch sync logic, and also means you cannot individually
sync branches detected by --detect-branches, as these also get a
"/" in their names.
Fix "git p4 sync --branch", when called with an existing ref, so
that it works correctly regardless of whether the ref is in the p4
path or not, and (in the case of refs in the p4 path) regardless of
whether it has a "/" in its short name or not.
Also add tests to validate that these branch-specific syncs work
as expected.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add extra ':' to second 'all' target definition to allow 'scalar' to build.
Without this fix, the 'all:' and 'all::' targets together cause a build
failure when 'scalar' build is enabled with 'INCLUDE_SCALAR':
Makefile:14: *** target file `all' has both : and :: entries. Stop.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in my dad9cd7d51 (Makefile: move ".SUFFIXES" rule to
shared.mak, 2022-03-03). As explained in the GNU make documentation
for the $* variable, available at:
info make --index-search='$*'
This rule relied on ".texi" being in the default list of suffixes, as
seen at:
make -f/dev/null -p | grep -v -e ^# -e ^$|grep -F .SUFFIXES
The documentation explains what was going on here:
In an explicit rule, there is no stem; so '$*' cannot be determined
in that way. Instead, if the target name ends with a recognized
suffix (*note Old-Fashioned Suffix Rules: Suffix Rules.), '$*' is
set to the target name minus the suffix. For example, if the
target name is 'foo.c', then '$*' is set to 'foo', since '.c' is a
suffix. GNU 'make' does this bizarre thing only for compatibility
with other implementations of 'make'. You should generally avoid
using '$*' except in implicit rules or static pattern rules.
If the target name in an explicit rule does not end with a
recognized suffix, '$*' is set to the empty string for that rule.
I.e. this rule added back in 5cefc33bff (Documentation: add
gitman.info target, 2007-12-10) was resolving gitman.texi from
gitman.info. We can instead just use the more obvious $< variable
referring to the prerequisite.
This was the only use of $* in our Makefiles in an explicit rule, the
three remaining ones are all implicit rules, and therefore didn't
depend on the ".SUFFIXES" list.
Reported-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Tested-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove include "lockfile.h" from builtin/apply.c, which is orphaned
since 6d058c8826 (apply: move lockfile into `apply_state`, 2017-10-05)
Signed-off-by: Garrit Franke <garrit@slashdev.space>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove include "strvec.h" from serve.c, which is orphaned since
f0a35c9ce5 (serve: drop "keys" strvec, 2021-09-15)
Signed-off-by: Garrit Franke <garrit@slashdev.space>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If sync_file_range is not available when building the configure script,
there is a cosmetic bug when running that script reporting
"HAVE_SYNC_FILE_RANGE: command not found". Remove that error message by
defining HAVE_SYNC_FILE_RANGE to an empty string, rather than generating
a script where that appears as a bare command.
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
05cd988dce (wrapper: add a helper to generate numbers from a CSPRNG,
2022-01-17), configure openssl as the source for entropy in NON-STOP
but doesn't add the needed header or link options.
Since the only system that is configured to use openssl as a source
of entropy is NON-STOP, add the header unconditionally, and -lcrypto
to the list of external libraries.
An additional change is required to make sure a NO_OPENSSL=1 build
will be able to work as well (tested on Linux with a modified value
of CSPRNG_METHOD = openssl), and the more complex logic that allows
for compatibility with APPLE_COMMON_CRYPTO or allowing for simpler
ways to link (without libssl) has been punted for now.
Reported-by: Randall Becker <rsbecker@nexbridge.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using a named enum allows casting an integer to the enum type in both
GDB and LLDB:
$ gdb -q -ex 'b wt-status.c:44' -ex r --args ./git status
(gdb) p (enum color_wt_status) slot
$1 = WT_STATUS_ONBRANCH
$ lldb -o 'b wt-status.c:44' -o r -- ./git status
(lldb) p (color_wt_status) slot
(color_wt_status) $0 = WT_STATUS_ONBRANCH
In LLDB, it's also required to cast in the reversed direction, i.e.
cast an enum constant into its corresponding integer:
(lldb) p (int) color_wt_status::WT_STATUS_ONBRANCH
(int) $1 = 8
Name the enum listing the different RECURSE_SUBMODULES_* modes, to make
debugging easier. For example, when stepping through a part of the code
where an int is compared with a constant in this enum, it allows casting
the int to the enum type or vice-versa, after quickly checking where the
enum constant is declared and learning the enum name.
As to not make this patch a debug-only change, convert the
'fetch_recurse' member of 'struct submodule' to use the newly named
enum.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 9c4d58ff2c (ls-tree: split up "fast path" callbacks, 2022-03-23), a
refactoring of the various read_tree_at() callbacks caused us to
unconditionally recurse into directories if `-l` (long format) was
passed on the command line, regardless of whether or not we also pass
the `-r` (recursive) flag.
Fix this by making show_tree_long() return the value of `recurse`,
rather than always returning 1. This value is interpreted by
read_tree_at() to be a signal on whether or not to recurse.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the test 'cherry-pick after renaming branch', stop checking for
the presence of a file (opos) because we are going to "grep" in it in
the same test and the lack of it will be noticed as a failure anyway.
In the test 'revert after renaming branch', instead of allowing any
random contents as long as a known phrase is not there in it, we can
expect the exact outcome---after the successful revert of "added", the
contents of file "spoo" should become identical to what was in file
"oops" in the "initial" commit. This test also contains 'test -f' that
verifies presence of a file, but we have a helper function to do the same
thing. Replace it with appropriate helper function 'test_path_is_file'
for better readability and better error messages.
In both tests, we will not notice when "git rev-parse" starts segfaulting
without emitting any output. The 'test' command will end up being just
"test =", which yields success. Use the 'test_cmp_rev' helper to make
sure we will notice such a breakage.
Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dc6b1d92ca (wt-status: use settings from git_diff_ui_config, 2018-05-04)
disabled diffopt.break_opt for diffstats shown by git status and in
commit templates. For git status there isn't even a way to enable it.
Make the commit summary (shown after the commit) consistent by disabling
it there as well.
Reported-by: Laurent Lyaudet <laurent.lyaudet@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree list --porcelain" did not c-quote pathnames and lock
reasons with unsafe bytes correctly, which is worked around by
introducing NUL terminated output format with "-z".
* pw/worktree-list-with-z:
worktree: add -z option for list subcommand
Built-in fsmonitor (part 2).
* jh/builtin-fsmonitor-part2: (30 commits)
t7527: test status with untracked-cache and fsmonitor--daemon
fsmonitor: force update index after large responses
fsmonitor--daemon: use a cookie file to sync with file system
fsmonitor--daemon: periodically truncate list of modified files
t/perf/p7519: add fsmonitor--daemon test cases
t/perf/p7519: speed up test on Windows
t/perf/p7519: fix coding style
t/helper/test-chmtime: skip directories on Windows
t/perf: avoid copying builtin fsmonitor files into test repo
t7527: create test for fsmonitor--daemon
t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
help: include fsmonitor--daemon feature flag in version info
fsmonitor--daemon: implement handle_client callback
compat/fsmonitor/fsm-listen-darwin: implement FSEvent listener on MacOS
compat/fsmonitor/fsm-listen-darwin: add MacOS header files for FSEvent
compat/fsmonitor/fsm-listen-win32: implement FSMonitor backend on Windows
fsmonitor--daemon: create token-based changed path cache
fsmonitor--daemon: define token-ids
fsmonitor--daemon: add pathname classification
fsmonitor--daemon: implement 'start' command
...
Give hint when branch tracking cannot be established because fetch
refspecs from multiple remote repositories overlap.
* tk/ambiguous-fetch-refspec:
tracking branches: add advice to ambiguous refspec error
"git fetch --refetch" learned to fetch everything without telling
the other side what we already have, which is useful when you
cannot trust what you have in the local object store.
* rc/fetch-refetch:
docs: mention --refetch fetch option
fetch: after refetch, encourage auto gc repacking
t5615-partial-clone: add test for fetch --refetch
fetch: add --refetch option
builtin/fetch-pack: add --refetch option
fetch-pack: add refetch
fetch-negotiator: add specific noop initializer
"git am" can read from the standard input when no mailbox is given
on the command line, but the end-user gets no indication when it
happens, making Git appear stuck.
* jc/mailsplit-warn-on-tty:
am/apply: warn if we end up reading patches from terminal
A handful of obvious clean-ups around a topic that is already in
'master'.
* gc/branch-recurse-submodules-fix:
branch.c: simplify advice-and-die sequence
branch: rework comments for future developers
branch: remove negative exit code
branch --set-upstream-to: be consistent when advising
branch: give submodule updating advice before exit
branch: support more tracking modes when recursing
When creating a loose object file, we didn't report the exact
filename of the file we failed to fsync, even though the
information was readily available, which has been corrected.
* ns/fsync-or-die-message-fix:
object-file: pass filename to fsync_or_die
A couple of fix-up to a topic that is now in 'master'.
* ns/core-fsyncmethod:
core.fsyncmethod: correctly camel-case warning message
core.fsync: fix incorrect expression for default configuration
Work around AIX C compiler that does not seem to grok
initialization of a union member of a struct.
* ab/reftable-aix-xlc-12:
reftable: make assignments portable to AIX xlc v12.01
Move more "git submodule update" to C.
* gc/submodule-update-part2:
submodule--helper: remove forward declaration
submodule: move core cmd_update() logic to C
submodule--helper: reduce logic in run_update_procedure()
submodule--helper: teach update_data more options
builtin/submodule--helper.c: rename option struct to "opt"
submodule update: use die_message()
submodule--helper: run update using child process struct
Code clean-up.
* ds/partial-bundle-more:
pack-objects: lazily set up "struct rev_info", don't leak
bundle: output hash information in 'verify'
bundle: move capabilities to end of 'verify'
pack-objects: parse --filter directly into revs.filter
pack-objects: move revs out of get_object_list()
list-objects-filter: remove CL_ARG__FILTER
"git ls-tree" learns "--oid-only" option, similar to "--name-only",
and more generalized "--format" option.
* tl/ls-tree-oid-only:
ls-tree: split up "fast path" callbacks
ls-tree: detect and error on --name-only --name-status
ls-tree: support --object-only option for "git-ls-tree"
ls-tree: introduce "--format" option
cocci: allow padding with `strbuf_addf()`
ls-tree: introduce struct "show_tree_data"
ls-tree: slightly refactor `show_tree()`
ls-tree: fix "--name-only" and "--long" combined use bug
ls-tree: simplify nesting if/else logic in "show_tree()"
ls-tree: rename "retval" to "recurse" in "show_tree()"
ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
ls-tree: use "enum object_type", not {blob,tree,commit}_type
ls-tree: add missing braces to "else" arms
ls-tree: remove commented-out code
ls-tree tests: add tests for --name-status
"git reflog" command now uses parse-options API to parse its
command line options.
* ab/reflog-parse-options:
reflog: fix 'show' subcommand's argv
reflog [show]: display sensible -h output
reflog: convert to parse_options() API
reflog exists: use parse_options() API
git reflog [expire|delete]: make -h output consistent with SYNOPSIS
reflog: move "usage" variables and use macros
reflog tests: add missing "git reflog exists" tests
reflog: refactor cmd_reflog() to "if" branches
reflog.c: indent argument lists
The output of `git mergetool --tool-help` and `git difftool --tool-help`
only showed the `alias` of each available merge/diff tool.
It is not always obvious what tool these `aliases` end up using (ex:
`opendiff` runs `FileMerge` and `bc` runs `Beyond Compare`).
This commit adds a short description to each of them to help the user
identify the `alias` they want.
Signed-off-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running 'git {merge,diff}tool --tool-help' now also prints usage
information about the vimdiff tool (and its variants) instead of just
its name.
Two new functions ('diff_cmd_help()' and 'merge_cmd_help()') have been
added to the set of functions that each merge tool (ie. scripts found
inside "mergetools/") can overwrite to provided tool specific
information.
Right now, only 'mergetools/vimdiff' implements these functions, but
other tools are encouraged to do so in the future, specially if they
take configuration options not explained anywhere else (as it is the
case with the 'vimdiff' tool and the new 'layout' option)
Note that the function 'show_tool_names', used in the implementation of
'git mergetool --tool-help', is also used in Documentation/Makefile to
generate the list of allowed values for the configuration variables
'{diff,merge}.{gui,}tool'. Adjust the rule so its output is an Asciidoc
"description list" instead of a plain list, with the tool name as the
item and the newly added tool description as the description.
In addition, a section has been added to
"Documentation/git-mergetool.txt" to explain the new "layout"
configuration option with examples.
Helped-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ds/partial-bundle-more:
pack-objects: lazily set up "struct rev_info", don't leak
bundle: output hash information in 'verify'
bundle: move capabilities to end of 'verify'
pack-objects: parse --filter directly into revs.filter
pack-objects: move revs out of get_object_list()
list-objects-filter: remove CL_ARG__FILTER
PEP8 recommends that all inline comments should be separated from code
by two spaces, in the "Inline Comments" section:
https://www.python.org/dev/peps/pep-0008/#inline-comments
However, because all instances of these inline comments extended to an
excessive line length, they have been moved onto a seprate line.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
PEP8 recommends that when wrapping the arguments of conditional
statements, an extra level of indentation should be added to distinguish
arguments from the body of the statement.
This guideline is described here:
https://www.python.org/dev/peps/pep-0008/#indentation
This patch either adds the indentation, or removes unnecessary wrapping.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
PEP8 requires that binary operators such as assignment and comparison
operators should always be surrounded by a pair of single spaces, and
recommends that all other binary operators should typically be surround
by single spaces.
The recommendation is given here in the "Other Recommendations"
section
https://www.python.org/dev/peps/pep-0008/#other-recommendations
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
PEP8 makes no specific recommendation about spaces preceding colons in
dictionary declarations, but all the code examples contained with it
declare dictionaries with a single space after the colon, and none
before.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch improves consistency across git-p4 by ensuring all command
separated arguments to function invocations, tuples and lists are
separated by commas with a single space following.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In several places, git-p4 contains code of the form:
(a, b) = foo()
In each case, multiple values are returned through a tuple or a list and
bound into multiple values.
The brackets around the assigned variables are redundant and can be
removed:
a, b = foo()
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-p4 contains configuration commands for pylint embedded in the header
comment.
Previously, these were combined onto single lines and not alphabetically
sorted. This patch breaks each disable command onto a separate line to
give cleaner diffs, removed duplicate entries, and sorts the list
alphabetically.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, the script contained commented code including Python 2 print
statements. Presumably, these were used as a developer aid at some point
in history. However, the commented code is generally undesirable, and
this commented code serves no useful purpose. Therefore this patch
removes it.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, a small number of functions, methods and classes were
documented using comments. This patch improves consistency by converting
these into docstrings similar to those that already exist in the script.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch attempts to improve the consistency of the docstrings by
making the following changes:
- Rewraps all docstrings to a 79-character column limit.
- Adds a full stop at the end of every docstring.
- Removes any spaces after the opening triple-quotes of all
docstrings.
- Sets the hanging indent of multi-line docstrings to 3-spaces.
- Ensures that the closing triple-quotes of multi-line docstrings are
always on a new line indented by a 3-space indent.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
PEP8 recommends that all code should be indented in 4-space units. This
guideline is described here:
https://www.python.org/dev/peps/pep-0008/#indentation
Previously git-p4 had multiple cases where code was indented with a
non-multiple of 4-spaces. This patch fixes each of these.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python allows the usage of compound statements where multiple statements
are written on a single line separared by semicolons. It is also
possible to add a semicolon after a single statement, however this is
generally considered to be untidy, and is unnecessary.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the PEP8 style guidelines, top-level functions and class definitions
should be separated by two blank lines. Methods should be surrounded by
a single blank line.
This guideline is described here in the "Blank Lines" section:
https://www.python.org/dev/peps/pep-0008/#blank-lines
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Untracked cache was originally designed to only work with
"--untracked-files=normal", and is bypassed when
"--untracked-files=all" is requested, but this causes performance
issues for UI tooling that wants to see "all" on a frequent basis.
On the other hand, the conditions that altogether prevented
applicability to the "all" mode no longer seem to apply, after
several major refactors in recent years; this possibility was
discussed in
81153d02-8e7a-be59-e709-e90cd5906f3a@jeffhostetler.com and
CABPp-BFiwzzUgiTj_zu+vF5x20L0=1cf25cHwk7KZQj2YkVzXw@mail.gmail.com,
and somewhat confirmed experimentally by several users using a
version of this patch to use untracked cache with -uall for about a
year.
When 'git status' runs without using the untracked cache, on a large
repo, on windows, with fsmonitor, it can run very slowly. This can
make GUIs that need to use "-uall" (and therefore currently bypass
untracked cache) unusable when fsmonitor is enabled, on such large
repos.
To partially address this, align the supported directory flags for the
stored untracked cache data with the git config. If a user specifies
an '--untracked-files=' commandline parameter that does not align with
their 'status.showuntrackedfiles' config value, then the untracked
cache will be ignored - as it is for other unsupported situations like
when a pathspec is specified.
If the previously stored flags no longer match the current
configuration, but the currently-applicable flags do match the current
configuration, then discard the previously stored untracked cache
data.
For most users there will be no change in behavior. Users who need
'--untracked-files=all' to perform well will now have the option of
setting "status.showuntrackedfiles" to "all" for better / more
consistent performance.
Users who need '--untracked-files=all' to perform well for their
tooling AND prefer to avoid the verbosity of "all" when running
git status explicitly without options... are out of luck for now (no
change).
Users who have the "status.showuntrackedfiles" config set to "all"
and yet frequently explicitly call
'git status --untracked-files=normal' (and use the untracked cache)
are the only ones who will be disadvantaged by this change. Their
"--untracked-files=normal" calls will, after this change, no longer
use the untracked cache.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Untracked cache was originally designed to only work with
'--untracked-files=normal', and it gets ignored when
'--untracked-files=all' is specified instead.
Add explicit tests for this known as-designed behavior.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The alloc_report() function has been orphaned since its introduction
in 855419f764 (Add specialized object allocator, 2006-06-19), it
appears to have been used for demonstration purposes in that commit
message.
These might be handy to manually use in a debugger, but keeping them
and the "count" member of "alloc_state" just for that doesn't seem
worth it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These macros were last used in 5d3679ee02 (sha1-file: drop
has_sha1_file(), 2019-01-07), so let's remove coccinelle migration
rules added 9b45f49981 (object-store: prepare has_{sha1, object}_file
to handle any repo, 2018-11-13), along with the compatibility macros
themselves.
The "These functions.." in the diff context and the general comment
about compatibility macros still applies to
"NO_THE_REPOSITORY_COMPATIBILITY_MACROS" use just a few lines below
this, so let's keep the comment.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function hasn't been used since 449fa5ee06 (pack-bitmap-write:
ignore BITMAP_FLAG_REUSE, 2020-12-08), which was a cleanup commit
intending to get rid of the code around the reusing of bitmaps.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This macro was added in 3443546f6e (Use a *real* built-in diff
generator, 2006-03-24), but none of the xdiff code uses it, it uses
xdl_free() directly.
If we need its functionality again we'll use the FREE_AND_NULL() macro
added in 481df65f4f (git-compat-util: add a FREE_AND_NULL() wrapper
around free(ptr); ptr = NULL, 2017-06-15).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove a comment about a Makefile knob that was removed in
f7661ce0b8 (Remove -fPIC which was only needed for Git.xs,
2006-09-29). The comment had been copied over to configure.ac in
633b423961 (Copy description of build configuration variables to
configure.ac, 2006-07-08).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove a "struct child_process" member added in
ac2fbaa674 (run-command: add clean_on_exit_handler, 2016-10-16), but
which was never used.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error "not tracking: ambiguous information for ref" is raised
when we are evaluating what tracking information to set on a branch,
and find that the ref to be added as tracking branch is mapped
under multiple remotes' fetch refspecs.
This can easily happen when a user copy-pastes a remote definition
in their git config, and forgets to change the tracking path.
Add advice in this situation, explicitly highlighting which remotes
are involved and suggesting how to correct the situation. Also
update a test to explicitly expect that advice.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the behavior of "git -v" to be synonymous with "--version" /
"version", and "git -h" to be synonymous with "--help", but not "help".
These shorthands both display the "unknown option" message. Following
this change, "-v" displays the version, and "-h" displays the help text
of the "git" command.
It should be noted that the "-v" shorthand could be misinterpreted by
the user to mean "verbose" instead of "version", since some sub-commands
make use of it in this context. The top-level "git" command does not
have a "verbose" flag, so it's safe to introduce this shorthand
unambiguously.
Signed-off-by: Garrit Franke <garrit@slashdev.space>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the dwim_branch_start(), when we cannot find an appropriate
upstream, we will die with the same message anyway, whether we
issue an advice message.
Flip the code around a bit and simplify the flow using
advise_if_enabled() function.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For two cases in which we do not explicitly pass --track=<choice>
option down to the submodule--helper subprocess, we have comments
that say "we do not have to pass --track", but in fact we not just
do not have to, but it would be incorrect to pass any --track option
to the subprocess (instead, the correct behaviour is to let the
subprocess figure out what is the appropriate tracking mode to use).
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ab/usage-die-message:
config API: use get_error_routine(), not vreportf()
usage.c + gc: add and use a die_message_errno()
gc: return from cmd_gc(), don't call exit()
usage.c API users: use die_message() for error() + exit 128
usage.c API users: use die_message() for "fatal :" + exit 128
usage.c: add a die_message() routine
Add a -z option to be used in conjunction with --porcelain that gives
NUL-terminated output. As 'worktree list --porcelain' does not quote
worktree paths this enables it to handle worktree paths that contain
newlines.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We raised the weather balloon to see if we can allow the construct
in 44ba10d6 (revision: use C99 declaration of variable in for()
loop, 2021-11-14), which was shipped as a part of Git v2.35.
Document that fact in the coding guidelines, and more importantly,
give ourselves a deadline to revisit and update.
Let's declare that we will officially adopt the variable declaration
in the initializaiton part of "for ()" statement this winter, unless
we find that a platform we care about does not grok it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update tests around the use of hook scripts.
* ab/hook-tests-updates:
http tests: use "test_hook" for "smart" and "dumb" http tests
proc-receive hook tests: use "test_hook" instead of "write_script"
tests: extend "test_hook" for "rm" and "chmod -x", convert "$HOOK"
tests: use "test_hook" for misc "mkdir -p" and "chmod" cases
tests: change "mkdir -p && write_script" to use "test_hook"
tests: change "cat && chmod +x" to use "test_hook"
gc + p4 tests: use "test_hook", remove sub-shells
fetch+push tests: use "test_hook" and "test_when_finished" pattern
bugreport tests: tighten up "git bugreport -s hooks" test
tests: assume the hooks are disabled by default
http tests: don't rely on "hook/post-update.sample"
hook tests: turn exit code assertions into a loop
test-lib-functions: add and use a "test_hook" wrapper
Tweaks in the command line prompt (in contrib/) code around its
GIT_PS1_SHOWUPSTREAM feature.
* jd/prompt-upstream-mark:
git-prompt: put upstream comments together
git-prompt: make long upstream state indicator consistent
git-prompt: make upstream state indicator location consistent
git-prompt: rename `upstream` to `upstream_type`
Finishing touches to C rewrite of "git add -i" in single-key
interactive mode.
* pw/add-p-single-key:
terminal: restore settings on SIGTSTP
terminal: work around macos poll() bug
terminal: don't assume stdin is /dev/tty
terminal: use flags for save_term()
"git stash" does not allow subcommands it internally runs as its
implementation detail, except for "git reset", to emit messages;
now "git reset" part has also been squelched.
* vd/stash-silence-reset:
reset: show --no-refresh in the short-help
reset: remove 'reset.refresh' config option
reset: remove 'reset.quiet' config option
reset: do not make '--quiet' disable index refresh
stash: make internal resets quiet and refresh index
reset: suppress '--no-refresh' advice if logging is silenced
reset: replace '--quiet' with '--no-refresh' in performance advice
reset: introduce --[no-]refresh option to --mixed
reset: revise index refresh advice
Replace an instance of "exit(-1)" with "exit(1)". We don't use negative
exit codes - they are misleading because Unix machines will coerce them
to 8-bit unsigned values, losing the sign.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we die while trying to fsync a loose object file, pass the actual
filename we're trying to sync. This is likely to be more helpful for a
user trying to diagnose the cause of the failure than the former
'loose object file' string. It also sidesteps any concerns about
translating the die message differently for loose objects versus
something else that has a real path.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git branch --set-upstream-to" behaves differently when advice is
enabled/disabled:
| | error prefix | exit code |
|-----------------+--------------+-----------|
| advice enabled | error: | 1 |
| advice disabled | fatal: | 128 |
Make both cases consistent by using die_message() when advice is
enabled (this was first proposed in [1]).
[1] https://lore.kernel.org/git/211210.86ee6ldwlc.gmgdl@evledraar.gmail.com
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug where "hint:" was printed _before_ "fatal:" (instead of the
other way around).
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git branch --recurse-submodules" does not propagate "--track=inherit"
or "--no-track" to submodules, which causes submodule branches to use
the wrong tracking mode [1]. To fix this, pass the correct options to
the "submodule--helper create-branch" child process and test for it.
While we are refactoring the same code, replace "--track" with the
synonymous, but more consistent-looking "--track=direct" option
(introduced at the same time as "--track=inherit", d3115660b4 (branch:
add flags and config to inherit tracking, 2021-12-20)).
[1] This bug is partially a timing issue: "branch --recurse-submodules"
was introduced around the same time as "--track=inherit", and even
though I rebased "branch --recurse-submodules" on top of that, I had
neglected to support the new tracking mode. Omitting "--no-track"
was just a plain old mistake, though.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a new test case file for the different available merge tools.
Right now it only tests the 'mergetool.vimdiff.layout' option. Other
merge tools might be interested in adding their own tests here too.
Signed-off-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git mergetool -t vimdiff', a new configuration option
('mergetool.vimdiff.layout') can now be used to select how the user
wants the different windows, tabs and buffers to be displayed.
If the option is not provided, the layout will be the same one that was
being used before this commit (ie. two rows with LOCAL, BASE and COMMIT
in the top one and MERGED in the bottom one).
The 'vimdiff' variants ('vimdiff{1,2,3}') still work but, because they
represented nothing else than different layouts, are now internally
implemented as a subcase of 'vimdiff' with the corresponding
pre-configured 'layout'.
Again, if you don't set "mergetool.vimdiff.layout" everything will work
the same as before *but* the arguments used to call {n,g,}vim will be
others (even if you don't/shouldn't notice it):
- git mergetool -t vimdiff
> Before this commit:
{n,g,}vim -f -d -c '4wincmd w | wincmd J' $LOCAL $BASE $REMOTE $MERGED
> After this commit:
{n,g,}vim -f -c "echo | split | vertical split | 1b | wincmd l | vertical split | 2b | wincmd l | 3b | wincmd j | 4b | tabdo windo diffthis" -c "tabfirst" $LOCAL $BASE $REMOTE $MERGED
- git mergetool -t vimdiff1
> Before this commit:
{n,g,}vim -f -d -c 'echon "..."' $LOCAL $REMOTE
> After this commit:
{n,g,}vim -f -c "echo | vertical split | 1b | wincmd l | 3b | tabdo windo diffthis" -c "tabfirst" $LOCAL $BASE $REMOTE $MERGED
- git mergetool -t vimdiff2
> Before this commit:
{n,g,}vim -f -d -c 'wincmd l' $LOCAL $MERGED $REMOTE
> After this commit:
{n,g,}vim -f -c "echo | vertical split | 1b | wincmd l | vertical split | 4b | wincmd l | 3b | tabdo windo diffthis" -c "tabfirst" $LOCAL $BASE $REMOTE $MERGED
- git mergetool -t vimdiff3
> Before this commit:
{n,g,}vim -f -d -c 'hid | hid | hid' $LOCAL $REMOTE $BASE $MERGED
> After this commit:
{n,g,}vim -f -c "echo | 4b | bufdo diffthis" -c "tabfirst" $LOCAL $BASE $REMOTE $MERGED
Despite being different, I have manually verified that they generate the same
layout as before.
Signed-off-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add some global trace2 statistics for the number of fsyncs performed
during the lifetime of a Git process.
These stats are printed as part of trace2_cmd_exit_fl, which is
presumably where we might want to print any other cross-cutting
statistics.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b9f5d035 (core.fsync: documentation and user-friendly
aggregate options, 2022-03-15) introduced an incorrect value for
FSYNC_COMPONENTS_DEFAULT. We need an AND-NOT rather than OR-NOT.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase $base $non_branch_commit", when $base is an ancestor or
the $non_branch_commit, modified the current branch, which has been
corrected.
* jc/rebase-detach-fix:
rebase: set REF_HEAD_DETACH in checkout_up_to_date()
rebase: use test_commit helper in setup
When "shallow" information is updated, we forgot to update the
in-core equivalent, which has been corrected.
* jt/reset-grafts-when-resetting-shallow:
shallow: reset commit grafts when shallow is reset
Correct a bug in unpack-trees introduced earlier.
* vd/cache-bottom-fix:
Revert "unpack-trees: improve performance of next_cache_entry"
unpack-trees: increment cache_bottom for sparse directories
t1092: add sparse directory before cone in test repo
Code clean-up.
* ab/refs-various-fixes:
refs debug: add a wrapper for "read_symbolic_ref"
packed-backend: remove stub BUG(...) functions
misc *.c: use designated initializers for struct assignments
refs: use designated initializers for "struct ref_iterator_vtable"
refs: use designated initializers for "struct ref_storage_be"
The worktree repair command was not added to the usage menu for the
worktree command. This commit adds the usage of 'worktree repair'
according to the existing docs.
Signed-off-by: Des Preston <despreston@gmail.com>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the stat info of the moved index entry in 'rename_index_entry_at()'
if the entry is up-to-date with the index. Internally, 'git mv' uses
'rename_index_entry_at()' to move the source index entry to the destination.
However, it directly copies the stat info of the original cache entry, which
will not reflect the 'ctime' of the file renaming operation that happened as
part of the move. If a file is otherwise up-to-date with the index, that
difference in 'ctime' will make the entry appear out-of-date until the next
index-refreshing operation (e.g., 'git status').
Some commands, such as 'git reset', use the cached stat information to
determine whether a file is up-to-date; if this information is incorrect,
the command will fail when it should pass. In order to ensure a moved entry
is evaluated as 'up-to-date' when appropriate, refresh the destination index
entry's stat info in 'git mv' if and only if the file is up-to-date.
Note that the test added in 't7001-mv.sh' requires a "sleep 1" to ensure the
'ctime' of the file creation will be definitively older than the 'ctime' of
the renamed file in 'git mv'.
Reported-by: Maximilian Reichel <reichemn@icloud.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_reflog() invokes parse_options() with PARSE_OPT_KEEP_ARGV0, but it
doesn't account for the retained argv[0] before invoking
cmd_reflog_show() to handle the 'git reflog show' subcommand.
Consequently, cmd_reflog_show() always gets an 'argv' array starting
with elements argv[0]="reflog" and argv[1]="show".
Strip the name of the git command from the 'argv' array before passing
it to the function handling the 'show' subcommand.
There is no user-visible bug here, because cmd_reflog_show() doesn't
have any options or parameters of its own.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the assignment syntax introduced in 66c0dabab5 (reftable: make
reftable_record a tagged union, 2022-01-20) to be portable to AIX xlc
v12.1:
avar@gcc111:[/home/avar]xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0000
The error emitted before this was e.g.:
"reftable/generic.c", line 133.26: 1506-196 (S) Initialization
between types "char*" and "struct reftable_ref_record" is not
allowed.
The syntax in the pre-image is supported by e.g. xlc 13.01 on a newer
AIX version:
avar@gcc119:[/home/avar]xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0006
But as we've otherwise supported this compiler let's not break it
entirely if it's easy to work around it.
Suggested-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document it for partial clones as a means to apply a new filter, and
reference it from the remote.<name>.partialclonefilter config parameter.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After invoking `fetch --refetch`, the object db will likely contain many
duplicate objects. If auto-maintenance is enabled, invoke it with
appropriate settings to encourage repacking/consolidation.
* gc.autoPackLimit: unless this is set to 0 (disabled), override the
value to 1 to force pack consolidation.
* maintenance.incremental-repack.auto: unless this is set to 0, override
the value to -1 to force incremental repacking.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test for doing a refetch to apply a changed partial clone filter
under protocol v0 and v2.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach fetch and transports the --refetch option to force a full fetch
without negotiating common commits with the remote. Use when applying a
new partial clone filter to refetch all matching objects.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a refetch option to fetch-pack to force a full fetch. Use when
applying a new partial clone filter to refetch all matching objects.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow a "refetch" where the contents of the local object store are
ignored and a full fetch is performed, not attempting to find or
negotiate common commits with the remote.
A key use case is to apply a new partial clone blob/tree filter and
refetch all the associated matching content, which would otherwise not
be transferred when the commit objects are already present locally.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a specific initializer for the noop fetch negotiator. This is
introduced to support allowing partial clones to skip commit negotiation
when performing a "refetch".
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding [1] (pack-objects: move revs out of
get_object_list(), 2022-03-22) the "repo_init_revisions()" was moved
to cmd_pack_objects() so that it unconditionally took place for all
invocations of "git pack-objects".
We'd thus start leaking memory, which is easily reproduced in
e.g. git.git by feeding e83c516331 (Initial revision of "git", the
information manager from hell, 2005-04-07) to "git pack-objects";
$ echo e83c516331 | ./git pack-objects initial
[...]
==19130==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 7120 byte(s) in 1 object(s) allocated from:
#0 0x455308 in __interceptor_malloc (/home/avar/g/git/git+0x455308)
#1 0x75b399 in do_xmalloc /home/avar/g/git/wrapper.c:41:8
#2 0x75b356 in xmalloc /home/avar/g/git/wrapper.c:62:9
#3 0x5d7609 in prep_parse_options /home/avar/g/git/diff.c:5647:2
#4 0x5d415a in repo_diff_setup /home/avar/g/git/diff.c:4621:2
#5 0x6dffbb in repo_init_revisions /home/avar/g/git/revision.c:1853:2
#6 0x4f599d in cmd_pack_objects /home/avar/g/git/builtin/pack-objects.c:3980:2
#7 0x4592ca in run_builtin /home/avar/g/git/git.c:465:11
#8 0x457d81 in handle_builtin /home/avar/g/git/git.c:718:3
#9 0x458ca5 in run_argv /home/avar/g/git/git.c:785:4
#10 0x457b40 in cmd_main /home/avar/g/git/git.c:916:19
#11 0x562259 in main /home/avar/g/git/common-main.c:56:11
#12 0x7fce792ac7ec in __libc_start_main csu/../csu/libc-start.c:332:16
#13 0x4300f9 in _start (/home/avar/g/git/git+0x4300f9)
SUMMARY: LeakSanitizer: 7120 byte(s) leaked in 1 allocation(s).
Aborted
Narrowly fixing that commit would have been easy, just add call
repo_init_revisions() right before get_object_list(), which is
effectively what was done before that commit.
But an unstated constraint when setting it up early is that it was
needed for the subsequent [2] (pack-objects: parse --filter directly
into revs.filter, 2022-03-22), i.e. we might have a --filter
command-line option, and need to either have the "struct rev_info"
setup when we encounter that option, or later.
Let's just change the control flow so that we'll instead set up the
"struct rev_info" only when we need it. Doing so leads to a bit more
verbosity, but it's a lot clearer what we're doing and why.
An earlier version of this commit[3] went behind
opt_parse_list_objects_filter()'s back by faking up a "struct option"
before calling it. Let's avoid that and instead create a blessed API
for this pattern.
We could furthermore combine the two get_object_list() invocations
here by having repo_init_revisions() invoked on &pfd.revs, but I think
clearly separating the two makes the flow clearer. Likewise
redundantly but explicitly (i.e. redundant v.s. a "{ 0 }") "0" to
"have_revs" early in cmd_pack_objects().
While we're at it add parentheses around the arguments to the OPT_*
macros in in list-objects-filter-options.h, as we need to change those
lines anyway. It doesn't matter in this case, but is good general
practice.
1. https://lore.kernel.org/git/619b757d98465dbc4995bdc11a5282fbfcbd3daa.1647970119.git.gitgitgadget@gmail.com
2. https://lore.kernel.org/git/97de926904988b89b5663bd4c59c011a1723a8f5.1647970119.git.gitgitgadget@gmail.com
3. https://lore.kernel.org/git/patch-1.1-193534b0f07-20220325T121715Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git fetch --recurse-submodules" grabbed submodule commits
that would be needed to recursively check out newly fetched commits
in the superproject, it only paid attention to submodules that are
in the current checkout of the superproject. We now do so for all
submodules that have been run "git submodule init" on.
* gc/recursive-fetch-with-unused-submodules:
submodule: fix latent check_has_commit() bug
fetch: fetch unpopulated, changed submodules
submodule: move logic into fetch_task_create()
submodule: extract get_fetch_task()
submodule: store new submodule commits oid_array in a struct
submodule: inline submodule_commits() into caller
submodule: make static functions read submodules from commits
t5526: create superproject commits with test helper
t5526: stop asserting on stderr literally
t5526: introduce test helper to assert on fetches
Updates to refs traditionally weren't fsync'ed, but we can
configure using core.fsync variable to do so.
* ps/fsync-refs:
core.fsync: new option to harden references
Replace core.fsyncObjectFiles with two new configuration variables,
core.fsync and core.fsyncMethod.
* ns/core-fsyncmethod:
core.fsync: documentation and user-friendly aggregate options
core.fsync: new option to harden the index
core.fsync: add configuration parsing
core.fsync: introduce granular fsync control infrastructure
core.fsyncmethod: add writeout-only mode
wrapper: make inclusion of Windows csprng header tightly scoped
* jh/builtin-fsmonitor-part2: (150 commits)
t7527: test status with untracked-cache and fsmonitor--daemon
fsmonitor: force update index after large responses
fsmonitor--daemon: use a cookie file to sync with file system
fsmonitor--daemon: periodically truncate list of modified files
t/perf/p7519: add fsmonitor--daemon test cases
t/perf/p7519: speed up test on Windows
t/perf/p7519: fix coding style
t/helper/test-chmtime: skip directories on Windows
t/perf: avoid copying builtin fsmonitor files into test repo
t7527: create test for fsmonitor--daemon
t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
help: include fsmonitor--daemon feature flag in version info
fsmonitor--daemon: implement handle_client callback
compat/fsmonitor/fsm-listen-darwin: implement FSEvent listener on MacOS
compat/fsmonitor/fsm-listen-darwin: add MacOS header files for FSEvent
compat/fsmonitor/fsm-listen-win32: implement FSMonitor backend on Windows
fsmonitor--daemon: create token-based changed path cache
fsmonitor--daemon: define token-ids
fsmonitor--daemon: add pathname classification
fsmonitor--daemon: implement 'start' command
...
Create 2x2 test matrix with the untracked-cache and fsmonitor--daemon
features and a series of edits and verify that status output is
identical.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Measure the time taken to apply the FSMonitor query result
to the index and the untracked-cache.
Set the `FSMONITOR_CHANGED` bit on `istate->cache_changed` when
FSMonitor returns a very large repsonse to ensure that the index is
written to disk.
Normally, when the FSMonitor response includes a tracked file, the
index is always updated. Similarly, the index might be updated when
the response alters the untracked-cache (when enabled). However, in
cases where neither of those cause the index to be considered changed,
the FSMonitor response is wasted. Subsequent Git commands will make
requests with the same token and receive the same response.
If that response is very large, performance may suffer. It would be
more efficient to force update the index now (and the token in the
index extension) in order to reduce the size of the response received
by future commands.
This was observed on Windows after a large checkout. On Windows, the
kernel emits events for the files that are changed as they are
changed. However, it might delay events for the containing
directories until the system is more idle (or someone scans the
directory (so it seems)). The first status following a checkout would
get the list of files. The subsequent status commands would get the
list of directories as the events trickled out. But they would never
catch up because the token was not advanced because the index wasn't
updated.
This list of directories caused `wt_status_collect_untracked()` to
unnecessarily spend time actually scanning them during each command.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach fsmonitor--daemon client threads to create a cookie file
inside the .git directory and then wait until FS events for the
cookie are observed by the FS listener thread.
This helps address the racy nature of file system events by
blocking the client response until the kernel has drained any
event backlog.
This is especially important on MacOS where kernel events are
only issued with a limited frequency. See the `latency` argument
of `FSeventStreamCreate()`. The kernel only signals every `latency`
seconds, but does not guarantee that the kernel queue is completely
drained, so we may have to wait more than one interval. If we
increase the latency, the system is more likely to drop events.
We avoid these issues by having each client thread create a unique
cookie file and then wait until it is seen in the event stream.
Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com>
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach fsmonitor--daemon to periodically truncate the list of
modified files to save some memory.
Clients will ask for the set of changes relative to a token that they
found in the FSMN index extension in the index. (This token is like a
point in time, but different). Clients will then update the index to
contain the response token (so that subsequent commands will be
relative to this new token).
Therefore, the daemon can gradually truncate the in-memory list of
changed paths as they become obsolete (older than the previous token).
Since we may have multiple clients making concurrent requests with a
skew of tokens and clients may be racing to the talk to the daemon,
we lazily truncate the list.
We introduce a 5 minute delay and truncate batches 5 minutes after
they are considered obsolete.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Repeat all of the fsmonitor perf tests using `git fsmonitor--daemon` and
the "Simple IPC" interface.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change p7519 to use `test_seq` and `xargs` rather than a `for` loop
to touch thousands of files. This takes minutes off of test runs
on Windows because of process creation overhead.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach `test-tool.exe chmtime` to ignore errors when setting the mtime
on a directory on Windows.
NEEDSWORK: The Windows version of `utime()` (aka `mingw_utime()`) does
not properly handle directories because it uses `_wopen()`. It should
be converted to using `CreateFileW()` and backup semantics at a minimum.
Since I'm already in the middle of a large patch series, I did not want
to destabilize other callers of `utime()` right now. The problem has
only been observed in the t/perf/p7519 test when the test repo contains
an empty directory on disk.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Do not copy any of the various fsmonitor--daemon files from the .git
directory of the (GIT_PREF_REPO or GIT_PERF_LARGE_REPO) source repo
into the test's trash directory.
When perf tests start, they copy the contents of the source repo into
the test's trash directory. If fsmonitor is running in the source repo,
there may be control files, such as the IPC socket and/or fsmonitor
cookie files. These should not be copied into the test repo.
Unix domain sockets cannot be copied in the manner used by the test
setup, so if present, the test setup fails.
Cookie files are harmless, but we should avoid them.
The builtin fsmonitor keeps all such control files/sockets in
.git/fsmonitor--daemon*, so it is simple to exclude them.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create an IPC client to send query and flush commands to the daemon.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the "feature: fsmonitor--daemon" message to the output of
`git version --build-options`.
The builtin FSMonitor is only available on certain platforms and
even then only when certain Makefile flags are enabled, so print
a message in the verbose version output when it is available.
This can be used by test scripts for prereq testing. Granted, tests
could just try `git fsmonitor--daemon status` and look for a 128 exit
code or grep for a "not supported" message on stderr, but these
methods are rather obscure.
The main advantage is that the feature message will automatically
appear in bug reports and other support requests.
This concept was also used during the development of Scalar for
similar reasons.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach fsmonitor--daemon to respond to IPC requests from client
Git processes and respond with a list of modified pathnames
relative to the provided token.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Include MacOS system declarations to allow us to use FSEvent and
CoreFoundation APIs. We need different versions of the declarations
for GCC vs. clang because of compiler and header file conflicts.
While it is quite possible to #include Apple's CoreServices.h when
compiling C source code with clang, trying to build it with GCC
currently fails with this error:
In file included
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/...
...Library/Frameworks/Security.framework/Headers/AuthSession.h:32,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/...
...Library/Frameworks/Security.framework/Headers/Security.h:42,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/...
...Library/Frameworks/CoreServices.framework/Frameworks/...
...OSServices.framework/Headers/CSIdentity.h:43,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/...
...Library/Frameworks/CoreServices.framework/Frameworks/...
...OSServices.framework/Headers/OSServices.h:29,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/...
...Library/Frameworks/CoreServices.framework/Frameworks/...
...LaunchServices.framework/Headers/IconsCore.h:23,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/...
...Library/Frameworks/CoreServices.framework/Frameworks/...
...LaunchServices.framework/Headers/LaunchServices.h:23,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/...
...Library/Frameworks/CoreServices.framework/Headers/CoreServices.h:45,
/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/...
...Library/Frameworks/Security.framework/Headers/Authorization.h:193:7:
error: variably modified 'bytes' at file scope
193 | char bytes[kAuthorizationExternalFormLength];
| ^~~~~
The underlying reason is that GCC (rightfully) objects that an `enum`
value such as `kAuthorizationExternalFormLength` is not a constant
(because it is not, the preprocessor has no knowledge of it, only the
actual C compiler does) and can therefore not be used to define the size
of a C array.
This is a known problem and tracked in GCC's bug tracker:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93082
In the meantime, let's not block things and go the slightly ugly route
of declaring/defining the FSEvents constants, data structures and
functions that we need, so that we can avoid above-mentioned issue.
Let's do this _only_ for GCC, though, so that the CI/PR builds (which
build both with clang and with GCC) can guarantee that we _are_ using
the correct data types.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the win32 backend to register a watch on the working tree
root directory (recursively). Also watch the <gitdir> if it is
not inside the working tree. And to collect path change notifications
into batches and publish.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach fsmonitor--daemon to build a list of changed paths and associate
them with a token-id. This will be used by the platform-specific
backends to accumulate changed paths in response to filesystem events.
The platform-specific file system listener thread receives file system
events containing one or more changed pathnames (with whatever
bucketing or grouping that is convenient for the file system). These
paths are accumulated (without locking) by the file system layer into
a `fsmonitor_batch`.
When the file system layer has drained the kernel event queue, it will
"publish" them to our token queue and make them visible to concurrent
client worker threads. The token layer is free to combine and/or de-dup
paths within these batches for efficient presentation to clients.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach fsmonitor--daemon to create token-ids and define the
overall token naming scheme.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach fsmonitor--daemon to classify relative and absolute
pathnames and decide how they should be handled. This will
be used by the platform-specific backend to respond to each
filesystem event.
When we register for filesystem notifications on a directory,
we get events for everything (recursively) in the directory.
We want to report to clients changes to tracked and untracked
paths within the working directory proper. We do not want to
report changes within the .git directory, for example.
This classification will be used in a later commit by the
different backends to classify paths as events are received.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement 'git fsmonitor--daemon start' command. This command starts
an instance of 'git fsmonitor--daemon run' in the background using
the new 'start_bg_command()' function.
We avoid the fork-and-call technique on Unix systems in favor of a
fork-and-exec technique. This gives us more uniform Trace2 child-*
events. It also makes our usage more consistent with Windows usage.
On Windows, teach 'git fsmonitor--daemon run' to optionally call
'FreeConsole()' to release handles to the inherited Win32 console
(despite being passed invalid handles for stdin/out/err). Without
this, command prompts and powershell terminal windows could hang
in "exit" until the last background child process exited or released
their Win32 console handle. (This was not seen with git-bash shells
because they don't have a Win32 console attached to them.)
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement `run` command to try to begin listening for file system events.
This version defines the thread structure with a single fsmonitor_fs_listen
thread to watch for file system events and a simple IPC thread pool to
watch for connection from Git clients over a well-known named pipe or
Unix domain socket.
This commit does not actually do anything yet because the platform
backends are still just stubs.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stub in empty implementation of fsmonitor--daemon
backend for Darwin (aka MacOS).
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stub in empty filesystem listener backend for fsmonitor--daemon on Windows.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement `stop` and `status` client commands to control and query the
status of a `fsmonitor--daemon` server process (and implicitly start a
server process if necessary).
Later commits will implement the actual server and monitor the file
system.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a built-in file system monitoring daemon that can be used by
the existing `fsmonitor` feature (protocol API and index extension)
to improve the performance of various Git commands, such as `status`.
The `fsmonitor--daemon` feature builds upon the `Simple IPC` API and
provides an alternative to hook access to existing fsmonitors such
as `watchman`.
This commit merely adds the new command without any functionality.
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document how `core.fsmonitor` can be set to a boolean to enable
or disable the builtin FSMonitor.
Update references to `core.fsmonitor` and `core.fsmonitorHookVersion` and
pointers to `Watchman` to refer to it.
Create `git-fsmonitor--daemon` manual page and describe its features.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use simple IPC to directly communicate with the new builtin file
system monitor daemon when `core.fsmonitor` is set to true.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move fsmonitor config settings to a new and opaque
`struct fsmonitor_settings` structure. Add a lazily-loaded pointer
to this into `struct repo_settings`
Create an `enum fsmonitor_mode` type in `struct fsmonitor_settings` to
represent the state of fsmonitor. This lets us represent which, if
any, fsmonitor provider (hook or IPC) is enabled.
Create `fsm_settings__get_*()` getters to lazily look up fsmonitor-
related config settings.
Get rid of the `core_fsmonitor` global variable. Move the code to
lookup the existing `core.fsmonitor` config value into the fsmonitor
settings.
Create a hook pathname variable in `struct fsmonitor-settings` and
only set it when in hook mode.
Extend the definition of `core.fsmonitor` to be either a boolean
or a hook pathname. When true, the builtin FSMonitor is used.
When false or unset, no FSMonitor (neither builtin nor hook) is
used.
The existing `core_fsmonitor` global variable was used to store the
pathname to the fsmonitor hook *and* it was used as a boolean to see
if fsmonitor was enabled. This dual usage and global visibility leads
to confusion when we add the IPC-based provider. So lets hide the
details in fsmonitor-settings.c and let it decide which provider to
use in the case of multiple settings. This avoids cluttering up
repo-settings.c with these private details.
A future commit in builtin-fsmonitor series will add the ability to
disqualify worktrees for various reasons, such as being mounted from a
remote volume, where fsmonitor should not be started. Having the
config settings hidden in fsmonitor-settings.c allows such worktree
restrictions to override the config values used.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create fsmonitor_ipc__*() client routines to spawn the built-in file
system monitor daemon and send it an IPC request using the `Simple
IPC` API.
Stub in empty fsmonitor_ipc__*() functions for unsupported platforms.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The implementation of test_subcommand_inexact() was originally
introduced in e4d0c11c0 (repack: respect kept objects with '--write-midx
-b', 2021-12-20) with the intention to allow finding a subcommand based
on an initial set of arguments. The inexactness was intended as a way to
allow flexible options beyond that initial set, as opposed to
test_subcommand() which requires that the full list of options is
provided in its entirety.
The implementation began by copying test_subcommand() and replaced the
repeated argument 'printf' statement to append ".*" instead of "," to
each argument. This caused it to be more flexible than initially
intended.
The previous change deleted the only use of test_subcommand_inexact, so
instead of editing the helper, delete it.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The '--write-midx -b packs non-kept objects' test in t7700-repack.sh
uses test_subcommand_inexact to check that 'git repack' properly adds
the '--honor-pack-keep' flag to the 'git pack-objects' subcommand.
However, the test_subcommand_inexact helper is more flexible than
initially designed, and this instance is the only one that makes use of
it: there are additional arguments between 'git pack-objects' and the
'--honor-pack-keep' flag. In order to make test_subcommand_inexact more
strict, we need to fix this instance.
This test checks that 'git repack --write-midx -a -b -d' will create a
new pack-file that does not contain the objects within the kept pack.
This behavior is possible because of the multi-pack-index bitmap that
will bitmap objects against multiple packs. Without --write-midx, the
objects in the kept pack would be duplicated so the resulting pack is
closed under reachability and bitmaps can be created against it. This is
discussed in more detail in e4d0c11c0 (repack: respect kept objects with
'--write-midx -b', 2021-12-20) which also introduced this instance of
test_subcommand_inexact.
To better verify the intended post-conditions while also removing this
instance of test_subcommand_inexact, rewrite the test to check the list
of packed objects in the kept pack and the list of the objects in the
newly-repacked pack-file _other_ than the kept pack. These lists should
be disjoint.
Be sure to include a non-kept pack-file and loose objects to be extra
careful that this is properly behaving with kept packs and not just
avoiding repacking all pack-files.
Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "--immediate" option so that it emits valid TAP on
failure. Before this it would omit the required plan at the end,
e.g. under SANITIZE=leak we'd show a "No plan found in TAP output"
error from "prove":
$ prove t0006-date.sh :: --immediate
t0006-date.sh .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/22 subtests
Test Summary Report
-------------------
t0006-date.sh (Wstat: 256 Tests: 22 Failed: 1)
Failed test: 22
Non-zero exit status: 1
Parse errors: No plan found in TAP output
Files=1, Tests=22, 0 wallclock secs ( 0.02 usr 0.01 sys + 0.18 cusr 0.06 csys = 0.27 CPU)
Result: FAIL
Now we'll emit output that doesn't result in TAP parsing failures:
$ prove t0006-date.sh :: --immediate
t0006-date.sh .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/22 subtests
Test Summary Report
-------------------
t0006-date.sh (Wstat: 256 Tests: 22 Failed: 1)
Failed test: 22
Non-zero exit status: 1
Files=1, Tests=22, 0 wallclock secs ( 0.02 usr 0.00 sys + 0.19 cusr 0.05 csys = 0.26 CPU)
Result: FAIL
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the short help output from "git reset -h", the recently added
"--[no-]refresh" option is shown like so:
--refresh skip refreshing the index after reset
which explains what happens when the option is given in the negative
form, i.e. "--no-refresh". We could rephrase the explanation to
read "refresh the index after reset (default)" to hint that the user
can say "--no-refresh" to override the default, but listing the
"--no-refresh" form in the list of options would be more helpful.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.34:
Git 2.34.2
Git 2.33.2
Git 2.32.1
Git 2.31.2
GIT-VERSION-GEN: bump to v2.33.1
Git 2.30.3
setup_git_directory(): add an owner check for the top-level directory
Add a function to determine whether a path is owned by the current user
Change the "git reflog show -h" output to show the usage summary
relevant to it, rather than displaying the same output that "git log
-h" would show.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue the work started in 33d7bdd645 (builtin/reflog.c: use
parse-options api for expire, delete subcommands, 2022-01-06) and
convert the cmd_reflog() function itself to use the parse_options()
API.
Let's also add a test which would fail if we forgot
PARSE_OPT_NO_INTERNAL_HELP here, as well as making sure that we'll
still pass through "--" by supplying PARSE_OPT_KEEP_DASHDASH. For that
test we need to change "test_commit()" to accept files starting with
"--".
The "git reflog -h" usage will now show the usage for all of the
sub-commands, rather than a terse summary which wasn't
correct (e.g. "git reflog exists" is not a valid command). See my
8757b35d44 (commit-graph: define common usage with a macro,
2021-08-23) for prior art.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the 'reset.refresh' option, requiring that users explicitly specify
'--no-refresh' if they want to skip refreshing the index.
The 'reset.refresh' option was introduced in 101cee42dd (reset: introduce
--[no-]refresh option to --mixed, 2022-03-11) as a replacement for the
refresh-skipping behavior originally controlled by 'reset.quiet'.
Although 'reset.refresh=false' functionally served the same purpose as
'reset.quiet=true', it exposed [1] the fact that the existence of a global
"skip refresh" option could potentially cause problems for users. Allowing a
global config option to avoid refreshing the index forces scripts using 'git
reset --mixed' to defensively use '--refresh' if index refresh is expected;
if that option is missing, behavior of a script could vary from user-to-user
without explanation.
Furthermore, globally disabling index refresh in 'reset --mixed' was
initially devised as a passive performance improvement; since the
introduction of the option, other changes have been made to Git (e.g., the
sparse index) with a greater potential performance impact without
sacrificing index correctness. Therefore, we can more aggressively err on
the side of correctness and limit the cases of skipping index refresh to
only when a user specifies the '--no-refresh' option.
[1] https://lore.kernel.org/git/xmqqy2179o3c.fsf@gitster.g/
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the 'reset.quiet' config option, remove '--no-quiet' documentation in
'Documentation/git-reset.txt'. In 4c3abd0551 (reset: add new reset.quiet
config setting, 2018-10-23), 'reset.quiet' was introduced as a way to
globally change the default behavior of 'git reset --mixed' to skip index
refresh.
However, now that '--quiet' does not affect index refresh, 'reset.quiet'
would only serve to globally silence logging. This was not the original
intention of the config setting, and there's no precedent for such a setting
in other commands with a '--quiet' option, so it appears to be obsolete.
In addition to the options & its documentation, remove 'reset.quiet' from
the recommended config for 'scalar'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update '--quiet' to no longer implicitly skip refreshing the index in a
mixed reset. Users now have the ability to explicitly disable refreshing the
index with the '--no-refresh' option, so they no longer need to use
'--quiet' to do so. Moreover, we explicitly remove the refresh-skipping
behavior from '--quiet' because it is completely unrelated to the stated
purpose of the option: "Be quiet, only report errors."
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command line completion support (in contrib/) learns to give
modified paths to the "git restore" command.
* dc/complete-restore:
completion: tab completion of filenames for 'git restore'
Optimize away strbuf_expand() call with a hardcoded formatting logic
specific for the default format in the --batch and --batch-check
options of "git cat-file".
* jc/cat-file-batch-default-format-optim:
cat-file: skip expanding default format
"git repack" learned a new configuration to disable triggering of
age-old "update-server-info" command, which is rarely useful these
days.
* ps/repack-with-server-info:
repack: add config to skip updating server info
repack: refactor to avoid double-negation of update-server-info
"git name-rev" learned to use the generation numbers when setting
the lower bound of searching commits used to explain the revision,
when available, instead of committer time.
* jk/name-rev-w-genno:
name-rev: use generation numbers if available
Get rid of one use of __asm__() GCC extension that does not help us
much these days, which has an added advantage of not having to
worry about -pedantic complaining.
* bc/block-sha1-without-gcc-asm-extension:
block-sha1: remove use of obsolete x86 assembly
Rewrite of "git submodule update" in C (early part).
* gc/submodule-update-part1:
submodule--helper update-clone: check for --filter and --init
submodule update: add tests for --filter
submodule--helper: remove ensure-core-worktree
submodule--helper update-clone: learn --init
submodule--helper: allow setting superprefix for init_submodule()
submodule--helper: refactor get_submodule_displaypath()
submodule--helper run-update-procedure: learn --remote
submodule--helper: don't use bitfield indirection for parse_options()
submodule--helper: get remote names from any repository
submodule--helper run-update-procedure: remove --suboid
submodule--helper: reorganize code for sh to C conversion
submodule--helper: remove update-module-mode
submodule tests: test for init and update failure output
The previous change moved the 'filter' capability to the end of the 'git
bundle verify' output. Now, add the 'object-format' capability to the
output, when it exists.
This change makes 'git bundle verify' output the hash used in all cases,
even if the capability is not in the bundle. This means that v2 bundles
will always output that they use "sha1". This might look noisy to some
users, but it does simplify the implementation and the test strategy for
this feature.
Since 'verify' ends early when a prerequisite commit is missing, we need
to insert this hash message carefully into our expected test output
throughout t6020.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'filter' capability was added in 105c6f14a (bundle: parse filter
capability, 2022-03-09), but was added in a strange place in the 'git
bundle verify' output.
The tests for this show output like the following:
The bundle contains these 2 refs:
<COMMIT1> <REF1>
<COMMIT2> <REF2>
The bundle uses this filter: blob:none
The bundle records a complete history.
This looks very odd if we have a thin bundle that contains boundary
commits instead of a complete history:
The bundle contains these 2 refs:
<COMMIT1> <REF1>
<COMMIT2> <REF2>
The bundle uses this filter: blob:none
The bundle requires these 2 refs:
<COMMIT3>
<COMMIT4>
This separation between tip refs and boundary refs is unfortunate. Move
the filter capability output to the end of the output. Update the
documentation to match.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous change moved the 'revs' variable into cmd_pack_objects()
and now we can remove the global filter_options in favor of revs.filter.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We intend to parse the --filter option directly into revs.filter, but we
first need to move the 'revs' variable out of get_object_list() and pass
it as a pointer instead. This change only deals with the issues of
making that work.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have established the command-line interface for the --[no-]filter
options for a while now, so we do not need a helper to make this
editable in the future.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6d158cba28 (bash completion: Support "divergence from upstream"
messages in __git_ps1, 2010-06-17) introduced support for indicating
divergence from upstream in the PS1 prompt. The comments at the top of
git-prompt.sh that were introduced with that commit are several
paragraphs long. Over the years, other comments have been inserted in
between the paragraphs relating to divergence from upstream.
This commit puts the comments relating to divergence from upstream back
together.
Signed-off-by: Justin Donnelly <justinrdonnelly@gmail.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use a pipe as a separator before long upstream state indicator. This is
consistent with long state indicators for sparse and in-progress
operations (e.g. merge).
For comparison, `__git_ps1` examples without upstream state indicator:
(main)
(main %)
(main *%)
(main|SPARSE)
(main %|SPARSE)
(main *%|SPARSE)
(main|SPARSE|REBASE 1/2)
(main %|SPARSE|REBASE 1/2)
Note that if there are long state indicators, they appear after short
state indicators if there are any, or after the branch name if there are
no short state indicators. Each long state indicator begins with a pipe
(`|`) as a separator.
Before/after examples with long upstream state indicator:
| Before | After |
| ------------------------------- | ------------------------------- |
| (main u=) | (main|u=) |
| (main u= origin/main) | (main|u= origin/main) |
| (main u+1) | (main|u+1) |
| (main u+1 origin/main) | (main|u+1 origin/main) |
| (main % u=) | (main %|u=) |
| (main % u= origin/main) | (main %|u= origin/main) |
| (main % u+1) | (main %|u+1) |
| (main % u+1 origin/main) | (main %|u+1 origin/main) |
| (main|SPARSE u=) | (main|SPARSE|u=) |
| (main|SPARSE u= origin/main) | (main|SPARSE|u= origin/main) |
| (main|SPARSE u+1) | (main|SPARSE|u+1) |
| (main|SPARSE u+1 origin/main) | (main|SPARSE|u+1 origin/main) |
| (main %|SPARSE u=) | (main %|SPARSE|u=) |
| (main %|SPARSE u= origin/main) | (main %|SPARSE|u= origin/main) |
| (main %|SPARSE u+1) | (main %|SPARSE|u+1) |
| (main %|SPARSE u+1 origin/main) | (main %|SPARSE|u+1 origin/main) |
Signed-off-by: Justin Donnelly <justinrdonnelly@gmail.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make upstream state indicator location more consistent with similar
state indicators (e.g. sparse). Group the short upstream state indicator
(`=`, `<`, `>`, or `<>`) with other short state indicators immediately
after the branch name. Previously short and long upstream state
indicators appeared after all other state indicators.
Use a separator (`SP` or `GIT_PS1_STATESEPARATOR`) between branch name
and short upstream state indicator. Previously the short upstream state
indicator would sometimes appear directly adjacent to the branch name
instead of being separated.
For comparison, `__git_ps1` examples without upstream state indicator:
(main)
(main %)
(main *%)
(main|SPARSE)
(main %|SPARSE)
(main *%|SPARSE)
(main|SPARSE|REBASE 1/2)
(main %|SPARSE|REBASE 1/2)
Note that if there are short state indicators, they appear together
after the branch name and separated from it by `SP` or
`GIT_PS1_STATESEPARATOR`.
Before/after examples with short upstream state indicator:
| Before | After |
| ---------------- | ---------------- |
| (main=) | (main =) |
| (main|SPARSE=) | (main =|SPARSE) |
| (main %|SPARSE=) | (main %=|SPARSE) |
Signed-off-by: Justin Donnelly <justinrdonnelly@gmail.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `__git_ps1_show_upstream` rename the variable `upstream` to
`upstream_type`. This allows `__git_ps1_show_upstream` to reference a
variable named `upstream` that is declared `local` in `__git_ps1`, which
calls `__git_ps1_show_upstream`.
Signed-off-by: Justin Donnelly <justinrdonnelly@gmail.com>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in a8cc594333 (hooks: fix an obscure TOCTOU "did we
just run a hook?" race, 2022-03-07): The "invoked_hook" variable
passed to run_commit_hook() wasn't passed forward to run_hooks_opt(),
as push_to_checkout() in that commit correctly did.
Whether we ran the code contingent on having run the hook or not was
thus undefined, but in practice on most (all?) modern platforms we'd
have run it (almost?) all the time, since stack variables will get
initialized to some random value, which most of the time isn't "0".
This bug was revealed by running e.g. "t5537-fetch-shallow.sh" with
the --valgrind option. Unfortunately running the whole test suite with
--valgrind is really slow, so we didn't have a CI job that spotted
this. The --valgrind output was:
==31275== Conditional jump or move depends on uninitialised value(s)
==31275== at 0x43C63F: prepare_to_commit (commit.c:1058)
==31275== by 0x4396A5: cmd_commit (commit.c:1722)
==31275== by 0x407C8A: run_builtin (git.c:465)
==31275== by 0x406741: handle_builtin (git.c:718)
==31275== by 0x407665: run_argv (git.c:785)
==31275== by 0x406500: cmd_main (git.c:916)
==31275== by 0x510839: main (common-main.c:56)
==31275== Uninitialised value was created by a stack allocation
==31275== at 0x43B344: prepare_to_commit (commit.c:719)
Reported-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the various if/else in the callbacks for the "fast path" a lot
easier to read by just using common functions for the parts that are
common, and have per-format callbacks for those parts that are
different.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --name-only and --name-status options are synonyms, but let's
detect and error if both are provided.
In addition let's add explicit --format tests for the combination of
these various options.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'--object-only' is an alias for '--format=%(objectname)'. It cannot
be used together other format-altering options like '--name-only',
'--long' or '--format', they are mutually exclusive.
The "--name-only" option outputs <filepath> only. Likewise, <objectName>
is another high frequency used field, so implement '--object-only' option
will bring intuitive and clear semantics for this scenario. Using
'--format=%(objectname)' we can achieve a similar effect, but the former
is with a lower learning cost(without knowing the format requirement
of '--format' option).
Even so, if a user is prefer to use "--format=%(objectname)", this is entirely
welcome because they are not only equivalent in function, but also have almost
identical performance. The reason is this commit also add the specific of
"--format=%(objectname)" to the current fast-pathes (builtin formats) to
avoid running unnecessary parsing mechanisms.
The following performance benchmarks are based on torvalds/linux.git:
When hit the fast-path:
Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --object-only HEAD
Time (mean ± σ): 83.6 ms ± 2.0 ms [User: 59.4 ms, System: 24.1 ms]
Range (min … max): 80.4 ms … 87.2 ms 35 runs
Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(objectname)' HEAD
Time (mean ± σ): 84.1 ms ± 1.8 ms [User: 61.7 ms, System: 22.3 ms]
Range (min … max): 80.9 ms … 87.5 ms 35 runs
But for a customized format, it will be slower:
Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='oid: %(objectname)' HEAD
Time (mean ± σ): 96.5 ms ± 2.5 ms [User: 72.9 ms, System: 23.5 ms]
Range (min … max): 93.1 ms … 104.1 ms 31 runs
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a --format option to ls-tree. It has an existing default output,
and then --long and --name-only options to emit the default output
along with the objectsize and, or to only emit object paths.
Rather than add --type-only, --object-only etc. we can just support a
--format using a strbuf_expand() similar to "for-each-ref
--format". We might still add such options in the future for
convenience.
The --format implementation is slower than the existing code, but this
change does not cause any performance regressions. We'll leave the
existing show_tree() unchanged, and only run show_tree_fmt() in if
a --format different than the hardcoded built-in ones corresponding to
the existing modes is provided.
I.e. something like the "--long" output would be much slower with
this, mainly due to how we need to allocate various things to do with
quote.c instead of spewing the output directly to stdout.
The new option of '--format' comes from Ævar Arnfjörð Bjarmasonn's
idea and suggestion, this commit makes modifications in terms of the
original discussion on community [1].
In [1] there was a "GIT_TEST_LS_TREE_FORMAT_BACKEND" variable to
ensure that we had test coverage for passing tests that would
otherwise use show_tree() through show_tree_fmt(), and thus that the
formatting mechanism could handle all the same cases as the
non-formatting options.
Somewhere in subsequent re-rolls of that we seem to have drifted away
from what the goal of these tests should be. We're trying to ensure
correctness of show_tree_fmt(). We can't tell if we "hit [the]
fast-path" here, and instead of having an explicit test for that, we
can just add it to something our "test_ls_tree_format" tests for.
Here is the statistics about performance tests:
1. Default format (hitten the builtin formats):
"git ls-tree <tree-ish>" vs "--format='%(mode) %(type) %(object)%x09%(file)'"
$hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r HEAD"
Benchmark 1: /opt/git/master/bin/git ls-tree -r HEAD
Time (mean ± σ): 105.2 ms ± 3.3 ms [User: 84.3 ms, System: 20.8 ms]
Range (min … max): 99.2 ms … 113.2 ms 28 runs
$hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)' HEAD"
Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object)%x09%(file)' HEAD
Time (mean ± σ): 106.4 ms ± 2.7 ms [User: 86.1 ms, System: 20.2 ms]
Range (min … max): 100.2 ms … 110.5 ms 29 runs
2. Default format includes object size (hitten the builtin formats):
"git ls-tree -l <tree-ish>" vs "--format='%(mode) %(type) %(object) %(size:padded)%x09%(file)'"
$hyperfine --warmup=10 "/opt/git/master/bin/git ls-tree -r -l HEAD"
Benchmark 1: /opt/git/master/bin/git ls-tree -r -l HEAD
Time (mean ± σ): 335.1 ms ± 6.5 ms [User: 304.6 ms, System: 30.4 ms]
Range (min … max): 327.5 ms … 348.4 ms 10 runs
$hyperfine --warmup=10 "/opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)' HEAD"
Benchmark 1: /opt/git/ls-tree-oid-only/bin/git ls-tree -r --format='%(mode) %(type) %(object) %(size:padded)%x09%(file)' HEAD
Time (mean ± σ): 337.2 ms ± 8.2 ms [User: 309.2 ms, System: 27.9 ms]
Range (min … max): 328.8 ms … 349.4 ms 10 runs
Links:
[1] https://public-inbox.org/git/RFC-patch-6.7-eac299f06ff-20211217T131635Z-avarab@gmail.com/
[2] https://lore.kernel.org/git/cb717d08be87e3239117c6c667cb32caabaad33d.1646390152.git.dyroneteng@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A convenient way to pad strings is to use something like
`strbuf_addf(&buf, "%20s", "Hello, world!")`.
However, the Coccinelle rule that forbids a format `"%s"` with a
constant string argument cast too wide a net, and also forbade such
padding.
The original rule was introduced by commit:
28c23cd4c3 (strbuf.cocci: suggest strbuf_addbuf() to add one strbuf to an other, 2019-01-25)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"show_tree_data" is a struct that packages the necessary fields for
"show_tree()". This commit is a pre-prepared commit for supporting
"--format" option and it does not affect any existing functionality.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a non-functional change, we introduce an enum "ls_tree_cmdmode"
then use it to mark which columns to output.
This has the advantage of making the show_tree logic simpler and more
readable, as well as making it easier to extend new options (for example,
if we want to add a "--object-only" option, we just need to add a similar
"short-circuit logic in "show_tree()").
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we execute "git ls-tree" with combined "--name-only" and "--long"
, only the pathname will be printed, the size is omitted (the original
discoverer was Peff in [1]).
This commit fix this issue by using `OPT_CMDMODE()` instead to make both
of them mutually exclusive.
[1] https://public-inbox.org/git/YZK0MKCYAJmG+pSU@coredump.intra.peff.net/
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the object_type() function to determine the object type from the
"mode" passed to us by read_tree(), instead of doing so with the S_*()
macros.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyronetengb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The variable which "show_tree()" return is named "retval", a name that's
a little hard to understand. The commit rename "retval" to "recurse"
which is a more meaningful name than before in the context. We do not
need to take a look at "read_tree_at()" in "tree.c" to make sure what
does "retval" mean.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct strbuf"'s "len" member is a "size_t", not an "int", so
let's change our corresponding types accordingly. This also changes
the "len" and "speclen" variables, which are likewise used to store
the return value of strlen(), which returns "size_t", not "int".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the ls-tree.c code to use type_name() on the enum instead of
using the string constants. This doesn't matter either way for
performance, but makes this a bit easier to read as we'll no longer
need a strcmp() here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add missing {} to the "else" arms in show_tree() per the
CodingGuidelines.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove code added in f35a6d3bce (Teach core object handling functions
about gitlinks, 2007-04-09), later patched in 7d0b18a4da (Add output
flushing before fork(), 2008-08-04), and then finally ending up in its
current form in d3bee161fe (tree.c: allow read_tree_recursive() to
traverse gitlink entries, 2009-01-25). All while being commented-out!
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --name-status synonym for --name-only added in
c639a5548a (ls-tree: --name-only, 2005-12-01) had no tests, let's
make sure it works the same way as its sibling.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The single-key interactive operation used by "git add -p" has been
made more robust.
* pw/single-key-interactive:
add -p: disable stdin buffering when interactive.singlekey is set
terminal: set VMIN and VTIME in non-canonical mode
terminal: pop signal handler when terminal is restored
terminal: always reset terminal when reading without echo
Bundle file format gets extended to allow a partial bundle,
filtered by similar criteria you would give when making a
partial/lazy clone.
* ds/partial-bundles:
clone: fail gracefully when cloning filtered bundle
bundle: unbundle promisor packs
bundle: create filtered bundles
rev-list: move --filter parsing into revision.c
bundle: parse filter capability
list-objects: handle NULL function pointers
MyFirstObjectWalk: update recommended usage
list-objects: consolidate traverse_commit_list[_filtered]
pack-bitmap: drop filter in prepare_bitmap_walk()
pack-objects: use rev.filter when possible
revision: put object filter into struct rev_info
list-objects-filter-options: create copy helper
index-pack: document and test the --promisor option
The method to trigger malloc check used in our tests no longer work
with newer versions of glibc.
* ep/test-malloc-check-with-glibc-2.34:
test-lib: declare local variables as local
test-lib.sh: Use GLIBC_TUNABLES instead of MALLOC_CHECK_ on glibc >= 2.34
Test fixes.
* sm/no-git-in-upstream-of-pipe-in-tests:
t0030-t0050: avoid pipes with Git on LHS
t0001-t0028: avoid pipes with Git on LHS
t0003: avoid pipes with Git on LHS
Single perforce branch might be sync'ed multiple times with different
revision numbers, so it will be seen to Git as complete different
commits. This can be done by the following command:
git p4 sync --branch=NAME //perforce/path...
It is assumed, that this command applied multiple times and
peforce repository changes between command invocations.
In such situation, git p4 will see multiple perforce branches with
same name and different revision numbers. The problem is that to make
a shelve, git-p4 script will try to find "origin" branch, if not
specified in command line explicitly. And previously script selected
any branch with same name and don't mention particular revision number.
Later this may cause failure of the command "git diff-tree -r $rev^ $rev",
so shelve can't be created (due to wrong origin branch/commit).
This commit fixes the heuristic by which git p4 selects origin branch:
first it tries to select branch with same perforce path and perforce
revision, and if it fails, then selects branch with only same perforce
path (ignoring perforce revision number).
Signed-off-by: Kirill Frolov <k.frolov@samsung.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase A B" where B is not a commit should behave as if the
HEAD got detached at B and then the detached HEAD got rebased on top
of A. A bug however overwrites the current branch to point at B,
when B is a descendant of A (i.e. the rebase ends up being a
fast-forward). See [1] for the original bug report.
The callstack from checkout_up_to_date() is the following:
cmd_rebase()
-> checkout_up_to_date()
-> reset_head()
-> update_refs()
-> update_ref()
When B is not a valid branch but an oid, rebase sets the head_name
of rebase_options to NULL. This value gets passed down this call
chain through the branch member of reset_head_opts also getting set
to NULL all the way to update_refs().
Then update_refs() checks ropts.branch to decide whether or not to switch
branches. If ropts.branch is NULL, it calls update_ref() to update HEAD.
At this point however, from rebase's point of view, we want a detached
HEAD. But, since checkout_up_to_date() does not set the RESET_HEAD_DETACH
flag, the update_ref() call will deference HEAD and update the branch its
pointing to. We want the HEAD detached at B instead.
Fix this bug by adding the RESET_HEAD_DETACH flag in
checkout_up_to_date if B is not a valid branch, so that once
reset_head() calls update_refs(), it calls update_ref() with
REF_NO_DEREF which updates HEAD directly intead of deferencing it
and updating the branch that HEAD points to.
Also add a test to ensure the correct behavior.
[1] https://lore.kernel.org/git/YiokTm3GxIZQQUow@newk/
Reported-by: Michael McClimon <michael@mcclimon.org>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To prepare for the next commit that will test rebase with oids instead
of branch names, update the rebase setup test to add a couple of tags we
can use. This uses the test_commit helper so we can replace some lines
that add a commit manually.
Setting logAllRefUpdates is not necessary because it's on by default for
repositories with a working tree.
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "reflog exists" command added in afcb2e7a3b (git-reflog:
add exists command, 2015-07-21) to use parse_options() instead of its
own custom command-line parser. This continues work started in
33d7bdd645 (builtin/reflog.c: use parse-options api for expire,
delete subcommands, 2022-01-06).
As a result we'll understand the --end-of-options synonym for "--", so
let's test for that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make use of the guaranteed pretty alignment of "-h" output added in my
4631cfc20b (parse-options: properly align continued usage output,
2021-09-21) and wrap and format the "git reflog [expire|delete] -h"
usage output. Also add the missing "--single-worktree" option, as well
as adding other things that were in the SYNOPSIS output, but not in
the "-h" output.
This was last touched in 33d7bdd645 (builtin/reflog.c: use
parse-options api for expire, delete subcommands, 2022-01-06), but in
that commit the previous usage() output was faithfully
reproduced. Let's follow-up on that and make this even easier to read.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the "usage" variables in builtin/reflog.c to the top of the file,
in preparation for later commits defining a common "reflog_usage" in
terms of some of these strings, as was done in
8757b35d44 (commit-graph: define common usage with a macro,
2021-08-23).
While we're at it let's make them "const char *const", as is the
convention with these "usage" variables.
The use of macros here is a bit odd, but in subsequent commits we'll
make these use the same pattern as builtin/commit-graph.c uses since
8757b35d44 (commit-graph: define common usage with a macro,
2021-08-23).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were a few "git reflog exists" tests scattered over the test
suite, but let's consolidate the testing of the main functionality
into a new test file. This makes it easier to run just these tests
during development.
To do that amend and extend an existing test added in
afcb2e7a3b (git-reflog: add exists command, 2015-07-21). Let's use
"test_must_fail" instead of "!" (in case it segfaults), and test for
basic usage, an unknown option etc.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the "if" branches in cmd_reflog() to use "else if" instead,
and remove the whitespace between them.
As with 92f480909f (multi-pack-index: refactor "goto usage" pattern,
2021-08-23) this makes this code more consistent with how
builtin/{bundle,stash,commit-graph,multi-pack-index}.c look and
behave. Their top-level commands are all similar sub-command routing
functions.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reflog.c was lib-ified in 7d3d226e70 (reflog: libify delete
reflog function and helpers, 2022-03-02) these previously "static"
functions were made non-"static", but the argument lists were not
correspondingly indented according to our usual coding style. Let's do
that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reset_repository_shallow() is called, Git clears its cache of
shallow information, so that if shallow information is re-requested, Git
will read fresh data from disk instead of reusing its stale cached data.
However, the cache of commit grafts is not likewise cleared, even though
there are commit grafts created from shallow information.
This means that if on-disk shallow information were to be updated and
then a commit-graft-using codepath were run (for example, a revision
walk), Git would be using stale commit graft information. This can be
seen from the test in this patch, in which Git performs a revision walk
(to check for changed submodules) after a fetch with --update-shallow.
Therefore, clear the cache of commit grafts whenever
reset_repository_shallow() is called.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the http tests to use "test_hook" insteadd of
"write_script". In both cases we can get rid of sub-shelling. For
"t/t5550-http-fetch-dumb.sh" add a trivial helper which sets up the
hook and calls "update-server-info".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the t5411/*.sh tests to use the test_hook helper instead of
"write_script". Unfortunately these tests do the setup and test across
different test_expect_success blocks, so we have to use
--clobber (implying --setup) for these.
Let's change those that can use a quoted here-doc to do so while we're
at it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the "test_hook" function to take options to disable and remove
hooks. Using the wrapper instead of getting the path and running
"chmod -x" or "rm" will make it easier to eventually emulate the same
behavior with config-based hooks.
Not all of these tests need that new mode, but since the rest are
either closely related or use the same "$HOOK" pattern let's convert
them too.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cd475b3b03 (refs: add ability for backends to special-case reading
of symbolic refs, 2022-03-01) when the "read_symbolic_ref" callback
was added we'd fall back on "refs_read_raw_ref" if there wasn't any
backend implementation of "read_symbolic_ref".
As discussed in the preceding commit this would only happen if we were
running the "debug" backend, e.g. in the "setup for ref completion"
test in t9902-completion.sh with:
GIT_TRACE_REFS=1 git fetch --no-tags other
Let's improve the trace output, but and also eliminate the
now-redundant refs_read_raw_ref() fallback case. As noted in the
preceding commit the "packed" backend will never call
refs_read_symbolic_ref() (nor is it ever going to). For any future
backend such as reftable it's OK to ask that they either implement
this (or a wrapper) themselves.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the stub BUG(...) functions previously used by the "struct
ref_storage_be refs_be_packed" backend.
We never call any functions in the packed backend by using it as a
"normal" primary ref store, instead we'll always initialize a "files"
backend ref-store.
It will then via the "packed_ref_store" member of "struct
files_ref_store" call selected functions in the "packed" backend, and
we'll in addition call others via wrappers in refs.c.
So while these would arguably give us *slightly* more meaningful error
messages we'll NULL the missing members in the initializer anyway, so
we'll reliably get a segfault if we're ever changing the backend and
having it call something it doesn't have.
So there's no need for this verbose boilerplate, and as shown in a
subsequent commit it might even lead to some confusion about the
packed backend being a "real" backend. Let's make it clear that it's
not.
As an aside, this also fixes a warning emitted by SunCC in at least
versions 12.5 and 12.6 of Oracle Developer Studio:
"refs/packed-backend.c", line 1599: warning: Function has no return statement : packed_create_symref
"refs/packed-backend.c", line 1606: warning: Function has no return statement : packed_rename_ref)
"refs/packed-backend.c", line 1613: warning: Function has no return statement : packed_copy_ref
"refs/packed-backend.c", line 1648: warning: Function has no return statement : packed_create_reflog
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a few miscellaneous non-designated initializer assignments to
use designated initializers.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the definition of the three refs backends we currently carry to
use designated initializers.
The "= NULL" assignments being retained here are redundant, and could
be removed, but let's keep them for clarity. All of these backends
define almost all fields, so we're not saving much in terms of line
count by omitting these, but e.g. for "refs_be_debug" it's immediately
apparent that we're omitting "init" when comparing its assignment to
the others.
This is a follow-up to similar work merged in bd4232fac3 (Merge
branch 'ab/struct-init', 2021-07-16), a4b9fb6a5c (Merge branch
'ab/designated-initializers-more', 2021-10-18) and a30321b9ea (Merge
branch 'ab/designated-initializers' into
ab/designated-initializers-more, 2021-09-27).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit f2a454e0a5 (unpack-trees: improve performance of
next_cache_entry, 2021-11-29).
The "hint" value was originally needed to improve performance in 'git reset
-- <pathspec>' caused by 'cache_bottom' lagging behind its correct value
when using a sparse index. The 'cache_bottom' tracking has since been
corrected, removing the need for an additional "pseudo-cache_bottom"
tracking variable.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct tracking of the 'cache_bottom' for cases where sparse directories
are present in the index.
BACKGROUND
----------
The 'unpack_trees_options.cache_bottom' is a variable that tracks the
in-progress "bottom" of the cache as 'unpack_trees()' iterates through the
contents of the index. Most importantly, this value informs the sequential
return values of 'next_cache_entry()' which, in the "diff cache" usage of
'unpack_callback()', are either unpacked as-is or are passed into the diff
machinery.
The 'cache_bottom' is intended to track the position of the first entry in
the index that has not yet been diffed or unpacked. It is advanced in two
main ways: either it is incremented when an index entry is marked as "used"
(in 'mark_ce_used()'), indicating that it was unpacked or diffed, or when a
directory is unpacked, in which case it is increased by an amount equaling
the number of index entries inside that tree.
In 17a1bb570b (unpack-trees: preserve cache_bottom, 2021-07-14), it was
identified that sparse directories posed a problem to the above
'cache_bottom' advancement logic - because a sparse directory was both an
index entry that could be "used" and a directory that can be unpacked, the
'cache_bottom' would be incremented too many times. To solve this problem,
the 'mark_ce_used()' advancement of 'cache_bottom' was skipped for sparse
directories.
INCORRECT CACHE_BOTTOM TRACKING
-------------------------------
Skipping the 'cache_bottom' advancement for sparse directories in
'mark_ce_used()' breaks down in two cases:
1. When the 'unpack_trees()' operation is *not* a "cache diff" (because the
directory contents-based incrementing of 'cache_bottom' does not happen).
2. When a cache diff is performed with a pathspec (because
'unpack_index_entry()' will unpack a sparse directory not matched by the
pathspec without performing the directory contents-based increment).
The former luckily does not appear to affect 'git' behavior, likely because
'cache_bottom' is largely unused (non-"cache diff" 'unpack_trees()' uses
'find_index_entry()' - rather than 'next_cache_entry()' - to find the index
entries to unpack).
The latter, however, causes 'cache_bottom' to "lag behind" its intended
position by an amount equal to the number of sparse directories unpacked so
far with 'unpack_index_entry()'. If a repository is structured such that any
sparse directories are ordered lexicographically *after* any
pathspec-matching directories, though, this issue won't present any adverse
behavior.
This was the case with the 't1092-sparse-checkout-compatibility.sh' tests
before the addition of the 'before/' sparse directory (ordered *before* the
in-cone 'deep/' directory), therefore sidestepping the issue. Once the
'before/' directory was added, though, 'cache_bottom' began to lag behind
its intended position, causing 'next_cache_entry()' to return index entries
it had already processed and, ultimately, an incorrect diff.
CORRECTING CACHE_BOTTOM
-----------------------
The problems observed in 't1092' come from 'cache_bottom' lagging behind in
cases where the cache tree-based advancement doesn't occur. To solve this,
then, the fix in 17a1bb570b is "reversed"; rather than skipping
'cache_bottom' advancement in 'mark_ce_used()', we skip the directory
contents-based advancement for sparse directories. Now, every index entry
can be accounted for in 'cache_bottom':
* if you're working with a single index entry, 'cache_bottom' is incremented
in 'mark_ce_used()'
* if you're working with a directory that contains index entries (but is not
one itself), 'cache_bottom' is incremented by the number of entries in
that directory.
Finally, change the 'test_expect_failure' tests in 't1092' failing due to
this bug back to 'test_expect_success'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a sparse directory 'before/' containing files 'a' and 'b' to the test
repo used in 't/t1092-sparse-checkout-compatibility.sh'. This is meant to
ensure that no sparse index integrations rely on the in-cone path(s) being
lexicographically first in the repo.
Unfortunately, some existing tests do not handle this repo architecture
properly:
* 'add outside sparse cone'
* 'status/add: outside sparse cone'
* 'reset with pathspecs inside sparse definition'
All three of these are due to the incorrect handling of the
'unpack_trees_options.cache_bottom' when performing a cache diff via
'unpack_trees'. This will be corrected in a future patch; in the meantime,
mark the tests with 'test_expect_failure'.
Finally, update the 'ls-files' and 'root directory cannot be sparse' tests
to include the 'before/' directory in their expected index contents.
Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
My a18d66cefb (diff.c: free "buf" in diff_words_flush(), 2022-03-04)
has what it retrospect is a rather obvious bug (I don't know what I
was thinking, if it all): We use the "emitted_symbols" allocation in
append_emitted_diff_symbol() N times, but starting with a18d66cefb
we'd free it after its first use!
The correct way to free this data would have been to add the free() to
the existing free_diff_words_data() function, so let's do that. The
"ecbdata->diff_words->opt->emitted_symbols" might be NULL, so let's
add a trivial free_emitted_diff_symbols() helper next to the function
that appends to it.
This fixes the "no effect on show from" leak tested for in the
preceding commit. Perhaps confusingly this change will skip that test
under SANITIZE=leak, but otherwise opt-in the
"t4015-diff-whitespace.sh" test.
The reason is that a18d66cefb "fixed" the leak in the preceding "no
effect on diff" test, but for the first call to diff_words_flush() the
"wol->buf" would be NULL, so we wouldn't double-free (and
SANITIZE=address would see nothing amiss). With this change we'll
still pass that test, showing that we've also fixed leaks on this
codepath.
We then have to skip the new "no effect on show" test because it
happens to trip over an unrelated memory leak (in revision.c). The
same goes for "move detection with submodules". Both of them pass with
SANITIZE=address though, which would error on the "no effect on show"
test before this change.
Reported-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a failing test which demonstrates a regression in
a18d66cefb ("diff.c: free "buf" in diff_words_flush()", 2022-03-04),
the regression is discussed in detail in the subsequent commit. With
it running `git show --word-diff --color-moved` with SANITIZE=address
would emit:
==31191==ERROR: AddressSanitizer: attempting double-free on 0x617000021100 in thread T0:
#0 0x49f0a2 in free (git+0x49f0a2)
#1 0x9b0e4d in diff_words_flush diff.c:2153:3
#2 0x9aed5d in fn_out_consume diff.c:2354:3
#3 0xe092ab in consume_one xdiff-interface.c:43:9
#4 0xe072eb in xdiff_outf xdiff-interface.c:76:10
#5 0xec7014 in xdl_emit_diffrec xdiff/xutils.c:53:6
[...]
0x617000021100 is located 0 bytes inside of 768-byte region [0x617000021100,0x617000021400)
freed by thread T0 here:
#0 0x49f0a2 in free (git+0x49f0a2)
[...(same stacktrace)...]
previously allocated by thread T0 here:
#0 0x49f603 in __interceptor_realloc (git+0x49f603)
#1 0xde4da4 in xrealloc wrapper.c:126:8
#2 0x995dc5 in append_emitted_diff_symbol diff.c:794:2
#3 0x96c44a in emit_diff_symbol diff.c:1527:3
[...]
This was not caught by the test suite because we test `diff
--word-diff --color-moved` only so far.
Therefore, add a test for `show`, too.
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make use of "test_hook" in various cases that didn't fit neatly into
preceding commits. Here we need to indent blocks in addition to
changing the test code, or to make other small cosmetic changes.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change tests that used a "mkdir -p .git/hooks && write_script" pattern
to use the new "test_hook" helper instead. The new helper does not
create the .git/hooks directory, rather we assume that the default
template will do so for us.
An upcoming series[1] will extend "test_hook" to operate in a
"--template=" mode, but for now assuming that we have a .git/hooks
already is a safe assumption. If that assumption becomes false in the
future we'll only need to change 'test_hook", instead of all of these
callsites.
1. https://lore.kernel.org/git/cover-00.13-00000000000-20211212T201308Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor various test code to use the "test_hook" helper. This change:
- Fixes the long-standing issues with those tests using "#!/bin/sh"
instead of "#!$SHELL_PATH". Using "#!/bin/sh" here happened to work
because this code was so simple that it e.g. worked on Solaris
/bin/sh.
- Removes the "mkdir .git/hooks" invocation, as explained in a
preceding commit we'll rely on the default templates to create that
directory for us.
For the test in "t5402-post-merge-hook.sh" it's easier and more
correct to unroll the for-loop into a test_expect_success, so let's do
that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the repository setup code for tests that test hooks the use
of sub-shells when setting up the test repository and hooks, and use
the "test_hook" wrapper instead of "write_scripts".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "t5516-fetch-push.sh" test code to make use of the new
"test_hook" helper, and to use "test_when_finished" to have tests
clean up their own state, instead of relying on subsequent tests to
clean the trash directory.
Before this each test would have been responsible for cleaning up
after a preceding test (which may or may not have run, e.g. if --run
or "GIT_SKIP_TESTS" was used), now each test will instead clean up
after itself.
In order to use both "test_hook" and "test_when_finished" we need to
move them out of sub-shells, which requires some refactoring.
While we're at it split up the "push with negotiation" test, now the
middle of the test doesn't need to "rm event", and since it delimited
two halves that were testing two different things the end-state is
easier to read and reason about.
While changing these lines make the minor change from "-fr" to "-rf"
as the "rm" argument, some of them used it already, it's more common
in the test suite, and it leaves the end-state of the file with more
consistency.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend a test added in 788a776069 (bugreport: collect list of
populated hooks, 2020-05-07) to "test_cmp" for the expected output,
instead of selectively using "grep" to check for specific things we
either expect or don't expect in the output.
As noted in a preceding commit our .git/hooks directory already
contains *.sample hooks, so we have no need to clobber the
prepare-commit-msg.sample hook in particular.
Instead we should assert that those *.sample hooks are not included in
the output, and for good measure let's add a new "unknown-hook", to
check that we only look through our own known hooks. See
cfe853e66b (hook-list.h: add a generated list of hooks, like
config-list.h, 2021-09-26) for how we generate that data.
We're intentionally not piping the "actual" output through "sort" or
similar, we'd also like to check that our reported hooks are sorted.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop moving the .git/hooks directory out of the way, or creating it
during test setup. Instead assume that it will contain
harmless *.sample files.
That we can assume that is discussed in point #4 of
f0d4d398e2 (test-lib: split up and deprecate test_create_repo(),
2021-05-10), those parts of this could and should have been done in
that change.
Removing the "mkdir -p" here will then validate that our templates are
being used, since we'd subsequently fail to create a hook in that
directory if it didn't exist. Subsequent commits will have those hooks
created by a "test_hook" wrapper, which will then being doing that
same validation.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code added in a87679339c (test: rename http fetch and push
test files, 2014-02-06) to stop relying on the "exec git
update-server-info" in "templates/hooks--post-update.sample", let's
instead inline the expected hook in the test itself.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend a test added in 96e7225b31 (hook: add 'run' subcommand,
2021-12-22) to use a for-loop instead of a copy/pasting the same test
for the four exit codes we test.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a "test_hook" wrapper similar to the existing "test_config"
wrapper added in d960c47a88 (test-lib: add helper functions for
config, 2011-08-17).
This wrapper:
- Will clean up the hook with "test_when_finished", unless --setup is
provided.
- Will error if we clobber a hook, unless --clobber is provided.
- Takes a name like "update" instead of ".git/hooks/update".
- Accepts -C <dir>, like "test_config" and "test_commit".
By using a wrapper we'll be able to easily change all the hook-related
code that assumes that the template-created ".git/hooks" directory is
created by "init", "clone" etc. once another topic follows-up and
changes the test suite to stop creating trash directories using those
templates.
In addition this will make it easy to have the hooks configured using
the "configuration-based hooks" topic, once we get around to
integrating that. I.e. we'll be able to run the tests in a mode where
we sometimes create a .git/hooks/<name>, and other times create a
script in another location, and point the relevant configuration
snippet to it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Count string_list items in size_t, not "unsigned int".
* ab/string-list-count-in-size-t:
string-list API: change "nr" and "alloc" to "size_t"
gettext API users: don't explicitly cast ngettext()'s "n"
Code clean-up to allow callers of run_commit_hook() to learn if it
got "success" because the hook succeeded or because there wasn't
any hook.
* ab/racy-hooks:
hooks: fix an obscure TOCTOU "did we just run a hook?" race
merge: don't run post-hook logic on --no-verify
Teach "test-chmtime" to work on a directory and use it to avoid
having to wait for a second in a few places in tests.
* tk/t7063-chmtime-dirs-too:
t7063: mtime-mangling instead of delays in untracked cache testing
t/helper/test-chmtime: update mingw to support chmtime on directories
Fixes to the way generation number v2 in the commit-graph files are
(not) handled.
* ds/commit-graph-gen-v2-fixes:
commit-graph: declare bankruptcy on GDAT chunks
commit-graph: fix generation number v2 overflow values
commit-graph: start parsing generation v2 (again)
commit-graph: fix ordering bug in generation numbers
t5318: extract helpers to lib-commit-graph.sh
test-read-graph: include extra post-parse info
"git stash drop" is reimplemented as an internal call to
reflog_delete() function, instead of invoking "git reflog delete"
via run_command() API.
* jc/stash-drop:
stash: call reflog_delete() in reflog.c
reflog: libify delete reflog function and helpers
stash: add tests to ensure reflog --rewrite --updatref behavior
"git remote rename A B", depending on the number of remote-tracking
refs involved, takes long time renaming them. The command has been
taught to show progress bar while making the user wait.
* tb/rename-remote-progress:
builtin/remote.c: show progress when renaming remote references
builtin/remote.c: parse options in 'rename'
"git read-tree" has been made to be aware of the sparse-index
feature.
* vd/sparse-read-tree:
read-tree: make three-way merge sparse-aware
read-tree: make two-way merge sparse-aware
read-tree: narrow scope of index expansion for '--prefix'
read-tree: integrate with sparse index
read-tree: expand sparse checkout test coverage
read-tree: explicitly disallow prefixes with a leading '/'
status: fix nested sparse directory diff in sparse index
sparse-index: prevent repo root from becoming sparse
Object-file API shuffling.
* ab/object-file-api-updates:
object-file API: pass an enum to read_object_with_reference()
object-file.c: add a literal version of write_object_file_prepare()
object-file API: have hash_object_file() take "enum object_type"
object API: rename hash_object_file_literally() to write_*()
object-file API: split up and simplify check_object_signature()
object API users + docs: check <0, not !0 with check_object_signature()
object API docs: move check_object_signature() docs to cache.h
object API: correct "buf" v.s. "map" mismatch in *.c and *.h
object-file API: have write_object_file() take "enum object_type"
object-file API: add a format_object_header() function
object-file API: return "void", not "int" from hash_object_file()
object-file.c: split up declaration of unrelated variables
Various optimization for "git fetch".
* ps/fetch-mirror-optim:
refs/files-backend: optimize reading of symbolic refs
remote: read symbolic refs via `refs_read_symbolic_ref()`
refs: add ability for backends to special-case reading of symbolic refs
fetch: avoid lookup of commits when not appending to FETCH_HEAD
upload-pack: look up "want" lines via commit-graph
The untracked cache newly computed weren't written back to the
on-disk index file when there is no other change to the index,
which has been corrected.
* tk/empty-untracked-cache:
untracked-cache: write index when populating empty untracked cache
t7519: populate untracked cache before test
t7519: avoid file to index mtime race for untracked cache
When check_has_commit() is called on a missing submodule, initialization
of the struct repository fails, but it attempts to clear the struct
anyway (which is a fatal error). This bug is masked by its only caller,
submodule_has_commits(), first calling add_submodule_odb(). The latter
fails if the submodule does not exist, making submodule_has_commits()
exit early and not invoke check_has_commit().
Fix this bug, and because calling add_submodule_odb() is no longer
necessary as of 13a2f620b2 (submodule: pass repo to
check_has_commit(), 2021-10-08), remove that call too.
This is the last caller of add_submodule_odb(), so remove that
function. (Submodule ODBs are still added as alternates via
add_submodule_odb_by_path().)
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch --recurse-submodules" only considers populated
submodules (i.e. submodules that can be found by iterating the index),
which makes "git fetch" behave differently based on which commit is
checked out. As a result, even if the user has initialized all submodules
correctly, they may not fetch the necessary submodule commits, and
commands like "git checkout --recurse-submodules" might fail.
Teach "git fetch" to fetch cloned, changed submodules regardless of
whether they are populated. This is in addition to the current behavior
of fetching populated submodules (which is always attempted regardless
of what was fetched in the superproject, or even if nothing was fetched
in the superproject).
A submodule may be encountered multiple times (via the list of
populated submodules or via the list of changed submodules). When this
happens, "git fetch" only reads the 'populated copy' and ignores the
'changed copy'. Amend the verify_fetch_result() test helper so that we
can assert on which 'copy' is being read.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rearrange functions so that submodule_update() no longer needs to be
forward declared.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch completes the conversion past the flag parsing of
`submodule update` by introducing a helper subcommand called
`submodule--helper update`. The behaviour of `submodule update` should
remain the same after this patch.
Prior to this patch, `submodule update` was implemented by piping the
output of `update-clone` (which clones all missing submodules, then
prints relevant information for all submodules) into
`run-update-procedure` (which reads the information and updates the
submodule tree).
With `submodule--helper update`, we iterate over the submodules and
update the submodule tree in the same process. This reuses most of
existing code structure, except that `update_submodule()` now updates
the submodule tree (instead of printing submodule information to be
consumed by another process).
Recursing on a submodule is done by calling a subprocess that launches
`submodule--helper update`, with a modified `--recursive-prefix` and
`--prefix` parameter.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A later commit will combine the "update-clone" and
"run-update-procedure" commands, so run_update_procedure() will be
removed. Prepare for this by moving as much logic as possible out of
run_update_procedure() and into update_submodule2().
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor 'struct update_data' to hold the parsed args needed by "git
submodule--helper update" and refactor "update-clone" and
"run-update-procedure" (the functions that will be combined to form
"update") to use these options.
For "run-update-procedure", 'struct update_data' already holds its args,
so only arg parsing code needs to be updated.
For "update-clone", move its args from 'struct submodule_update_clone'
into 'struct update_data', and replace them with a pointer to 'struct
update_data'. Its other members hold the submodule iteration state of
"update-clone", so those are unchanged.
Incidentally, since we reformat the designated initializers of the
affected structs, also reformat MODULE_CLONE_DATA_INIT for consistency.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later commit, update_clone()'s options will be stored in a struct
update_data instead of submodule_update_clone. Preemptively rename the
options struct to "opt" to shrink that commit's diff.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use die_message() to print the "fatal: " prefix instead of doing it in
git-submodule.sh and remove a now-unnecessary exit code from "git
submodule--helper run-update-procedure".
Also, since die_message() adds the newline for us, replace an invocation
of die_with_status() with printf + exit invocations that do not add a
newline, but are otherwise identical to die_with_status().
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We switch to using the run-command API function that takes a
'struct child process', since we are using a lot of the options. This
will also make it simple to switch over to using 'capture_command()'
when we start handling the output of the command completely in C.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* gc/submodule-update-part1:
submodule--helper update-clone: check for --filter and --init
submodule update: add tests for --filter
submodule--helper: remove ensure-core-worktree
submodule--helper update-clone: learn --init
submodule--helper: allow setting superprefix for init_submodule()
submodule--helper: refactor get_submodule_displaypath()
submodule--helper run-update-procedure: learn --remote
submodule--helper: don't use bitfield indirection for parse_options()
submodule--helper: get remote names from any repository
submodule--helper run-update-procedure: remove --suboid
submodule--helper: reorganize code for sh to C conversion
submodule--helper: remove update-module-mode
submodule tests: test for init and update failure output
If the user suspends git while it is waiting for a keypress reset the
terminal before stopping and restore the settings when git resumes. If
the user tries to resume in the background print an error
message (taking care to use async safe functions) before stopping
again. Ideally we would reprint the prompt for the user when git
resumes but this patch just restarts the read().
The signal handler is established with sigaction() rather than using
sigchain_push() as this allows us to control the signal mask when the
handler is invoked and ensure SA_RESTART is used to restart the
read() when resuming.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On macos the builtin "add -p" does not handle keys that generate
escape sequences because poll() does not work with terminals
there. Switch to using select() on non-windows platforms to work
around this.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_key_without_echo() reads from stdin but uses /dev/tty when it
disables echo. This is unfortunate as there no guarantee that stdin is
the same device as /dev/tty. The perl version of "add -p" uses stdin
when it sets the terminal mode, this commit does the same for the
builtin version. There is still a difference between the perl and
builtin versions though - the perl version will ignore any errors when
setting the terminal mode[1] and will still read single bytes when
stdin is not a terminal. The builtin version displays a warning if
setting the terminal mode fails and switches to reading a line at a
time.
[1] b061c913bb/ReadKey.xs (L1090)
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The next commit will add another flag in addition to the existing
full_duplex so change the function signature to take a flags
argument. Also alter the functions that call save_term() so that they
can pass flags down to it.
The choice to use an enum for tho bitwise flags is because gdb will
display the symbolic names of all the flags that are set rather than
the integer value.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a blobless-cloned repo, `git log --follow -- <path>` (`<path>` have
an exact OID rename) shouldn't download blob of the file from where the
new file is renamed.
Add a test case to verify it.
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of creating a new allocation, reverse the original list
in-place by calling the reverse_commit_list() helper.
The original code discards the list "bases" after storing its
reverse copy in a newly created list "reversed". If the code that
followed from here used both "bases" and "reversed", the
modification would not have worked, but since the original list
"bases" gets discarded, we can simply reverse "bases" in-place with
the reverse_commit_list() helper and reuse the same variable in the
code that follows.
builtin/merge.c has been left unmodified, since in its case, the
original list is needed separately from its reverse copy by the
code.
Signed-off-by: Jayati Shrivastava <gaurijove@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If no --args are present after 'git restore', it assumes that you
want to tab-complete one of the files with unstaged uncommitted
changes.
If a file has been staged, we don't want to list it, as restoring those
requires a slightly more complex `git restore --staged`, so we only list
those files that are --modified. While --committable also looks like
a good candidate, that includes changes that have been staged.
Signed-off-by: David Cantrell <david@cantrell.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing both loose and packed references to disk we first create a
lockfile, write the updated values into that lockfile, and on commit we
rename the file into place. According to filesystem developers, this
behaviour is broken because applications should always sync data to disk
before doing the final rename to ensure data consistency [1][2][3]. If
applications fail to do this correctly, a hard crash of the machine can
easily result in corrupted on-disk data.
This kind of corruption can in fact be easily observed with Git when the
machine hard-resets shortly after writing references to disk. On
machines with ext4, this will likely lead to the "empty files" problem:
the file has been renamed, but its data has not been synced to disk. The
result is that the reference is corrupt, and in the worst case this can
lead to data loss.
Implement a new option to harden references so that users and admins can
avoid this scenario by syncing locked loose and packed references to
disk before we rename them into place.
[1]: https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
[2]: https://btrfs.wiki.kernel.org/index.php/FAQ (What are the crash guarantees of overwrite-by-rename)
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/ext4.rst (see auto_da_alloc)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ns/core-fsyncmethod:
core.fsync: documentation and user-friendly aggregate options
core.fsync: new option to harden the index
core.fsync: add configuration parsing
core.fsync: introduce granular fsync control infrastructure
core.fsyncmethod: add writeout-only mode
wrapper: make inclusion of Windows csprng header tightly scoped
This commit adds aggregate options for the core.fsync setting that are
more user-friendly. These options are specified in terms of 'levels of
safety', indicating which Git operations are considered to be sync
points for durability.
The new documentation is also included here in its entirety for ease of
review.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The synopsis for 'git maintenance' did not include the commands other
than the 'run' command. Update this to include the others. The 'start'
command is the only one of these that parses additional options, and
then only the --scheduler option.
Also move the 'register' command down after 'stop' and before
'unregister' for a logical grouping of the commands instead of an
alphabetical one. The diff makes it look as three other commands are
moved up.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When format is passed into --batch, --batch-check, --batch-command,
the format gets expanded. When nothing is passed in, the default format
is set and the expand_format() gets called.
We can save on these cycles by hardcoding how to print the
information when nothing is passed as the format, or when the default
format is passed. There is no need for the fully expanded format with
the default. Since batch_object_write() happens on every object provided
in batch mode, we get a nice performance improvement.
git rev-list --all > /tmp/all-obj.txt
git cat-file --batch-check </tmp/all-obj.txt
with HEAD^:
Time (mean ± σ): 57.6 ms ± 1.7 ms [User: 51.5 ms, System: 6.2 ms]
Range (min … max): 54.6 ms … 64.7 ms 50 runs
with HEAD:
Time (mean ± σ): 49.8 ms ± 1.7 ms [User: 42.6 ms, System: 7.3 ms]
Range (min … max): 46.9 ms … 55.9 ms 56 runs
If nothing is provided as a format argument, or if the default format is
passed, skip expanding of the format and print the object info with a
default format.
See https://lore.kernel.org/git/87eecf8ork.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the options '-q' and '--refresh' to the 'git reset' executed in
'reset_head()', and '--refresh' to the 'git reset -q' executed in
'do_push_stash(...)'.
'stash' is implemented such that git commands invoked as part of it (e.g.,
'clean', 'read-tree', 'reset', etc.) have their informational output
silenced. However, the 'reset' in 'reset_head()' is *not* called with '-q',
leading to the potential for a misleading printout from 'git stash apply
--index' if the stash included a removed file:
Unstaged changes after reset: D <deleted file>
Not only is this confusing in its own right (since, after the reset, 'git
stash' execution would stage the deletion in the index), it would be printed
even when the stash was applied with the '-q' option. As a result, the
messaging is removed entirely by calling 'git status' with '-q'.
Additionally, because the default behavior of 'git reset -q' is to skip
refreshing the index, but later operations in 'git stash' subcommands expect
a non-stale index, enable '--refresh' as well.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If using '--quiet' or 'reset.quiet=true', do not print the 'resetnoRefresh'
advice string. For applications that rely on '--quiet' disabling all
non-error logs, the advice message should be suppressed accordingly.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace references to '--quiet' with '--no-refresh' in the advice on how to
skip refreshing the index. When the advice was introduced, '--quiet' was the
only way to avoid the expensive 'refresh_index(...)' at the end of a mixed
reset. After introducing '--no-refresh', however, '--quiet' became only a
fallback option for determining refresh behavior, overridden by
'--[no-]refresh' or 'reset.refresh' if either is set. To ensure users are
advised to use the most reliable option for avoiding 'refresh_index(...)',
replace recommendation of '--quiet' with '--[no-]refresh'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new --[no-]refresh option that is intended to explicitly determine
whether a mixed reset should end in an index refresh.
Starting at 9ac8125d1a (reset: don't compute unstaged changes after reset
when --quiet, 2018-10-23), using the '--quiet' option results in skipping
the call to 'refresh_index(...)' at the end of a mixed reset with the goal
of improving performance. However, by coupling behavior that modifies the
index with the option that silences logs, there is no way for users to have
one without the other (i.e., silenced logs with a refreshed index) without
incurring the overhead of a separate call to 'git update-index --refresh'.
Furthermore, there is minimal user-facing documentation indicating that
--quiet skips the index refresh, potentially leading to unexpected issues
executing commands after 'git reset --quiet' that do not themselves refresh
the index (e.g., internals of 'git stash', 'git read-tree').
To mitigate these issues, '--[no-]refresh' and 'reset.refresh' are
introduced to provide a dedicated mechanism for refreshing the index. When
either is set, '--quiet' and 'reset.quiet' revert to controlling only
whether logs are silenced and do not affect index refresh.
Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the advice describing index refresh from "enumerate unstaged changes"
to "refresh the index." Describing 'refresh_index(...)' as "enumerating
unstaged changes" is not fully representative of what an index refresh is
doing; more generally, it updates the properties of index entries that are
affected by outside-of-index state, e.g. CE_UPTODATE, which is affected by
the file contents on-disk. This distinction is relevant to operations that
read the index but do not refresh first - e.g., 'git read-tree' - where a
stale index may cause incorrect behavior.
In addition to changing the advice message, use the "advise" function to
print advice.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, git-repack(1) will update server info that is required by
the dumb HTTP transport. This can be skipped by passing the `-n` flag,
but what we're noticably missing is a config option to permanently
disable updating this information.
Add a new option "repack.updateServerInfo" which can be used to disable
the logic. Most hosting providers have turned off the dumb HTTP protocol
anyway, and on the client-side it woudln't typically be useful either.
Giving a persistent way to disable this feature thus makes quite some
sense to avoid wasting compute cycles and storage.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, git-repack(1) runs `update_server_info()` to generate info
required for the dumb HTTP protocol. This can be disabled via the `-n`
flag, which then sets the `no_update_server_info` flag. Further down the
code this leads to some double-negation logic, which is about to become
more confusing as we're about to add a new config which allows the user
to permanently disable generation of the info.
Refactor the code to avoid the double-negation and add some tests which
verify that the flag continues to work as expected.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plug random memory leaks.
* ab/plug-random-leaks:
repository.c: free the "path cache" in repo_clear()
range-diff: plug memory leak in read_patches()
range-diff: plug memory leak in common invocation
lockfile API users: simplify and don't leak "path"
commit-graph: stop fill_oids_from_packs() progress on error and free()
commit-graph: fix memory leak in misused string_list API
submodule--helper: fix trivial leak in module_add()
transport: stop needlessly copying bundle header references
bundle: call strvec_clear() on allocated strvec
remote-curl.c: free memory in cmd_main()
urlmatch.c: add and use a *_release() function
diff.c: free "buf" in diff_words_flush()
merge-base: free() allocated "struct commit **" list
index-pack: fix memory leaks
Newer version of GPGSM changed its output in a backward
incompatible way to break our code that parses its output. It also
added more processes our tests need to kill when cleaning up.
Adjustments have been made to accommodate these changes.
* fs/gpgsm-update:
t/lib-gpg: kill all gpg components, not just gpg-agent
t/lib-gpg: reload gpg components after updating trustlist
gpg-interface/gpgsm: fix for v2.3
Check the return value from parse_tree_indirect() to turn segfaults
into calls to die().
* gc/parse-tree-indirect-errors:
checkout, clone: die if tree cannot be parsed
Align the level of verbose output from the ort backend during inner
merge to that of the recursive backend.
* en/merge-ort-align-verbosity-with-recursive:
merge-ort: exclude messages from inner merges by default
Makefile refactoring with a bit of suffixes rule stripping to
optimize the runtime overhead.
* ab/make-optim-noop:
Makefiles: add and use wildcard "mkdir -p" template
Makefile: add "$(QUIET)" boilerplate to shared.mak
Makefile: move $(comma), $(empty) and $(space) to shared.mak
Makefile: move ".SUFFIXES" rule to shared.mak
Makefile: define $(LIB_H) in terms of $(FIND_SOURCE_FILES)
Makefile: disable GNU make built-in wildcard rules
Makefiles: add "shared.mak", move ".DELETE_ON_ERROR" to it
scalar Makefile: use "The default target of..." pattern
"git fetch" can make two separate fetches, but ref updates coming
from them were in two separate ref transactions under "--atomic",
which has been corrected.
* ps/fetch-atomic:
fetch: make `--atomic` flag cover pruning of refs
fetch: make `--atomic` flag cover backfilling of tags
refs: add interface to iterate over queued transactional updates
fetch: report errors when backfilling tags fails
fetch: control lifecycle of FETCH_HEAD in a single place
fetch: backfill tags before setting upstream
fetch: increase test coverage of fetches
The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.
The backquoted form is the traditional method for command
substitution, and is supported by POSIX. However, all but the
simplest uses become complicated quickly. In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.
The patch was generated by:
for _f in $(find . -name "*.sh")
do
shellcheck -i SC2006 -f diff ${_f} | ifne git apply -p2
done
and then carefully proof-read.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a commit in a sequence of linear history has a non-monotonically
increasing commit timestamp, git name-rev might not properly name the
commit.
This occurs because name-rev uses a heuristic of the commit date to
avoid searching down tags which lead to commits that are older than the
named commit. This is intended to avoid work on larger repositories.
This heuristic impacts git name-rev, and by extension git describe
--contains which is built on top of name-rev.
Further more, if --all or --annotate-stdin is used, the heuristic is not
enabled because the full history has to be analyzed anyways. This
results in some confusion if a user sees that --annotate-stdin works but
a normal name-rev does not.
If the repository has a commit graph, we can use the generation numbers
instead of using the commit dates. This is essentially the same check
except that generation numbers make it exact, where the commit date
heuristic could be incorrect due to clock errors.
Since we're extending the notion of cutoff to more than one variable,
create a series of functions for setting and checking the cutoff. This
avoids duplication and moves access of the global cutoff and
generation_cutoff to as few functions as possible.
Add several test cases including a test that covers the new commitGraph
behavior, as well as tests for --all and --annotate-stdin with and
without commitGraphs.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in my daf1d8285e (reflog expire: don't use
lookup_commit_reference_gently(), 2021-12-22), in changing from
lookup_commit_reference_gently() to lookup_commit() we stopped trying
to call deref_tag() and parse_object() on the provided OID, but we
also started returning non-NULL for the null_oid().
As a result we'd emit an error() via mark_reachable() later in this
function as we tried to invoke parse_commit() on it.
Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Tested-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The xfuncname pattern finds func/class declarations
in diffs to display as a hunk header. The word_regex
pattern finds individual tokens in Kotlin code to generate
appropriate diffs.
This patch adds xfuncname regex and word_regex for Kotlin
language.
Signed-off-by: Jaydeep P Das <jaydeepjd.8914@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pipes ignore error codes of LHS command and thus we should not use
them with Git in tests. As an alternative, use a 'tmp' file to write
the Git output so we can test the exit code.
Signed-off-by: Shubham Mishra <shivam828787@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pipes ignore error codes of LHS command and thus we should not use
them with Git in tests. As an alternative, use a 'tmp' file to write
the Git output so we can test the exit code.
Signed-off-by: Shubham Mishra <shivam828787@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit introduces the new ability for the user to harden
the index. In the event of a system crash, the index must be
durable for the user to actually find a file that has been added
to the repo and then deleted from the working tree.
We use the presence of the COMMIT_LOCK flag and absence of the
alternate_index_output as a proxy for determining whether we're
updating the persistent index of the repo or some temporary
index. We don't sync these temporary indexes.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change introduces code to parse the core.fsync setting and
configure the fsync_components variable.
core.fsync is configured as a comma-separated list of component names to
sync. Each time a core.fsync variable is encountered in the
configuration heirarchy, we start off with a clean state with the
platform default value. Passing 'none' resets the value to indicate
nothing will be synced. We gather all negative and positive entries from
the comma separated list and then compute the new value by removing all
the negative entries and adding all of the positive entries.
We issue a warning for components that are not recognized so that the
configuration code is compatible with configs from future versions of
Git with more repo components.
Complete documentation for the new setting is included in a later patch
in the series so that it can be reviewed once in final form.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit introduces the infrastructure for the core.fsync
configuration knob. The repository components we want to sync
are identified by flags so that we can turn on or off syncing
for specific components.
If core.fsyncObjectFiles is set and the core.fsync configuration
also includes FSYNC_COMPONENT_LOOSE_OBJECT, we will fsync any
loose objects. This picks the strictest data integrity behavior
if core.fsync and core.fsyncObjectFiles are set to conflicting values.
This change introduces the currently unused fsync_component
helper, which will be used by a later patch that adds fsyncing to
the refs backend.
Actual configuration and documentation of the fsync components
list are in other patches in the series to separate review of
the underlying mechanism from the policy of how it's configured.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit introduces the `core.fsyncMethod` configuration
knob, which can currently be set to `fsync` or `writeout-only`.
The new writeout-only mode attempts to tell the operating system to
flush its in-memory page cache to the storage hardware without issuing a
CACHE_FLUSH command to the storage controller.
Writeout-only fsync is significantly faster than a vanilla fsync on
common hardware, since data is written to a disk-side cache rather than
all the way to a durable medium. Later changes in this patch series will
take advantage of this primitive to implement batching of hardware
flushes.
When git_fsync is called with FSYNC_WRITEOUT_ONLY, it may fail and the
caller is expected to do an ordinary fsync as needed.
On Apple platforms, the fsync system call does not issue a CACHE_FLUSH
directive to the storage controller. This change updates fsync to do
fcntl(F_FULLFSYNC) to make fsync actually durable. We maintain parity
with existing behavior on Apple platforms by setting the default value
of the new core.fsyncMethod option.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Including NTSecAPI.h in git-compat-util.h causes build errors in any
other file that includes winternl.h. NTSecAPI.h was included in order to
get access to the RtlGenRandom cryptographically secure PRNG. This
change scopes the inclusion of ntsecapi.h to wrapper.c, which is the only
place that it's actually needed.
The build breakage is due to the definition of UNICODE_STRING in
NtSecApi.h:
#ifndef _NTDEF_
typedef LSA_UNICODE_STRING UNICODE_STRING, *PUNICODE_STRING;
typedef LSA_STRING STRING, *PSTRING ;
#endif
LsaLookup.h:
typedef struct _LSA_UNICODE_STRING {
USHORT Length;
USHORT MaximumLength;
#ifdef MIDL_PASS
[size_is(MaximumLength/2), length_is(Length/2)]
#endif // MIDL_PASS
PWSTR Buffer;
} LSA_UNICODE_STRING, *PLSA_UNICODE_STRING;
winternl.h also defines UNICODE_STRING:
typedef struct _UNICODE_STRING {
USHORT Length;
USHORT MaximumLength;
PWSTR Buffer;
} UNICODE_STRING;
typedef UNICODE_STRING *PUNICODE_STRING;
Both definitions have equivalent layouts. Apparently these internal
Windows headers aren't designed to be included together. This is
an oversight in the headers and does not represent an incompatibility
between the APIs.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the block SHA-1 code, we have special assembly code for i386 and
amd64 to perform rotations with assembly. This is supposed to help pick
the correct rotation operation depending on which rotation is smaller,
which can help some systems perform slightly better, since any circular
rotation can be specified as either a rotate left or a rotate right.
However, this isn't needed, so we should remove it.
First, SHA-1, like SHA-2, uses fixed constant rotates. Thus, all
rotation amounts are known at compile time and are in fact baked into
the code. Fortunately, peephole optimizers recognize rotations
specified in the normal way and automatically emit the correct code,
including a preference for choosing a rotate left versus a rotate right.
This has been the case for well over a decade, and is a standard example
of the utility of a peephole optimizer.
Moreover, all modern CPUs, with the exception of extremely limited
embedded CPUs such as some Cortex-M processors, provide a barrel
shifter, which lets the CPU perform rotates of any bit amount in
constant time. This is valuable for many cryptographic algorithms to
improve performance, and is required to prevent timing attacks in
algorithms which use data-dependent rotations (which don't include the
hash algorithms we use). As a result, even though the compiler does the
correct optimization, it isn't even needed here and either a left or a
right rotate is equally acceptable.
In fact, the SHA-256 code already takes this into account and just
writes the simple code using an inline function to let the compiler
optimize it for us.
The downside of using this code, however, is that it uses a GCC
extension, which makes the compiler complain when using -pedantic unless
it's prefixed with __extension__. We could fix that, but since it's
not needed, let's just remove it. We haven't noticed this because
almost everyone uses the SHA1DC code instead, but it still shows up for
some people.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* pw/single-key-interactive:
add -p: disable stdin buffering when interactive.singlekey is set
terminal: set VMIN and VTIME in non-canonical mode
terminal: pop signal handler when terminal is restored
terminal: always reset terminal when reading without echo
131b94a10a ("test-lib.sh: Use GLIBC_TUNABLES instead of MALLOC_CHECK_ on
glibc >= 2.34", 2022-03-04) introduced "local" variables without
declaring them as such. This conflicts with their use in some tests (at
least when running them with dash), leading to test failures in:
t0006-date.sh
t2002-checkout-cache-u.sh
t3430-rebase-merges.sh
t4138-apply-ws-expansion.sh
t4124-apply-ws-rule.sh
Declare those variables as local to let the tests pass again.
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Updates to how command line options to "git help" are handled.
* ab/help-fixes:
help: don't print "\n" before single-section output
help: add --no-[external-commands|aliases] for use with --all
help: error if [-a|-g|-c] and [-i|-m|-w] are combined
help: correct usage & behavior of "git help --all"
help: note the option name on option incompatibility
help.c: split up list_all_cmds_help() function
help tests: test "git" and "git help [-a|-g] spacing
help.c: use puts() instead of printf{,_ln}() for consistency
help doc: add missing "]" to "[-a|--all]"
Remove the escape hatch we added when we introduced the weather
balloon to use variadic macros unconditionally, to make it official
that we now have a hard dependency on the feature.
* ab/c99-variadic-macros:
C99: remove hardcoded-out !HAVE_VARIADIC_MACROS code
git-compat-util.h: clarify GCC v.s. C99-specific in comment
General clean-up in reftable implementation, including
clarification of the API documentation, tightening the code to
honor documented length limit, etc.
* hn/reftable-no-empty-keys:
reftable: rename writer_stats to reftable_writer_stats
reftable: add test for length of disambiguating prefix
reftable: ensure that obj_id_len is >= 2 on writing
reftable: avoid writing empty keys at the block layer
reftable: add a test that verifies that writing empty keys fails
reftable: reject 0 object_id_len
Documentation: object_id_len goes up to 31
"git cat-file" learns "--batch-command" mode, which is a more
flexible interface than the existing "--batch" or "--batch-check"
modes, to allow different kinds of inquiries made.
* jc/cat-file-batch-commands:
cat-file: add --batch-command mode
cat-file: add remove_timestamp helper
cat-file: introduce batch_mode enum to replace print_contents
cat-file: rename cmdmode to transform_mode
Improve failure case behaviour of xdiff library when memory
allocation fails.
* pw/xdiff-alloc-fail:
xdiff: handle allocation failure when merging
xdiff: refactor a function
xdiff: handle allocation failure in patience diff
xdiff: fix a memory leak
In sparse-checkouts, files mis-marked as missing from the working tree
could lead to later problems. Such files were hard to discover, and
harder to correct. Automatically detecting and correcting the marking
of such files has been added to avoid these problems.
* en/present-despite-skipped:
repo_read_index: add config to expect files outside sparse patterns
Accelerate clear_skip_worktree_from_present_files() by caching
Update documentation related to sparsity and the skip-worktree bit
repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
unpack-trees: fix accidental loss of user changes
t1011: add testcase demonstrating accidental loss of user modifications
Users can create a new repository using 'git clone <bundle-file>'. The
new "@filter" capability for bundles means that we can generate a bundle
that does not contain all reachable objects, even if the header has no
negative commit OIDs.
It is feasible to think that we could make a filtered bundle work with
the command
git clone --filter=$filter --bare <bundle-file>
or possibly replacing --bare with --no-checkout. However, this requires
having some repository-global config that specifies the specified object
filter and notifies Git about the existence of promisor pack-files.
Without a remote, that is currently impossible.
As a stop-gap, parse the bundle header during 'git clone' and die() with
a helpful error message instead of the current behavior of failing due
to "missing objects".
Most of the existing logic for handling bundle clones actually happens
in fetch-pack.c, but that logic is the same as if the user specified
'git fetch <bundle>', so we want to avoid failing to fetch a filtered
bundle when in an existing repository that has the proper config set up
for at least one remote.
Carefully comment around the test that this is not the desired long-term
behavior of 'git clone' in this case, but instead that we need to do
more work before that is possible.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to have a valid pack-file after unbundling a bundle that has
the 'filter' capability, we need to generate a .promisor file. The
bundle does not promise _where_ the objects can be found, but we can
expect that these bundles will be unbundled in repositories with
appropriate promisor remotes that can find those missing objects.
Use the 'git index-pack --promisor=<message>' option to create this
.promisor file. Add "from-bundle" as the message to help anyone diagnose
issues with these promisor packs.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous change allowed Git to parse bundles with the 'filter'
capability. Now, teach Git to create bundles with this option.
Some rearranging of code is required to get the option parsing in the
correct spot. There are now two reasons why we might need capabilities
(a new hash algorithm or an object filter) so that is pulled out into a
place where we can check both at the same time.
The --filter option is parsed as part of setup_revisions(), but it
expected the --objects flag, too. That flag is somewhat implied by 'git
bundle' because it creates a pack-file walking objects, but there is
also a walk that walks the revision range expecting only commits. Make
this parsing work by setting 'revs.tree_objects' and 'revs.blob_objects'
before the call to setup_revisions().
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that 'struct rev_info' has a 'filter' member and most consumers of
object filtering are using that member instead of an external struct,
move the parsing of the '--filter' option out of builtin/rev-list.c and
into revision.c.
This use within handle_revision_pseudo_opt() allows us to find the
option within setup_revisions() if the arguments are passed directly. In
the case of a command such as 'git blame', the arguments are first
scanned and checked with parse_revision_opt(), which complains about the
option, so 'git blame --filter=blob:none <file>' does not become valid
with this change.
Some commands, such as 'git diff' gain this option without having it
make an effect. And 'git diff --objects' was already possible, but does
not actually make sense in that builtin.
The key addition that is coming is 'git bundle create --filter=<X>' so
we can create bundles containing promisor packs. More work is required
to make them fully functional, but that will follow.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The v3 bundle format has capabilities, allowing newer versions of Git to
create bundles with newer features. Older versions that do not
understand these new capabilities will fail with a helpful warning.
Create a new capability allowing Git to understand that the contained
pack-file is filtered according to some object filter. Typically, this
filter will be "blob:none" for a blobless partial clone.
This change teaches Git to parse this capability, place its value in the
bundle header, and demonstrate this understanding by adding a message to
'git bundle verify'.
Since we will use gently_parse_list_objects_filter() outside of
list-objects-filter-options.c, make it an external method and move its
API documentation to before its declaration.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a caller to traverse_commit_list() specifies the options for the
--objects flag but does not specify a show_object function pointer, the
result is a segfault. This is currently visible by running 'git bundle
create --objects HEAD'.
We could fix this problem by supplying a no-op callback in
builtin/bundle.c, but that only solves the problem for one builtin,
leaving this segfault open for other callers.
Replace all callers of the show_commit and show_object function pointers
in list-objects.c to call helper functions show_commit() and
show_object() which check that the given context has non-NULL functions
before passing the necessary data. One extra benefit is that it reduces
duplication due to passing ctx->show_data to every caller.
Test that this segfault no longer occurs for 'git bundle'.
Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous change consolidated traverse_commit_list() and
traverse_commit_list_filtered(). This allows us to simplify the
recommended usage in MyFirstObjectWalk.txt to use this new set of
values.
While here, add some clarification on the difference between the two
methods.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that all consumers of traverse_commit_list_filtered() populate the
'filter' member of 'struct rev_info', we can drop that parameter from
the method prototype to simplify things. In addition, the only thing
different now between traverse_commit_list_filtered() and
traverse_commit_list() is the presence of the 'omitted' parameter, which
is only non-NULL for one caller. We can consolidate these two methods by
having one call the other and use the simpler form everywhere the
'omitted' parameter would be NULL.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that all consumers of prepare_bitmap_walk() have populated the
'filter' member of 'struct rev_info', we can drop that extra parameter
from the method and access it directly from the 'struct rev_info'.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In builtin/pack-objects.c, we use a 'filter_options' global to populate
the --filter=<X> argument. The previous change created a pointer to a
filter option in 'struct rev_info', so we can use that pointer here as a
start to simplifying some usage of object filters.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Placing a 'struct list_objects_filter_options' within 'struct rev_info'
will assist making some bookkeeping around object filters in the future.
For now, let's use this new member to remove a static global instance of
the struct from builtin/rev-list.c.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we add more embedded members with type 'struct
list_objects_filter_options', it will be important to easily perform a
deep copy across multiple such structs. Create
list_objects_filter_copy() to satisfy this need.
This method is recursive to match the recursive nature of the struct.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --promisor option of 'git index-pack' was created in 88e2f9e
(introduce fetch-object: fetch one promisor object, 2017-12-05) but was
untested. It is currently unused within the Git codebase, but that will
change in an upcoming change to 'git bundle unbundle' when there is a
filter capability.
For now, add documentation about the option and add a test to ensure it
is working as expected.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before this change, gitweb would generate pages which included:
<meta http-equiv="content-type" content="application/xhtml+xml; charset=utf-8"/>
When a meta's http-equiv equals "content-type", the http-equiv is said
to be in the "Encoding declaration state". According to the HTML
Standard,
The Encoding declaration state may be used in HTML documents,
but elements with an http-equiv attribute in that state must not
be used in XML documents.
Source: <https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-http-equiv-content-type>
This change removes that meta element since gitweb always generates XML
documents.
Signed-off-by: Jason Yundt <jason@jasonyundt.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_fetch_task() gets a fetch task by iterating the index; a future
commit will introduce a similar function, get_fetch_task_from_changed(),
that gets a fetch task from the list of changed submodules. Both
functions are similar in that they need to:
* create a fetch task
* initialize the submodule repo for the fetch task
* determine the default recursion mode
Move all of this logic into fetch_task_create() so that it is no longer
split between fetch_task_create() and get_fetch_task(). This will make
it easier to share code with get_fetch_task_from_changed().
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_next_submodule() configures the parallel submodule fetch by
performing two functions:
* iterate the index to find submodules
* configure the child processes to fetch the submodules found in the
previous step
Extract the index iterating code into an iterator function,
get_fetch_task(), so that get_next_submodule() is agnostic of how
to find submodules. This prepares for a subsequent commit will teach the
fetch machinery to also iterate through the list of changed
submodules (in addition to the index).
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit prepares for a future commit that will teach `git fetch
--recurse-submodules` how to fetch submodules that are present in
<gitdir>/modules, but are not populated. To do this, we need to store
more information about the changed submodule so that we can read the
submodule configuration from the superproject commit instead of the
filesystem.
Refactor the changed submodules string_list.util to hold a struct
instead of an oid_array. This struct only holds the new_commits
oid_array for now; more information will be added later.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When collecting the string_list of changed submodule names, the new
submodules commits are stored in the string_list_item.util as an
oid_array. A subsequent commit will replace the oid_array with a struct
that has more information.
Prepare for this change by inlining submodule_commits() (which inserts
into the string_list and initializes the string_list_item.util) into its
only caller so that the code is easier to refactor later.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future commit will teach "fetch --recurse-submodules" to fetch
unpopulated submodules. To prepare for this, teach the necessary static
functions how to read submodules from superproject commits using a
"treeish_name" argument (instead of always reading from the index and
filesystem) but do not actually change where submodules are read from.
Submodules will be read from commits when we fetch unpopulated
submodules.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few tests in t5526 use this pattern as part of their setup:
1. Create new commits in the upstream submodules (using
add_upstream_commit()).
2. In the upstream superprojects, add the new submodule commits from the
previous step.
A future commit will add more tests with this pattern, so reduce the
verbosity of present and future tests by introducing a test helper that
creates superproject commits. Since we now have two helpers that add
upstream commits, rename add_upstream_commit() to
add_submodule_commits().
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit message, we noted that not all of the "git fetch"
stderr is relevant to the tests. Most of the test setup lines are
dedicated to these details of the stderr:
1. which repos (super/sub/deep) are involved in the fetch
2. the head of the remote-tracking branch before the fetch (i.e. $head1)
3. the head of the remote-tracking branch after the fetch (i.e. $head2)
1. and 3. are relevant because they tell us that the expected commit is
fetched by the expected repo, but 2. is completely irrelevant.
Stop asserting on $head1 by replacing it with a dummy value in the
actual and expected output. Do this by introducing test
helpers (write_expected_*()) that make it easier to construct the
expected output, and use sed to munge the actual output.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests in t/t5526-fetch-submodules.sh are unnecessarily noisy:
* The tests have extra logic in order to reproduce the expected stderr
literally, but not all of these details (e.g. the head of the
remote-tracking branch before the fetch) are relevant to the test.
* The expect.err file is constructed by the add_upstream_commit() helper
as input into test_cmp, but most tests fetch a different combination
of repos from expect.err. This results in noisy tests that modify
parts of that expect.err to generate the expected output.
To address both of these issues, introduce a verify_fetch_result()
helper to t/t5526-fetch-submodules.sh that asserts on the output of "git
fetch --recurse-submodules" and handles the ordering of expect.err.
As a result, the tests no longer construct expect.err manually. Tests
still consider the old head of the remote-tracking branch ("$head1"),
but that will be fixed in a later commit.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a fragile test pattern introduced in 6534703059 (Topo-sort
before --simplify-merges, 2008-08-03) to check the exit code of both
"git name-rev" and "git log".
This test as a whole would fail under SANITIZE=leak, but we'd pass
several "failing" tests due to hiding these exit codes before we'd
spot git dying with abort(). Now we'll instead spot all of the
failures.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a fragile pattern introduced in 696acf45f9 (checkout:
implement "-" abbreviation, add docs and tests, 2009-01-17) to check
the exit code of both "git symbolic-ref" and "git rev-parse".
Without this change this test will become flaky e.g. under
SANITIZE=leak if some (but not all) memory leaks revealed by these
commands are fixed.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix code added in 969c877506 (git apply --directory broken for new
files, 2008-10-12) so that it doesn't invoke "git ls-files" on the
left-hand-side of a pipe, instead let's use an intermediate file.
Since we're doing that we can also drop the sub-shell that was here to
group the two.
There are a lot of these sorts of patterns in the test suite, and
there's no particular reason to fix this one other than in a preceding
commit all similar patterns except this one were fixed in
"t/t4128-apply-root.sh", so let's fix this one straggler as well.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend a prerequisite check added in 5c1ebcca4d (grep/icase: avoid
kwsset on literal non-ascii strings, 2016-06-25) to do invoke
'test-tool regex' in such a way that we'll notice if it dies under
SANITIZE=leak due to having a memory leak, as opposed to us not having
the "ICASE" support we're checking for.
Because we weren't making a distinction between the two I'd marked
these tests as passing under SANITIZE=leak in 03d85e21951 (leak tests:
mark remaining leak-free tests as such, 2021-12-17).
Doing this is tricky. Ideally "test_lazy_prereq" would materialize as
a "real" test that we could check the exit code of with the same
signal matching that "test_must_fail" does.
However lazy prerequisites aren't real tests, and are instead lazily
materialized in the guts of "test_have_prereq" when we've already
started another test.
We could detect the abort() (or similar) there and pass that exit code
down, and fail the test that caused the prerequisites to be
materialized.
But that would require extensive changes to test-lib.sh and
test-lib-functions.sh. Let's instead simply check if the exit code of
"test-tool regex" is zero, and if so set the prerequisites. If it's
non-zero let's run it again with "test_must_fail". We'll thus make a
distinction between "bad" non-zero (segv etc) and "good" (exit 1 etc.).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a couple of uses of "test_expect_failure" to use a
"test_expect_success" to positively assert the current behavior, and
replace the intent of "test_expect_failure" with a "TODO" comment int
the description.
As noted in [1] the "test_expect_failure" feature is overly eager to
accept any failure as OK, and thus by design hides segfaults, abort()
etc. Because of that I didn't notice in dd9cede913 (leak tests: mark
some rev-list tests as passing with SANITIZE=leak, 2021-10-31) that
this test leaks memory under SANITIZE=leak.
I have some larger local changes to add a better
"test_expect_failure", which would work just like
"test_expect_success", but would allow us say "test_todo" here (and
"success" would emit a "not ok [...] # TODO", not "ok [...]".
So even though using "test_expect_success" here comes with its own
problems[2], let's use it as a narrow change to fix the problem at
hand here and stop conflating the current "success" with actual
SANITIZE=leak failures.
1. https://lore.kernel.org/git/87tuhmk19c.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/xmqq4k9kj15p.fsf@gitster.g/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a fragile pattern introduced in 2b459b483c (diff: make sure
work tree side is shown as 0{40} when different, 2008-03-02) to check
the exit code of "git rev-list", while we're at it let's get rid of
the needless sub-shell for invoking it in favor of the "-C" option.
Because of this I'd marked these tests as passing under SANITIZE=leak
in 16d4bd4f14 (leak tests: mark some diff tests as passing with
SANITIZE=leak, 2021-10-31), let's remove the
"TEST_PASSES_SANITIZE_LEAK=true" annotation as they no longer do.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a fragile test pattern that's been with us ever since these
tests were introduced in [1], [2] and [3] to properly return the exit
code of the failing command on failure.
Because of this I'd marked this test as passing under SANITIZE=leak in
[4] and [5]. We need to remove those annotations as these tests will
no longer pass.
1. 9081a421a6 (checkout: fix "branch info" memory leaks, 2021-11-16)
2. 0057c0917d (Add selftests verifying that we can parse notes trees
with various fanouts, 2009-10-09)
3. 048cdd4665 (t3305: Verify that adding many notes with git-notes
triggers increased fanout, 2010-02-13)
4. ca08972495 (leak tests: mark some notes tests as passing with
SANITIZE=leak, 2021-10-31)
5. 9081a421a6 (checkout: fix "branch info" memory leaks, 2021-11-16)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend a test added in 9c46c054ae (rev-parse: tests git rev-parse
--verify master@{n}, for various n, 2010-08-24) so that we'll stop
ignoring the exit code of "git reflog" by having it on the
left-hand-side of a pipe.
Because of this I'd marked this test as passing under SANITIZE=leak in
f442c94638 (leak tests: mark some rev-parse tests as passing with
SANITIZE=leak, 2021-10-31). As all of it except this specific test
will now pass, let's skip it under the !SANITIZE_LEAK prerequisite.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As in the preceding commit change a similar fragile test pattern
introduced in b798671fa9 (merge-recursive: do not rudely die on
binary merge, 2007-08-14) to use a "test_must_fail" instead.
Before this we wouldn't distinguish normal "git merge" failures from
segfaults or abort(). Unlike the preceding commit we didn't end up
hiding any SANITIZE=leak failures in this case, but let's
correspondingly change these anyway.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a fragile test pattern introduced in 6b763c424e (git-apply: do
not read past the end of buffer, 2007-09-05). Before this we wouldn't
distinguish normal "git apply" failures from segfaults or abort().
I'd previously marked this test as passing under SANITIZE=leak in
f54f48fc07 (leak tests: mark some apply tests as passing with
SANITIZE=leak, 2021-10-31). Let's remove that annotation as this test
will no longer pass.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a test pattern that originated in f1af60bdba (Support 'diff=pgm'
attribute, 2007-04-22) so that we'll stop using "git diff" on the
left-hand-side of a pipe, and thus ignoring its exit code.
Rather than use intermediate files let's rewrite these tests to a much
simpler but more exhaustive "test_tmp" where we'll ignore certain
fields in the output.
Note that this is not a faithful conversion of the previous
"read/test" in some cases, as we were ignoring more fields there than
we strictly needed to. Now we'll "test_cmp" everything we can, and
only ignore the likes of paths to $TEMPDIR etc.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a test pattern that originated in f1af60bdba (Support 'diff=pgm'
attribute, 2007-04-22) so that we'll stop using "git diff" on the
left-hand-side of a pipe, and thus ignoring its exit code.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix an issue with the exit code of "diff-files" being ignored, which
has been ignored ever since these tests were originally added in
c859600954 ([PATCH] read-tree: save more user hassles during
fast-forward., 2005-06-07).
Since the exit code was ignored we'd hide errors here under
SANITIZE=leak, which resulted in me mistakenly marking these tests as
passing under SANITIZE=leak in e5a917fcf4 (unpack-trees: don't leak
memory in verify_clean_subdirectory(), 2021-10-07) and
4ea08416b8 (leak tests: mark a read-tree test as passing
SANITIZE=leak, 2021-10-31).
As it would be non-trivial to fix these tests (the leak is in
revision.c) let's un-mark them as passing under SANITIZE=leak in
addition to fixing the issue of ignoring the exit code.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the test_stdout_line_count helper added in
cdff1bb5a3 (test-lib-functions: introduce test_stdout_line_count,
2021-07-04) so that we'll spot if git itself dies, segfaults etc in
these expressions.
Because we didn't distinguish these failure conditions before I'd
mistakenly marked these tests as passing under SANITIZE=leak in
dd9cede913 (leak tests: mark some rev-list tests as passing with
SANITIZE=leak, 2021-10-31).
While we're at it let's re-indent these lines to match our usual
style, as we're having to change all of them anyway.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change some of the patterns in the test suite where we were hiding the
exit code from "git" by invoking it in a sub-shell within a "test"
expression to use temporary files and test_cmp instead.
These are not all the occurrences of this anti-pattern, but these in
particular hid issues where LSAN was dying, and I'd thus marked these
tests as passing under the linux-leaks CI job in past commits with
"TEST_PASSES_SANITIZE_LEAK=true". Let's deal with that by either
removing that marking, or skipping specific tests under
!SANITIZE_LEAK.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a Time-of-check to time-of-use (TOCTOU) race in code added in
680ee550d7 (commit: skip discarding the index if there is no
pre-commit hook, 2017-08-14).
This obscure race condition can occur if we e.g. ran the "pre-commit"
hook and it modified the index, but hook_exists() returns false later
on (e.g., because the hook itself went away, the directory became
unreadable, etc.). Then we won't call discard_cache() when we should
have.
The race condition itself probably doesn't matter, and users would
have been unlikely to run into it in practice. This problem has been
noted on-list when 680ee550d7 was discussed[1], but had not been
fixed.
This change is mainly intended to improve the readability of the code
involved, and to make reasoning about it more straightforward. It
wasn't as obvious what we were trying to do here, but by having an
"invoked_hook" it's clearer that e.g. our discard_cache() is happening
because of the earlier hook execution.
Let's also change this for the push-to-checkout hook. Now instead of
checking if the hook exists and either doing a push to checkout or a
push to deploy we'll always attempt a push to checkout. If the hook
doesn't exist we'll fall back on push to deploy. The same behavior as
before, without the TOCTOU race. See 0855331941 (receive-pack:
support push-to-checkout hook, 2014-12-01) for the introduction of the
previous behavior.
This leaves uses of hook_exists() in two places that matter. The
"reference-transaction" check in refs.c, see 6754159767 (refs:
implement reference transaction hook, 2020-06-19), and the
"prepare-commit-msg" hook, see 66618a50f9 (sequencer: run
'prepare-commit-msg' hook, 2018-01-24).
In both of those cases we're saving ourselves CPU time by not
preparing data for the hook that we'll then do nothing with if we
don't have the hook. So using this "invoked_hook" pattern doesn't make
sense in those cases.
The "reference-transaction" and "prepare-commit-msg" hook also aren't
racy. In those cases we'll skip the hook runs if we race with a new
hook being added, whereas in the TOCTOU races being fixed here we were
incorrectly skipping the required post-hook logic.
1. https://lore.kernel.org/git/20170810191613.kpmhzg4seyxy3cpq@sigill.intra.peff.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a minor bug introduced in bc40ce4de6 (merge: --no-verify to
bypass pre-merge-commit hook, 2019-08-07), when that change made the
--no-verify option bypass the "pre-merge-commit" hook it didn't update
the corresponding find_hook() (later hook_exists()) condition.
As can be seen in the preceding commit in 6098817fd7 (git-merge:
honor pre-merge-commit hook, 2019-08-07) the two should go hand in
hand. There's no point in invoking discard_cache() here if the hook
couldn't have possibly updated the index.
It's buggy that we use "hook_exist()" here, and as discussed in the
subsequent commit it's subject to obscure race conditions that we're
about to fix, but for now this change is a strict improvement that
retains any caveats to do with the use of "hooks_exist()" as-is.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "nr" and "alloc" members of "struct string_list" to use
"size_t" instead of "nr". On some platforms the size of an "unsigned
int" will be smaller than a "size_t", e.g. a 32 bit unsigned v.s. 64
bit unsigned. As "struct string_list" is a generic API we use in a lot
of places this might cause overflows.
As one example: code in "refs.c" keeps track of the number of refs
with a "size_t", and auxiliary code in builtin/remote.c in
get_ref_states() appends those to a "struct string_list".
While we're at it split the "nr" and "alloc" in string-list.h across
two lines, which is the case for most such struct member
declarations (e.g. in "strbuf.h" and "strvec.h").
Changing e.g. "int i" to "size_t i" in run_and_feed_hook() isn't
strictly necessary, and there are a lot more cases where we'll use a
local "int", "unsigned int" etc. variable derived from the "nr" in the
"struct string_list". But in that case as well as
add_wrapped_shortlog_msg() in builtin/shortlog.c we need to adjust the
printf format referring to "nr" anyway, so let's also change the other
variables referring to it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a few stray users of the inline gettext.h Q_() function to stop
casting its "n" argument, the vast majority of the users of that
wrapper API use the implicit cast to "unsigned long".
The ngettext() function (which Q_() resolves to) takes an "unsigned
long int", and so does our Q_() wrapper for it, see 0c9ea33b90 (i18n:
add stub Q_() wrapper for ngettext, 2011-03-09). The function isn't
ours, but provided by e.g. GNU libintl.
This amends code added in added in 7171a0b0cf (index-pack: correct
"len" type in unpack_data(), 2016-07-13). The cast it added for the
printf format to die() was needed, but not the cast to Q_().
Likewise the casts in strbuf.c added in 8f354a1fae (l10n: localizable
upload progress messages, 2019-07-02) and for
builtin/merge-recursive.c in ccf7813139 (i18n: merge-recursive: mark
error messages for translation, 2016-09-15) weren't needed.
In the latter case the cast was copy/pasted from the argument to
warning() itself, added in b74d779bd9 (MinGW: Fix compiler warning in
merge-recursive, 2009-05-23). The cast for warning() is needed, but
not the one for ngettext()'s "n" argument.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Generation Data (GDAT) and Generation Data Overflow (GDOV) chunks
store corrected commit date offsets, used for generation number v2.
Recent changes have demonstrated that previous versions of Git were
incorrectly parsing data from these chunks, but might have also been
writing them incorrectly.
I asserted [1] that the previous fixes were sufficient because the known
reasons for incorrectly writing generation number v2 data relied on
parsing the information incorrectly out of a commit-graph file, but the
previous versions of Git were not reading the generation number v2 data.
However, Patrick demonstrated [2] a case where in split commit-graphs
across an alternate boundary (and possibly some other special
conditions) it was possible to have a commit-graph that was generated by
a previous version of Git have incorrect generation number v2 data which
results in errors like the following:
commit-graph generation for commit <oid> is 1623273624 < 1623273710
[1] https://lore.kernel.org/git/f50e74f0-9ffa-f4f2-4663-269801495ed3@github.com/
[2] https://lore.kernel.org/git/Yh93vOkt2DkrGPh2@ncase/
Clearly, there is something else going on. The situation is not
completely understood, but the errors do not reproduce if the
commit-graphs are all generated by a Git version including these recent
fixes.
If we cannot trust the existing data in the GDAT and GDOV chunks, then
we can alter the format to change the chunk IDs for these chunks. This
causes the new version of Git to silently ignore the older chunks (and
disabling generation number v2 in the process) while writing new
commit-graph files with correct data in the GDA2 and GDO2 chunks.
Update commit-graph-format.txt including a historical note about these
deprecated chunks.
Reported-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many output modes of "ls-files" do not work with its
"--recurse-submodules" option, but the "-s" mode has been taught to
work with it.
* jt/ls-files-stage-recurse:
ls-files: support --recurse-submodules --stage
"git checkout -b branch/with/multi/level/name && git stash" only
recorded the last level component of the branch name, which has
been corrected.
* gc/stash-on-branch-with-multi-level-name:
stash: strip "refs/heads/" with skip_prefix
The error message given by "git switch HEAD~4" has been clarified
to suggest the "--detach" option that is required.
* ah/advice-switch-requires-detach-to-detach:
switch: mention the --detach option when dying due to lack of a branch
Use designated initializers we started using in mid 2017 in more
parts of the codebase that are relatively quiescent.
* ab/c99-designated-initializers:
fast-import.c: use designated initializers for "partial" struct assignments
refspec.c: use designated initializers for "struct refspec_item"
convert.c: use designated initializers for "struct stream_filter*"
userdiff.c: use designated initializers for "struct userdiff_driver"
archive-*.c: use designated initializers for "struct archiver"
object-file: use designated initializers for "struct git_hash_algo"
trace2: use designated initializers for "struct tr2_dst"
trace2: use designated initializers for "struct tr2_tgt"
imap-send.c: use designated initializers for "struct imap_server_conf"
When "index-pack" dies due to incoming data exceeding the maximum
allowed input size, include the value of the limit in the error
message.
* mc/index-pack-report-max-size:
index-pack: clarify the breached limit
Tighten the language around "working tree" and "worktree" in the
docs.
* ds/worktree-docs:
worktree: use 'worktree' over 'working tree'
worktree: use 'worktree' over 'working tree'
worktree: use 'worktree' over 'working tree'
worktree: use 'worktree' over 'working tree'
worktree: use 'worktree' over 'working tree'
worktree: use 'worktree' over 'working tree'
worktree: use 'worktree' over 'working tree'
worktree: extract checkout_worktree()
worktree: extract copy_sparse_checkout()
worktree: extract copy_filtered_worktree_config()
worktree: combine two translatable messages
A not-so-common mistake is to write a script to feed "git bisect
run" without making it executable, in which case all tests will
exit with 126 or 127 error codes, even on revisions that are marked
as good. Try to recognize this situation and stop iteration early.
* rs/bisect-executable-not-found:
bisect--helper: double-check run command on exit code 126 and 127
bisect: document run behavior with exit codes 126 and 127
bisect--helper: release strbuf and strvec on run error
bisect--helper: report actual bisect_state() argument on error
Further polishing of "git sparse-checkout".
* en/sparse-checkout-fixes:
sparse-checkout: reject arguments in cone-mode that look like patterns
sparse-checkout: error or warn when given individual files
sparse-checkout: pay attention to prefix for {set, add}
sparse-checkout: correctly set non-cone mode when expected
sparse-checkout: correct reapply's handling of options
Test modernization.
* cg/t3903-modernize:
tests: make the code more readable
tests: allow testing if a path is truly a file or a directory
t/t3903-stash.sh: replace test [-d|-f] with test_path_is_*
"git submodule update --filter" also requires the "--init" option. Teach
update-clone to do this usage check in C and remove the check from
git-submodule.sh.
In addition, change update-clone's usage string so that it teaches users
about "git submodule update" instead of "git submodule--helper
update-clone" (the string is copied from git-submodule.sh). This should
be more helpful to users since they don't invoke update-clone directly.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test the "--filter" option to make sure we don't break anything while
refactoring "git submodule update".
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the logic of "git submodule--helper ensure-core-worktree" into
run-update-procedure, and since this makes the ensure-core-worktree
command obsolete, remove it.
As a result, the order of two operations in git-submodule.sh is
reversed: 'set the value of core.worktree' now happens after the call to
"git submodule--helper relative-path". This is safe - "relative-path"
does not depend on the value of core.worktree.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach "git submodule--helper update-clone" the --init flag and remove
the corresponding shell code.
When the `--init` flag is passed to the subcommand, we do not spawn a
new subprocess and call `submodule--helper init` on the submodule paths,
because the Git machinery is not able to pick up the configuration
changes introduced by that init call. So we instead run the
`init_submodule_cb()` callback over each submodule in the same process.
[1] https://lore.kernel.org/git/CAP8UFD0NCQ5w_3GtT_xHr35i7h8BuLX4UcHNY6VHPGREmDVObA@mail.gmail.com/
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We allow callers of the `init_submodule()` function to optionally
override the superprefix from the environment.
We need to enable this option because in our conversion of the update
command that will follow, the '--init' option will be handled through
this API. We will need to change the superprefix at that time to ensure
the display paths show correctly in the output messages.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We create a function called `do_get_submodule_displaypath()` that
generates the display path required by several submodule functions, and
takes a custom superprefix parameter, instead of reading it from the
environment.
We then redefine the existing `get_submodule_displaypath()` function
as a call to this new function, where the superprefix is obtained from
the environment.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach run-update-procedure to handle --remote instead of parsing
--remote in git-submodule.sh. As a result, "git submodule--helper
[print-default-remote|remote-branch]" have no more callers, so remove
them.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Do away with the indirection of local variables added in
c51f8f94e5 (submodule--helper: run update procedures from C,
2021-08-24).
These were only needed because in C you can't get a pointer to a
single bit, so we were using intermediate variables instead.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`get_default_remote()` retrieves the name of a remote by resolving the
refs from of the current repository's ref store.
Thus in order to use it for retrieving the remote name of a submodule,
we have to start a new subprocess which runs from the submodule
directory.
Let's instead introduce a function called `repo_get_default_remote()`
which takes any repository object and retrieves the remote accordingly.
`get_default_remote()` is then defined as a call to
`repo_get_default_remote()` with 'the_repository' passed to it.
Now that we have `repo_get_default_remote()`, we no longer have to start
a subprocess that called `submodule--helper get-default-remote` from
within the submodule directory.
So let's make a function called `get_default_remote_submodule()` which
takes a submodule path, and returns the default remote for that
submodule, all within the same process.
We can now use this function to save an unnecessary subprocess spawn in
`sync_submodule()`, and also in a subsequent patch, which will require
this functionality.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Helped-by: Glen Choo <chooglen@google.com>
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach run-update-procedure to determine the oid of the submodule's HEAD
instead of doing it in git-submodule.sh.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a function, update_submodule2(), that will implement the
functionality of run-update-procedure and its surrounding shell code in
submodule.sh. This name is temporary; it will replace update_submodule()
when the sh to C conversion is complete.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is dead code - it has not been used since c51f8f94e5
(submodule--helper: run update procedures from C, 2021-08-24).
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend some submodule tests to test for the failure output of "git
submodule [update|init]". The lack of such tests hid a regression in
an earlier version of a subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct path_cache" added in 102de880d2 (path.c: migrate global
git_path_* to take a repository argument, 2018-05-17) is only used
directly by code in repository.[ch] (but populated in path.[ch]).
Let's move this code to repository.[ch], and stop leaking this memory
when we run repo_clear(). To avoid the cast change it from a "const
char *" to a "char *".
This also removes the "PATH_CACHE_INIT" macro, which has never been
used for anything. For the "struct repository" we already make a hard
assumption that it (and "the_repository") can be identically
initialized by making it a "static" variable, so making use of a
"PATH_CACHE_INIT" somewhere would have been confusing.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend code added in d9c66f0b5b (range-diff: first rudimentary
implementation, 2018-08-13) to use a "goto cleanup" pattern. This
makes for less code, and frees memory that we'd previously leak.
The reason for changing free(util) to FREE_AND_NULL(util) is because
at the end of the function we append the contents of "util" to a
"struct string_list" if it's non-NULL.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a public release_patch() version of the private free_patch()
function added in 13b5af22f3 (apply: move libified code from
builtin/apply.c to apply.{c,h}, 2016-04-22). Unlike the existing
function this one doesn't free() the "struct patch" itself, so we can
use it for variables on the stack.
Use it in range-diff.c to fix a memory leak in common range-diff
invocations, e.g.:
git -P range-diff origin/master origin/next origin/seen
Would emit several errors when compiled with SANITIZE=leak, but now
runs cleanly.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in code added in 6c622f9f0b (commit-graph: write
commit-graph chains, 2019-06-18). We needed to free the "lock_name" if
we encounter errors, and the "graph_name" after we'd run unlink() on
it.
For the case of write_commit_graph_file() refactoring the code to free
the "lock_name" after we were done using the "struct lock_file lk"
would have made the control flow more complex. Luckily we can free the
"lock_file" right after the hold_lock_file_for_update() call, if it
makes use of "path" at all it'll have copied its contents to a "struct
strbuf" of its own.
While I'm at it let's fix code added in fb10ca5b54 (sparse-checkout:
write using lockfile, 2019-11-21) in write_patterns_and_update() to
avoid the same complexity that I thought I needed when I wrote the
initial fix for write_commit_graph_file(). We can free the
"sparse_filename" right after calling hold_lock_file_for_update(), we
don't need to wait until we're exiting the function to do so.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug in fill_oids_from_packs(), we should always stop_progress(),
but did not do so if we returned an error here. This also plugs a
memory leak in those cases by releasing the two "struct strbuf"
variables the function uses.
While I'm at it stop hardcoding "-1" here and just use the return
value of error() instead, which happens to be "-1".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When this code was migrated to the string_list API in
d88b14b3fd (commit-graph: use string-list API for input, 2018-06-27)
it was made to use used both STRING_LIST_INIT_NODUP and a
strbuf_detach() pattern.
Those should not be used together if string_list_clear() is expected
to free the memory, instead we need to either use STRING_LIST_INIT_DUP
with a string_list_append_nodup(), or a STRING_LIST_INIT_NODUP and
manually fiddle with the "strdup_strings" member before calling
string_list_clear(). Let's do the former.
Since "strdup_strings = 1" is set now other code might be broken by
relying on "pack_indexes" not to duplicate it strings, but that
doesn't happen. When we pass this down to write_commit_graph() that
code uses the "struct string_list" without modifying it. Let's add a
"const" to the variable to have the compiler enforce that assumption.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in code added in a6226fd772 (submodule--helper:
convert the bulk of cmd_add() to C, 2021-08-10). If "realrepo" isn't a
copy of the "repo" member we should free() it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend the logic added in fddf2ebe38 (transport: teach all vtables to
allow fetch first, 2019-08-21) and save ourselves pointless work in
fetch_refs_from_bundle().
The fetch_refs_from_bundle() caller doesn't care about the "struct
ref *result" return value of get_refs_from_bundle(), and doesn't need
any of the work we were doing in looping over the
"data->header.references" in get_refs_from_bundle().
So this change saves us work, and also fixes a memory leak that we had
when called from fetch_refs_from_bundle(). The other caller of
get_refs_from_bundle() is the "get_refs_list" member we set up for the
"struct transport_vtable bundle_vtable". That caller does care about
the "struct ref *result" return value.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fixing this small memory leak in cmd_bundle_create() gets
"t5607-clone-bundle.sh" closer to passing under SANITIZE=leak.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plug a trivial memory leak in code added in a2d725b7bd (Use an
external program to implement fetching with curl, 2009-08-05).
To do this have the cmd_main() use a "goto cleanup" pattern, and to
return an error of 1 unless we can fall through to the http_cleanup()
at the end.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plug a memory leak in credential_apply_config() by adding and using a
new urlmatch_config_release() function. This just does a
string_list_clear() on the "vars" member.
This finished up work on normalizing the init/free pattern in this
API, started in 73ee449bbf (urlmatch.[ch]: add and use
URLMATCH_CONFIG_INIT, 2021-10-01).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend the freeing logic added in e6e045f803 (diff.c: buffer all
output if asked to, 2017-06-29) to free the containing "buf" in
addition to its members.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in 53eda89b2f (merge-base: teach "git merge-base"
to drive underlying merge_bases_many(), 2008-07-30) by calling free()
on the "struct commit **" list used by "git merge-base".
This gets e.g. "t6010-merge-base.sh" closer to passing under
SANITIZE=leak, it failed 8 tests before when compiled with that
option, and now fails only 5 tests.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix various memory leaks in "git index-pack", due to how tightly
coupled this command is with the revision walking this doesn't make
any new tests pass.
But e.g. this now passes, and had several failures before, i.e. we
still have failures in tests 3, 5 etc., which are being skipped here.
./t5300-pack-object.sh --run=1-2,4,6-27,30-42
It is a bit odd that we'll free "opts.anomaly", since the "opts" is a
"struct pack_idx_option" declared in pack.h. In pack-write.c there's a
reset_pack_idx_option(), but it only wipes the contents, but doesn't
free() anything.
Doing this here in cmd_index_pack() is correct because while the
struct is declared in pack.h, this code in builtin/index-pack.c (in
read_v2_anomalous_offsets()) is what allocates the "opts.anomaly", so
we should also free it here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In glibc >= 2.34 MALLOC_CHECK_ and MALLOC_PERTURB_ environment
variables have been replaced by GLIBC_TUNABLES. Also the new
glibc requires that you preload a library called libc_malloc_debug.so
to get these features.
Using the ordinary glibc system variable detect if this is glibc >= 2.34 and
use GLIBC_TUNABLES and the new library.
This patch was inspired by a Richard W.M. Jones ndbkit patch
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The gpg-agent is one of several processes that newer releases of GnuPG
start automatically. Issue a kill to each of them to ensure they do not
affect separate tests. (Yes, the separate GNUPGHOME should do that
already. If we find that is case, we could drop the --kill entirely.)
In terms of compatibility, the 'all' keyword was added to the --kill &
--reload options in GnuPG 2.1.18. Debian and RHEL are often used as
indicators of how a change might affect older systems we often try to
support.
- Debian Strech (old old stable), which has limited security support
until June 2022, has GnuPG 2.1.18 (or 2.2.x in backports).
- CentOS/RHEL 7, which is supported until June 2024, has GnuPG
2.0.22, which lacks the --kill option, so the change won't have
any impact.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With gpgsm from gnupg-2.3, the changes to the trustlist.txt do not
appear to be picked up without refreshing the gpg-agent. Use the 'all'
keyword to reload all of the gpg components. The scdaemon is started as
a child of gpg-agent, for example.
We used to have a --kill at this spot, but I removed it in 2e285e7803
(t/lib-gpg: drop redundant killing of gpg-agent, 2019-02-07). It seems
like it might be necessary (again) for 2.3.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Checking if signing was successful will now accept '[GNUPG]:
SIG_CREATED' on the beginning of the first or any subsequent line. Not
just explictly the second one anymore.
Gpgsm v2.3 changed its output when listing keys from `fingerprint` to
`sha1/2 fpr`. This leads to the gpgsm tests silently not being executed
because of a failed prerequisite.
Switch to gpg's `--with-colons` output format when evaluating test
prerequisites to make parsing more robust. This also allows us to
combine the existing grep/cut/tr/echo pipe for writing the trustlist.txt
into a single awk expression.
Adjust error message checking in test for v2.3 specific output changes.
Helped-By: Junio C Hamano <gitster@pobox.com>
Helped-By: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in ff37a60c36 (log tests: check if grep_config() is
called by "log"-like cmds, 2022-02-16), a "test_done" command used
during development made it into a submitted patch causing tests 41-136
in t/t4202-log.sh to be skipped.
Reported-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The usage help for --type option of `git config` is missing `type`
in the argument placeholder (`<>`). Add it.
Signed-off-by: Matheus Felipe <matheusfelipeog@protonmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When renaming a remote, Git needs to rename all remote tracking
references to the remote's new name (e.g., renaming
"refs/remotes/old/foo" to "refs/remotes/new/foo" when renaming a remote
from "old" to "new").
This can be somewhat slow when there are many references to rename,
since each rename is done in a separate call to rename_ref() as opposed
to grouping all renames together into the same transaction. It would be
nice to execute all renames as a single transaction, but there is a
snag: the reference transaction backend doesn't support renames during a
transaction (only individually, via rename_ref()).
The reasons there are described in more detail in [1], but the main
problem is that in order to preserve the existing reflog, it must be
moved while holding both locks (i.e., on "oldname" and "newname"), and
the ref transaction code doesn't support inserting arbitrary actions
into the middle of a transaction like that.
As an aside, adding support for this to the ref transaction code is
less straightforward than inserting both a ref_update() and ref_delete()
call into the same transaction. rename_ref()'s special handling to
detect D/F conflicts would need to be rewritten for the transaction code
if we wanted to proactively catch D/F conflicts when renaming a
reference during a transaction. The reftable backend could support this
much more readily because of its lack of D/F conflicts.
Instead of a more complex modification to the ref transaction code,
display a progress meter when running verbosely in order to convince the
user that Git is doing work while renaming a remote.
This is mostly done as-expected, with the minor caveat that we
intentionally count symrefs renames twice, since renaming a symref takes
place over two separate calls (one to delete the old one, and another to
create the new one).
[1]: https://lore.kernel.org/git/572367B4.4050207@alum.mit.edu/
Suggested-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git remote rename' command doesn't currently take any command-line
arguments besides the existing and new name of a remote, and so has no
need to call parse_options().
But the subsequent patch will add a `--[no-]progress` option, in which
case we will need to call parse_options().
Do so now so as to avoid cluttering the following patch with noise, like
adjusting setting `rename.{old,new}_name` to argv[0] and argv[1], since
parse_options handles advancing argv past the name of the sub-command.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the testcase to exercise backfilling of tags for fetches we evoke a
failure of the backfilling mechanism by creating a reference that later
on causes a D/F conflict. Because the assumption was that git-fetch(1)
would notice the D/F conflict early on this conflicting reference was
created via the reference-transaction hook just when we were about to
write the backfilled tag. As it turns out though this is not the case,
and the fetch fails in the same way when we create the conflicting ref
up front.
Simplify the test setup creating the reference up front, which allows us
to get rid of the hook script.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes it consistent with the rest of the documentation.
Signed-off-by: Nihal Jere <nihal@nihaljere.xyz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a template to do the "mkdir -p" of $(@D) (the parent dir of $@)
for us, and use it for the "make lint-docs" targets I added in
8650c6298c (doc lint: make "lint-docs" non-.PHONY, 2021-10-15).
As seen in 4c64fb5aad (Documentation/Makefile: fix lint-docs mkdir
dependency, 2021-10-26) maintaining these manual lists of parent
directory dependencies is fragile, in addition to being obviously
verbose.
I used this pattern at the time because I couldn't find another method
than "order-only" prerequisites to avoid doing a "mkdir -p $(@D)" for
every file being created, which as noted in [1] would be significantly
slower.
But as it turns out we can use this neat trick of only doing a "mkdir
-p" if the $(wildcard) macro tells us the path doesn't exist. A re-run
of a performance test similar to that noted downthread of [1] in [2]
shows that this is faster, in addition to being less verbose and more
reliable (this uses my "git-hyperfine" thin wrapper for "hyperfine"[3]):
$ git -c hyperfine.hook.setup= hyperfine -L rev HEAD~1,HEAD~0 -s 'make -C Documentation lint-docs' -p 'rm -rf Documentation/.build' 'make -C Documentation -j1 lint-docs'
Benchmark 1: make -C Documentation -j1 lint-docs' in 'HEAD~1
Time (mean ± σ): 2.914 s ± 0.062 s [User: 2.449 s, System: 0.489 s]
Range (min … max): 2.834 s … 3.020 s 10 runs
Benchmark 2: make -C Documentation -j1 lint-docs' in 'HEAD~0
Time (mean ± σ): 2.315 s ± 0.062 s [User: 1.950 s, System: 0.386 s]
Range (min … max): 2.229 s … 2.397 s 10 runs
Summary
'make -C Documentation -j1 lint-docs' in 'HEAD~0' ran
1.26 ± 0.04 times faster than 'make -C Documentation -j1 lint-docs' in 'HEAD~1'
So let's use that pattern both for the "lint-docs" target, and a few
miscellaneous other targets.
This method of creating parent directories is explicitly racy in that
we don't know if we're going to say always create a "foo" followed by
a "foo/bar" under parallelism, or skip the "foo" because we created
"foo/bar" first. In this case it doesn't matter for anything except
that we aren't guaranteed to get the same number of rules firing when
running make in parallel.
1. https://lore.kernel.org/git/211028.861r45y3pt.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/211028.86o879vvtp.gmgdl@evledraar.gmail.com/
3. https://gitlab.com/avar/git-hyperfine/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The $(QUIET) variables we define are largely duplicated between our
various Makefiles, let's define them in the new "shared.mak" instead.
Since we're not using the environment to pass these around we don't
need to export the "QUIET_GEN" and "QUIET_BUILT_IN" variables
anymore. The "QUIET_GEN" variable is used in "git-gui/Makefile" and
"gitweb/Makefile", but they've got their own definition for those. The
"QUIET_BUILT_IN" variable is only used in the top-level "Makefile". We
still need to export the "V" variable.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move these variables over to the shared.mak, we'll make use of them in
a subsequent commit.
Note that there's reason for these to be "simply expanded variables",
i.e. to use ":=" assignments instead of lazily expanded "="
assignments. We could use "=", but let's leave this as-is for now for
ease of review.
See 425ca6710b (Makefile: allow combining UBSan with other
sanitizers, 2017-07-15) for the commit that introduced these.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was added in 30248886ce (Makefile: disable default implicit
rules, 2010-01-26), let's move it to the top of "shared.mak" so it'll
apply to all our Makefiles.
This doesn't benefit the main Makefile at all, since it already had
the rule, but since we're including shared.mak in other Makefiles
starts to benefit them. E.g. running the 'man" target is now faster:
$ git -c hyperfine.hook.setup= hyperfine -L rev HEAD~1,HEAD~0 -s 'make -C Documentation man' 'make -C Documentation -j1 man'
Benchmark 1: make -C Documentation -j1 man' in 'HEAD~1
Time (mean ± σ): 121.7 ms ± 8.8 ms [User: 105.8 ms, System: 18.6 ms]
Range (min … max): 112.8 ms … 148.4 ms 26 runs
Benchmark 2: make -C Documentation -j1 man' in 'HEAD~0
Time (mean ± σ): 97.5 ms ± 8.0 ms [User: 80.1 ms, System: 20.1 ms]
Range (min … max): 89.8 ms … 111.8 ms 32 runs
Summary
'make -C Documentation -j1 man' in 'HEAD~0' ran
1.25 ± 0.14 times faster than 'make -C Documentation -j1 man' in 'HEAD~1'
The reason for that can be seen when comparing that run with
"--debug=a". Without this change making a target like "git-status.1"
will cause "make" to consider not only "git-status.txt", but
"git-status.txt.o", as well as numerous other implicit suffixes such
as ".c", ".cc", ".cpp" etc. See [1] for a more detailed before/after
example.
So this is causing us to omit a bunch of work we didn't need to
do. For making "git-status.1" the "--debug=a" output is reduced from
~140k lines to ~6k.
1. https://lore.kernel.org/git/220222.86bkyz875k.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Combine the definitions of $(FIND_SOURCE_FILES) and $(LIB_H) to speed
up the Makefile, as these are the two main expensive $(shell) commands
that we execute unconditionally.
When see what was in $(FOUND_SOURCE_FILES) that wasn't in $(LIB_H) via
the ad-hoc test of:
$(error $(filter-out $(LIB_H),$(filter %.h,$(ALL_SOURCE_FILES))))
$(error $(filter-out $(ALL_SOURCE_FILES),$(filter %.h,$(LIB_H))))
We'll get, respectively:
Makefile:850: *** t/helper/test-tool.h. Stop.
Makefile:850: *** . Stop.
I.e. we only had a discrepancy when it came to
t/helper/test-tool.h. In terms of correctness this was broken before,
but now works:
$ make t/helper/test-tool.hco
HDR t/helper/test-tool.h
This speeds things up a lot:
$ git -c hyperfine.hook.setup= hyperfine -L rev HEAD~1,HEAD~0 -s 'make NO_TCLTK=Y' 'make -j1 NO_TCLTK=Y' --warmup 10 -M 10
Benchmark 1: make -j1 NO_TCLTK=Y' in 'HEAD~1
Time (mean ± σ): 159.9 ms ± 6.8 ms [User: 137.2 ms, System: 28.0 ms]
Range (min … max): 154.6 ms … 175.9 ms 10 runs
Benchmark 2: make -j1 NO_TCLTK=Y' in 'HEAD~0
Time (mean ± σ): 100.0 ms ± 1.3 ms [User: 84.2 ms, System: 20.2 ms]
Range (min … max): 98.8 ms … 102.8 ms 10 runs
Summary
'make -j1 NO_TCLTK=Y' in 'HEAD~0' ran
1.60 ± 0.07 times faster than 'make -j1 NO_TCLTK=Y' in 'HEAD~1'
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Override built-in rules of GNU make that use a wildcard target. This
can speeds things up significantly as we don't need to stat() so many
files. GNU make does that by default to see if it can retrieve their
contents from RCS or SCCS. See [1] for an old mailing list discussion
about how to disable these.
The speed-up may vary. I've seen 1-10% depending on the speed of the
local disk, caches, -jN etc. Running:
strace -f -c -S calls make -j1 NO_TCLTK=Y
Shows that we reduce the number of syscalls we make, mostly in "stat"
calls.
We could also invoke make with "-r" by setting "MAKEFLAGS = -r"
early. Doing so might make us a bit faster still. But doing so is a
much bigger hammer, since it will disable all built-in rules,
some (all?) of which can be seen with:
make -f/dev/null -p | grep -v -e ^# -e ^$
We may have something that relies on them, so let's go for the more
isolated optimization here that gives us most or all of the wins.
1. https://lists.gnu.org/archive/html/help-make/2002-11/msg00063.html
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have various behavior that's shared across our Makefiles, or that
really should be (e.g. via defined templates). Let's create a
top-level "shared.mak" to house those sorts of things, and start by
adding the ".DELETE_ON_ERROR" flag to it.
See my own 7b76d6bf22 (Makefile: add and use the ".DELETE_ON_ERROR"
flag, 2021-06-29) and db10fc6c09 (doc: simplify Makefile using
.DELETE_ON_ERROR, 2021-05-21) for the addition and use of the
".DELETE_ON_ERROR" flag.
I.e. this changes the behavior of existing rules in the altered
Makefiles (except "Makefile" & "Documentation/Makefile"). I'm
confident that this is safe having read the relevant rules in those
Makfiles, and as the GNU make manual notes that it isn't the default
behavior is out of an abundance of backwards compatibility
caution. From edition 0.75 of its manual, covering GNU make 4.3:
[Enabling '.DELETE_ON_ERROR' is] almost always what you want
'make' to do, but it is not historical practice; so for
compatibility, you must explicitly request it.
This doesn't introduce a bug by e.g. having this
".DELETE_ON_ERROR" flag only apply to this new shared.mak, Makefiles
have no such scoping semantics.
It does increase the danger that any Makefile without an explicit "The
default target of this Makefile is..." snippet to define the default
target as "all" could have its default rule changed if our new
shared.mak ever defines a "real" rule. In subsequent commits we'll be
careful not to do that, and such breakage would be obvious e.g. in the
case of "make -C t".
We might want to make that less fragile still (e.g. by using
".DEFAULT_GOAL" as noted in the preceding commit), but for now let's
simply include "shared.mak" without adding that boilerplate to all the
Makefiles that don't have it already. Most of those are already
exposed to that potential caveat e.g. due to including "config.mak*".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the "contrib/scalar/Makefile" be stylistically consistent with
the top-level "Makefile" in first declaring "all" to be the default
rule, followed by including other Makefile snippets.
This adjusts code added in 0a43fb2202 (scalar: create a rudimentary
executable, 2021-12-03), it further ensures that when we add another
"include" file in a subsequent commit that the included file won't be
the one to define our default target.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In an interactive session, "git am" without arguments, or even
worse, "git am --whitespace file", waits silently for the user to
feed the patches from the standard input (presumably by typing or
copy-pasting). Give a feedback message to the user when this
happens, as it is unlikely that the user meant to do so.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that cmd_reflog_delete has been libified an exported it into a new
reflog.c library so we can call it directly from builtin/stash.c. This
not only gives us a performance gain since we don't need to create a
subprocess, but it also allows us to use the ref transactions api in the
future.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently stash shells out to reflog in order to delete refs. In an
effort to reduce how much we shell out to a subprocess, libify the
functionality that stash needs into reflog.c.
Add a reflog_delete function that is pretty much the logic in the while
loop in builtin/reflog.c cmd_reflog_delete(). This is a function that
builtin/reflog.c and builtin/stash.c can both call.
Also move functions needed by reflog_delete and export them.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is missing test coverage to ensure that the resulting reflogs
after a git stash drop has had its old oid rewritten if applicable, and
if the refs/stash has been updated if applicable.
Add two tests that verify both of these happen.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Typically with sparse checkouts, we expect files outside the sparsity
patterns to be marked as SKIP_WORKTREE and be missing from the working
tree. Sometimes this expectation would be violated however; including
in cases such as:
* users grabbing files from elsewhere and writing them to the worktree
(perhaps by editing a cached copy in an editor, copying/renaming, or
even untarring)
* various git commands having incomplete or no support for the
SKIP_WORKTREE bit[1,2]
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
When the SKIP_WORKTREE bit in the index did not reflect the presence of
the file in the working tree, it traditionally caused confusion and was
difficult to detect and recover from. So, in a sparse checkout, since
af6a51875a (repo_read_index: clear SKIP_WORKTREE bit from files present
in worktree, 2022-01-14), Git automatically clears the SKIP_WORKTREE
bit at index read time for entries corresponding to files that are
present in the working tree.
There is another workflow, however, where it is expected that paths
outside the sparsity patterns appear to exist in the working tree and
that they do not lose the SKIP_WORKTREE bit, at least until they get
modified. A Git-aware virtual file system[4] takes advantage of its
position as a file system driver to expose all files in the working
tree, fetch them on demand using partial clone on access, and tell Git
to pay attention to them on demand by updating the sparse checkout
pattern on writes. This means that commands like "git status" only have
to examine files that have potentially been modified, whereas commands
like "ls" are able to show the entire codebase without requiring manual
updates to the sparse checkout pattern.
Thus since af6a51875a, Git with such Git-aware virtual file systems
unsets the SKIP_WORKTREE bit for all files and commands like "git
status" have to fetch and examine them all.
Introduce a configuration setting sparse.expectFilesOutsideOfPatterns to
allow limiting the tracked set of files to a small set once again. A
Git-aware virtual file system or other application that wants to
maintain files outside of the sparse checkout can set this in a
repository to instruct Git not to check for the presence of
SKIP_WORKTREE files. The setting defaults to false, so most users of
sparse checkout will still get the benefit of an automatically updating
index to recover from the variety of difficult issues detailed in
af6a51875a for paths with SKIP_WORKTREE set despite the path being
present.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] The three long paragraphs in the middle of
https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
[4] such as the vfsd described in
https://lore.kernel.org/git/20220207190320.2960362-1-jonathantanmy@google.com/
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge-recursive would only report messages from inner merges when the
GIT_MERGE_VERBOSITY was set to 5. Do the same for merge-ort.
Note that somewhat reverts 0d83d8240d ("merge-ort: mark conflict/warning
messages from inner merges as omittable", 2022-02-02) based on two
facts:
* This commit basically removes the showing of messages from inner
merges as well, at least by default. The only difference is that
users can request to get them back by turning up the verbosity.
* Messages from inner merges are specially annotated since 4a3d86e1bb
("merge-ort: make informational messages from recursive merges
clearer", 2022-02-17). The ability to distinguish them from outer
merge comments make them less problematic to include, and easier
for humans to parse.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a tree oid is invalid, parse_tree_indirect() can return NULL. Check
for NULL instead of proceeding as though it were a valid pointer and
segfaulting.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The untracked cache test uses an avoid_racy function to deal with
an mtime-resolution challenge in testing: If an untracked cache
entry's mtime falls in the same second as the mtime of the index
the untracked cache was stored in, then it cannot be trusted.
Explicitly delaying tests is a simple effective strategy to
avoid these issues, but should be avoided where possible.
Switch from a delay-based strategy to instead backdating
all file changes using test-tool chmtime, where that is an
option, to shave 9 seconds off the test run time.
Don't update test cases that delay for other reasons, for now at
least (4 seconds).
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The mingw_utime implementation in mingw.c does not support
directories. This means that "test-tool chmtime" fails on Windows when
targeting directories. This has previously been noted and sidestepped
temporarily by Jeff Hostetler, in "t/helper/test-chmtime: skip
directories on Windows" in the "Builtin FSMonitor Part 2" work, but
not yet fixed.
It would make sense to backdate file and folder changes in untracked
cache tests, to avoid needing to insert explicit delays/pauses in the
tests.
Add support for directory date manipulation in mingw_utime by
replacing the file-oriented _wopen() call with the
directory-supporting CreateFileW() windows API explicitly.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable use of 'merged_sparse_dir' in 'threeway_merge'. As with two-way
merge, the contents of each conflicted sparse directory are merged without
referencing the index, avoiding sparse index expansion.
As with two-way merge, the 't/t1092-sparse-checkout-compatibility.sh' test
'read-tree --merge with edit/edit conflicts in sparse directories' confirms
that three-way merges with edit/edit changes (both with and without
conflicts) inside a sparse directory result in the correct index state or
error message. To ensure the index is not unnecessarily expanded, add
three-way merge cases to 'sparse index is not expanded: read-tree'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable two-way merge with 'git read-tree' without expanding the sparse
index. When in a sparse index, a two-way merge will trivially succeed as
long as there are not changes to the same sparse directory in multiple trees
(i.e., sparse directory-level "edit-edit" conflicts). If there are such
conflicts, the merge will fail despite the possibility that individual files
could merge cleanly.
In order to resolve these "edit-edit" conflicts, "conflicted" sparse
directories are - rather than rejected - merged by traversing their
associated trees by OID. For each child of the sparse directory:
1. Files are merged as normal (see Documentation/git-read-tree.txt for
details).
2. Subdirectories are treated as sparse directories and merged in
'twoway_merge'. If there are no conflicts, they are merged according to
the rules in Documentation/git-read-tree.txt; otherwise, the subdirectory
is recursively traversed and merged.
This process allows sparse directories to be individually merged at the
necessary depth *without* expanding a full index.
The 't/t1092-sparse-checkout-compatibility.sh' test 'read-tree --merge with
edit/edit conflicts in sparse directories' tests two-way merges with 1)
changes inside sparse directories that do not conflict and 2) changes that
do conflict (with the correct file(s) reported in the error message).
Additionally, add two-way merge cases to 'sparse index is not expanded:
read-tree' to confirm that the index is not expanded regardless of whether
edit/edit conflicts are present in a sparse directory.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git read-tree' is provided with a prefix, expand the index only if the
prefix is equivalent to a sparse directory or contained within one. If the
index is not expanded in these cases, 'ce_in_traverse_path' will indicate
that the relevant sparse directory is not in the prefix/traverse path,
skipping past it and not unpacking the appropriate tree(s).
If the prefix is in-cone, its sparse subdirectories (if any) will be
traversed correctly without index expansion.
The behavior of 'git read-tree' with prefixes 1) inside of cone, 2) equal to
a sparse directory, and 3) inside a sparse directory are all tested as part
of the 't/t1092-sparse-checkout-compatibility.sh' test 'read-tree --prefix',
ensuring that the sparse index case works the way it did prior to this
change as well as matching non-sparse index sparse-checkout.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable use of sparse index in 'git read-tree'. The integration in this patch
is limited only to usage of 'read-tree' that does not need additional
functional changes for the sparse index to behave as expected (i.e., produce
the same user-facing results as a non-sparse index sparse-checkout). To
ensure no unexpected behavior occurs, the index is explicitly expanded when:
* '--no-sparse-checkout' is specified (because it disables sparse-checkout)
* '--prefix' is specified (if the prefix is inside a sparse directory, the
prefixed tree cannot be properly traversed)
* two or more <tree-ish> arguments are specified ('twoway_merge' and
'threeway_merge' do not yet support merging sparse directories)
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests focused on how 'git read-tree' behaves in sparse checkouts. Extra
emphasis is placed on interactions with files outside the sparse cone, e.g.
merges with out-of-cone conflicts.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Exit with an error if a prefix provided to `git read-tree --prefix` begins
with '/'. In most cases, prefixes like this result in an "invalid path"
error; however, the repository root would be interpreted as valid when
specified as '--prefix=/'. This is due to leniency around trailing directory
separators on prefixes (e.g., allowing both '--prefix=my-dir' and
'--prefix=my-dir/') - the '/' in the prefix is actually the *trailing*
slash, although it could be misinterpreted as a *leading* slash.
To remove the confusing repo root-as-'/' case and make it clear that
prefixes should not begin with '/', exit with an error if the first
character of the provided prefix is '/'.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable the 'recursive' diff option for the diff executed as part of 'git
status'. Without the 'recursive' enabled, 'git status' reports index
changes incorrectly when the following conditions were met:
* sparse index is enabled
* there is a difference between the index and HEAD in a file inside a
*subdirectory* of a sparse directory
* the sparse directory index entry is *not* expanded in-core
Because it is not recursive by default, the diff in 'git status' reports
changes only at the level of files and directories that are immediate
children of a sparse directory, rather than recursing into directories with
changes to identify the modified file(s). As a result, 'git status' reports
the immediate subdirectory itself as "modified".
Example:
$ git init
$ mkdir -p sparse/sub
$ echo test >sparse/sub/foo
$ git add .
$ git commit -m "commit 1"
$ echo somethingelse >sparse/sub/foo
$ git add .
$ git commit -a -m "commit 2"
$ git sparse-checkout set --cone --sparse-index 'sparse'
$ git reset --soft HEAD~1
$ git status
On branch master
You are in a sparse checkout.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: sparse/sub
Enabling the 'recursive' diff option in 'wt_status_collect_changes_index'
corrects this issue by allowing the diff to recurse into subdirectories of
sparse directories to find modified files. Given the same repository setup
as the example above, the corrected result of `git status` is:
$ git status
On branch master
You are in a sparse checkout.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: sparse/sub/foo
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prevent the repository root from being collapsed into a sparse directory by
treating an empty path as "inside the sparse-checkout". When collapsing a
sparse index (e.g. in 'git sparse-checkout reapply'), the root directory
typically could not become a sparse directory due to the presence of in-cone
root-level files and directories. However, if no such in-cone files or
directories were present, there was no explicit check signaling that the
"repository root path" (an empty string, in the case of
'convert_to_sparse(...)') was in-cone, and a sparse directory index entry
would be created from the repository root directory.
The documentation in Documentation/git-sparse-checkout.txt explicitly states
that the files in the root directory are expected to be in-cone for a
cone-mode sparse-checkout. Collapsing the root into a sparse directory entry
violates that assumption, as sparse directory entries are expected to be
outside the sparse cone and have SKIP_WORKTREE enabled. This invalid state
in turn causes issues with commands that interact with the index, e.g.
'git status'.
Treating an empty (root) path as in-cone prevents the creation of a root
sparse directory in 'convert_to_sparse(...)'. Because the repository root is
otherwise never compared with sparse patterns (in both cone-mode and
non-cone sparse-checkouts), the new check does not cause additional changes
to how sparse patterns are applied.
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Generation Data Chunk was implemented and tested in e8b63005c
(commit-graph: implement generation data chunk, 2021-01-16), but the
test was carefully constructed to work on systems with 32-bit dates.
Since the corrected commit date offsets still required more than 31
bits, this triggered writing the generation_data_overflow chunk.
However, upon closer look, the
write_graph_chunk_generation_data_overflow() method writes the offsets
to the chunk (as dictated by the format) but fill_commit_graph_info()
treats the value in the chunk as if it is the full corrected commit date
(not an offset). For some reason, this does not cause an issue when
using the FUTURE_DATE specified in t5318-commit-graph.sh, but it does
show up as a failure in 'git commit-graph verify' if we increase that
FUTURE_DATE to be above four billion.
Fix this error and create a 64-bit timestamp version of the test so we
can test these larger values.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'read_generation_data' member of 'struct commit_graph' was
introduced by 1fdc383c5 (commit-graph: use generation v2 only if entire
chain does, 2021-01-16). The intention was to avoid using corrected
commit dates if not all layers of a commit-graph had that data stored.
The logic in validate_mixed_generation_chain() at that point incorrectly
initialized read_generation_data to 1 if and only if the tip
commit-graph contained the Corrected Commit Date chunk.
This was "fixed" in 448a39e65 (commit-graph: validate layers for
generation data, 2021-02-02) to validate that read_generation_data was
either non-zero for all layers, or it would set read_generation_data to
zero for all layers.
The problem here is that read_generation_data is not initialized to be
non-zero anywhere!
This change initializes read_generation_data immediately after the chunk
is parsed, so each layer will have its value present as soon as
possible.
The read_generation_data member is used in fill_commit_graph_info() to
determine if we should use the corrected commit date or the topological
levels stored in the Commit Data chunk. Due to this bug, all previous
versions of Git were defaulting to topological levels in all cases!
This can be measured with some performance tests. Using the Linux kernel
as a testbed, I generated a complete commit-graph containing corrected
commit dates and tested the 'new' version against the previous, 'old'
version.
First, rev-list with --topo-order demonstrates a 26% improvement using
corrected commit dates:
hyperfine \
-n "old" "$OLD_GIT rev-list --topo-order -1000 v3.6" \
-n "new" "$NEW_GIT rev-list --topo-order -1000 v3.6" \
--warmup=10
Benchmark 1: old
Time (mean ± σ): 57.1 ms ± 3.1 ms
Range (min … max): 52.9 ms … 62.0 ms 55 runs
Benchmark 2: new
Time (mean ± σ): 45.5 ms ± 3.3 ms
Range (min … max): 39.9 ms … 51.7 ms 59 runs
Summary
'new' ran
1.26 ± 0.11 times faster than 'old'
These performance improvements are due to the algorithmic improvements
given by walking fewer commits due to the higher cutoffs from corrected
commit dates.
However, this comes at a cost. The additional I/O cost of parsing the
corrected commit dates is visible in case of merge-base commands that do
not reduce the overall number of walked commits.
hyperfine \
-n "old" "$OLD_GIT merge-base v4.8 v4.9" \
-n "new" "$NEW_GIT merge-base v4.8 v4.9" \
--warmup=10
Benchmark 1: old
Time (mean ± σ): 110.4 ms ± 6.4 ms
Range (min … max): 96.0 ms … 118.3 ms 25 runs
Benchmark 2: new
Time (mean ± σ): 150.7 ms ± 1.1 ms
Range (min … max): 149.3 ms … 153.4 ms 19 runs
Summary
'old' ran
1.36 ± 0.08 times faster than 'new'
Performance issues like this are what motivated 702110aac (commit-graph:
use config to specify generation type, 2021-02-25).
In the future, we could fix this performance problem by inserting the
corrected commit date offsets into the Commit Date chunk instead of
having that data in an extra chunk.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When computing the generation numbers for a commit-graph, we compute
the corrected commit dates and then check if their offsets from the
actual dates is too large to fit in the 32-bit Generation Data chunk.
However, there is a problem with this approach: if we have parsed the
generation data from the previous commit-graph, then we continue the
loop because the corrected commit date is already computed. This causes
an under-count in the number of overflow values.
It is incorrect to add an increment to num_generation_data_overflows
next to this 'continue' statement, because we might start
double-counting commits that are computed because of the depth-first
search walk from a commit with an earlier OID.
Instead, iterate over the full commit list at the end, checking the
offsets to see how many grow beyond the maximum value.
Create a new t5328-commit-graph-64-bit-time.sh test script to handle
special cases of testing 64-bit timestamps. This helps demonstrate this
bug in more cases. It still won't hit all potential cases until the next
change, which reenables reading generation numbers. Use the skip_all
trick from 0a2bfccb9c (t0051: use "skip_all" under !MINGW in
single-test file, 2022-02-04) to make the output clean when run on a
32-bit system.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The graph_git_behavior helper is useful for testing that certain Git
commands behave the same when using the commit-graph and when not using
the commit-graph. Extract it to a new lib-commit-graph.sh file for use
in new test scripts that will split out from t5318.
While doing this extraction, also extract graph_read_expect and the
logic for priming the test_oid_cache.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can be helpful to verify that the 'struct commit_graph' that results
from parsing a commit-graph is correctly structured. The existence of
different chunks is not enough to verify that all of the optional
features are correctly enabled.
Update 'test-tool read-graph' to output an "options:" line that includes
information for different parts of the struct commit_graph.
In particular, this change demonstrates that the read_generation_data
option is never being enabled, which will be fixed in a later change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading references via `files_read_raw_ref()` we always consult
both the loose reference, and if that wasn't found, we also consult the
packed-refs file. While this makes sense to read a generic reference, it
is wasteful in the case where we only care about symbolic references:
the packed-refs backend does not support them, and thus it cannot ever
return one for us.
Special-case reading of symbolic references for the files backend such
that we always skip asking the packed-refs backend.
We use `refs_read_symbolic_ref()` extensively to determine whether we
need to skip updating local symbolic references during a fetch, which is
why the change results in a significant speedup when doing fetches in
repositories with huge numbers of references. The following benchmark
executes a mirror-fetch in a repository with about 2 million references
via `git fetch --prune --no-write-fetch-head +refs/*:refs/*`:
Benchmark 1: HEAD~
Time (mean ± σ): 68.372 s ± 2.344 s [User: 65.629 s, System: 8.786 s]
Range (min … max): 65.745 s … 70.246 s 3 runs
Benchmark 2: HEAD
Time (mean ± σ): 60.259 s ± 0.343 s [User: 61.019 s, System: 7.245 s]
Range (min … max): 60.003 s … 60.649 s 3 runs
Summary
'HEAD' ran
1.13 ± 0.04 times faster than 'HEAD~'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have two cases in the remote code where we check whether a reference
is symbolic or not, but don't mind in case it doesn't exist or in case
it exists but is a non-symbolic reference. Convert these two callsites
to use the new `refs_read_symbolic_ref()` function, whose intent is to
implement exactly that usecase.
No change in behaviour is expected from this change.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reading of symbolic and non-symbolic references is currently treated the
same in reference backends: we always call `refs_read_raw_ref()` and
then decide based on the returned flags what type it is. This has one
downside though: symbolic references may be treated different from
normal references in a backend from normal references. The packed-refs
backend for example doesn't even know about symbolic references, and as
a result it is pointless to even ask it for one.
There are cases where we really only care about whether a reference is
symbolic or not, but don't care about whether it exists at all or may be
a non-symbolic reference. But it is not possible to optimize for this
case right now, and as a consequence we will always first check for a
loose reference to exist, and if it doesn't, we'll query the packed-refs
backend for a known-to-not-be-symbolic reference. This is inefficient
and requires us to search all packed references even though we know to
not care for the result at all.
Introduce a new function `refs_read_symbolic_ref()` which allows us to
fix this case. This function will only ever return symbolic references
and can thus optimize for the scenario layed out above. By default, if
the backend doesn't provide an implementation for it, we just use the
old code path and fall back to `read_raw_ref()`. But in case the backend
provides its own, more efficient implementation, we will use that one
instead.
Note that this function is explicitly designed to not distinguish
between missing references and non-symbolic references. If it did, we'd
be forced to always search the packed-refs backend to see whether the
symbolic reference the user asked for really doesn't exist, or if it
exists as a non-symbolic reference.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching from a remote repository we will by default write what has
been fetched into the special FETCH_HEAD reference. The order in which
references are written depends on whether the reference is for merge or
not, which, despite some other conditions, is also determined based on
whether the old object ID the reference is being updated from actually
exists in the repository.
To write FETCH_HEAD we thus loop through all references thrice: once for
the references that are about to be merged, once for the references that
are not for merge, and finally for all references that are ignored. For
every iteration, we then look up the old object ID to determine whether
the referenced object exists so that we can label it as "not-for-merge"
if it doesn't exist. It goes without saying that this can be expensive
in case where we are fetching a lot of references.
While this is hard to avoid in the case where we're writing FETCH_HEAD,
users can in fact ask us to skip this work via `--no-write-fetch-head`.
In that case, we do not care for the result of those lookups at all
because we don't have to order writes to FETCH_HEAD in the first place.
Skip this busywork in case we're not writing to FETCH_HEAD. The
following benchmark performs a mirror-fetch in a repository with about
two million references via `git fetch --prune --no-write-fetch-head
+refs/*:refs/*`:
Benchmark 1: HEAD~
Time (mean ± σ): 75.388 s ± 1.942 s [User: 71.103 s, System: 8.953 s]
Range (min … max): 73.184 s … 76.845 s 3 runs
Benchmark 2: HEAD
Time (mean ± σ): 69.486 s ± 1.016 s [User: 65.941 s, System: 8.806 s]
Range (min … max): 68.864 s … 70.659 s 3 runs
Summary
'HEAD' ran
1.08 ± 0.03 times faster than 'HEAD~'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During packfile negotiation the client will send "want" and "want-ref"
lines to the server to tell it which objects it is interested in. The
server-side parses each of those and looks them up to see whether it
actually has requested objects. This lookup is performed by calling
`parse_object()` directly, which thus hits the object database. In the
general case though most of the objects the client requests will be
commits. We can thus try to look up the object via the commit-graph
opportunistically, which is much faster than doing the same via the
object database.
Refactor parsing of both "want" and "want-ref" lines to do so.
The following benchmark is executed in a repository with a huge number
of references. It uses cached request from git-fetch(1) as input to
git-upload-pack(1) that contains about 876,000 "want" lines:
Benchmark 1: HEAD~
Time (mean ± σ): 7.113 s ± 0.028 s [User: 6.900 s, System: 0.662 s]
Range (min … max): 7.072 s … 7.168 s 10 runs
Benchmark 2: HEAD
Time (mean ± σ): 6.622 s ± 0.061 s [User: 6.452 s, System: 0.650 s]
Range (min … max): 6.535 s … 6.727 s 10 runs
Summary
'HEAD' ran
1.07 ± 0.01 times faster than 'HEAD~'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ps/fetch-atomic:
fetch: make `--atomic` flag cover pruning of refs
fetch: make `--atomic` flag cover backfilling of tags
refs: add interface to iterate over queued transactional updates
fetch: report errors when backfilling tags fails
fetch: control lifecycle of FETCH_HEAD in a single place
fetch: backfill tags before setting upstream
fetch: increase test coverage of fetches
Add "fast_unwind_on_malloc=0" to LSAN_OPTIONS to get more meaningful
stack traces from LSAN. This isn't required under ASAN which will emit
traces such as this one for a leak in "t/t0006-date.sh":
$ ASAN_OPTIONS=detect_leaks=1 ./t0006-date.sh -vixd
[...]
Direct leak of 3 byte(s) in 1 object(s) allocated from:
#0 0x488b94 in strdup (t/helper/test-tool+0x488b94)
#1 0x9444a4 in xstrdup wrapper.c:29:14
#2 0x5995fa in parse_date_format date.c:991:24
#3 0x4d2056 in show_dates t/helper/test-date.c:39:2
#4 0x4d174a in cmd__date t/helper/test-date.c:116:3
#5 0x4cce89 in cmd_main t/helper/test-tool.c:127:11
#6 0x4cd1e3 in main common-main.c:52:11
#7 0x7fef3c695e49 in __libc_start_main csu/../csu/libc-start.c:314:16
#8 0x422b09 in _start (t/helper/test-tool+0x422b09)
SUMMARY: AddressSanitizer: 3 byte(s) leaked in 1 allocation(s).
Aborted
Whereas LSAN would emit this instead:
$ ./t0006-date.sh -vixd
[...]
Direct leak of 3 byte(s) in 1 object(s) allocated from:
#0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8)
#1 0x7f2be1d614aa in strdup string/strdup.c:42:15
SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s).
Aborted
Now we'll instead git this sensible stack trace under
LSAN. I.e. almost the same one (but starting with "malloc", as is
usual for LSAN) as under ASAN:
Direct leak of 3 byte(s) in 1 object(s) allocated from:
#0 0x4323b8 in malloc (t/helper/test-tool+0x4323b8)
#1 0x7f012af5c4aa in strdup string/strdup.c:42:15
#2 0x5cb164 in xstrdup wrapper.c:29:14
#3 0x495ee9 in parse_date_format date.c:991:24
#4 0x453aac in show_dates t/helper/test-date.c:39:2
#5 0x453782 in cmd__date t/helper/test-date.c:116:3
#6 0x451d95 in cmd_main t/helper/test-tool.c:127:11
#7 0x451f1e in main common-main.c:52:11
#8 0x7f012aef5e49 in __libc_start_main csu/../csu/libc-start.c:314:16
#9 0x42e0a9 in _start (t/helper/test-tool+0x42e0a9)
SUMMARY: LeakSanitizer: 3 byte(s) leaked in 1 allocation(s).
Aborted
As the option name suggests this does make things slower, e.g. for
t0001-init.sh we're around 10% slower:
$ hyperfine -L v 0,1 'LSAN_OPTIONS=fast_unwind_on_malloc={v} make T=t0001-init.sh' -r 3
Benchmark 1: LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh
Time (mean ± σ): 2.135 s ± 0.015 s [User: 1.951 s, System: 0.554 s]
Range (min … max): 2.122 s … 2.152 s 3 runs
Benchmark 2: LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh
Time (mean ± σ): 1.981 s ± 0.055 s [User: 1.769 s, System: 0.488 s]
Range (min … max): 1.941 s … 2.044 s 3 runs
Summary
'LSAN_OPTIONS=fast_unwind_on_malloc=1 make T=t0001-init.sh' ran
1.08 ± 0.03 times faster than 'LSAN_OPTIONS=fast_unwind_on_malloc=0 make T=t0001-init.sh'
I think that's more than worth it to get the more meaningful stack
traces, we can always provide LSAN_OPTIONS=fast_unwind_on_malloc=0 for
one-off "fast" runs.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the GIT_BUILD_DIR from a path like "/path/to/build/t/.." to
"/path/to/build". The "TEST_DIRECTORY" here is already made an
absolute path a few lines above this.
We could simply do $(cd "$TEST_DIRECTORY"/.." && pwd) here, but as
noted in the preceding commit the "$TEST_DIRECTORY" can't be anything
except the path containing this test-lib.sh file at this point, so we
can more cheaply and equally strip the "/t" off the end.
This change will be helpful to LSAN_OPTIONS which will want to strip
the build directory path from filenames, which we couldn't do if we
had a "/.." in there.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct a misleading comment added by me in 62f539043c (test-lib:
Allow overriding of TEST_DIRECTORY, 2010-08-19), and add an assertion
that TEST_DIRECTORY cannot point to any directory except the "t"
directory in the top-level of git.git.
This assertion is in effect not new, since we'd already die if that
wasn't the case[1], but it and the updated commentary help to make
that clearer.
The existing comments were also on the wrong arms of the
"if". I.e. the "allow tests to override this" was on the "test -z"
arm. That came about due to a combination of 62f539043c and
85176d7251 (test-lib.sh: convert $TEST_DIRECTORY to an absolute path,
2013-11-17).
Those earlier comments could be read as allowing the "$TEST_DIRECTORY"
to be some path outside of t/. As explained in the updated comment
that's impossible, rather it was meant for *tests* that ran outside of
t/, i.e. the "t0000-basic.sh" tests that use "lib-subtest.sh".
Those tests have a different working directory, but they set the
"TEST_DIRECTORY" to the same path for bootstrapping. The comments now
reflect that, and further comment on why we have a hard dependency on
this.
1. https://lore.kernel.org/git/220222.86o82z8als.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change our ASAN_OPTIONS and LSAN_OPTIONS to set defaults for those
variables, rather than punting out entirely if we already have them in
the environment.
We want to take any user-provided settings over our own, but we can do
that by prepending our defaults to the variable. The libsanitizer
options parsing has "last option wins" semantics.
It's now possible to do e.g.:
LSAN_OPTIONS=report_objects=1 ./t0006-date.sh
And not have the "report_objects=1" setting overwrite our sensible
default of "abort_on_error=1", but by prepending to the list we ensure
that:
LSAN_OPTIONS=report_objects=1:abort_on_error=0 ./t0006-date.sh
Will take the desired "abort_on_error=0" over our default.
See b0f4c9087e (t: support clang/gcc AddressSanitizer, 2014-12-08)
for the original pattern being altered here, and
85b81b35ff (test-lib: set LSAN_OPTIONS to abort by default,
2017-09-05) for when LSAN_OPTIONS was added in addition to the
then-existing ASAN_OPTIONS.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is expected that an empty/unpopulated untracked cache structure can
be written to the index - by update-index, or by a "git status" call
that sees the untracked cache should be enabled and is not, but is
running with options that make the untracked cache non-applicable in
that run (eg a pathspec).
Currently, if that happens, then subsequent "git status" calls end up
populating the untracked cache, but not writing the index (not saving
their work) - so the performance outcome is almost identical to the
cache being altogether disabled.
This continues until the index gets written with the untracked cache
populated, for some *other* reason, such as a working tree change.
Detect the condition where an empty untracked cache exists in the
index and we will collect the list of untracked paths, and queue an
index write under that condition, so that the collected untracked
paths can be written out to the untracked cache extension in the
index.
This change depends on previous fixes to t7519 for the "ignore .git
changes when invalidating UNTR" test case to pass - before this fix,
the test never actually did anything as it was not set up correctly.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In its current state, the t7519 test dealing with untracked cache
assumes that "git update-index --untracked-cache" will *populate* the
untracked cache. This is not correct - it will only add an empty
untracked cache structure to the index.
If we're going to compare two git status runs with something
interesting happening in-between, we need to ensure that the index is
in a stable/steady state *before* that first run.
Achieve this by adding another prior "git status" run.
At this stage this change does nothing, because there is a bug,
addressed in the next patch, whereby once the empty untracked cache
structure is added by the update-index invocation, the untracked cache
gets updated in every subsequent "git status" call, but the index with
these updates does not get written down.
That bug actually invalidates this entire test case - but we're fixing
that next.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t7519 there is a test that writes files to disk, and immediately
writes the index with the untracked cache. Because of
mtime-comparison logic that uses a 1-second resolution, this means
the cached entries are not trusted/used under some circumstances
(see read-cache.c#is_racy_stat()).
Untracked cache tests in t7063 use a 1-second delay to avoid this
issue, but we don't want to introduce arbitrary slowdowns, so instead
use test-tool chmtime to backdate the files slightly. The t7063
delays are a #leftoverbit, to be worked on in a separate series.
This change doesn't actually affect the outcome of the test, but does
enhance its validity, and becomes relevant after later changes.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The script uses "git show -s" to display the title of the merge
commit being studied, without explicitly disabling the pager, which
is not a safe thing to do in a script.
For example, when the pager is set to "less" with "-SF" options (-S
tells the pager not to fold lines but allow horizontal scrolling to
show the overly long lines, -F tells the pager not to wait if the
output in its entirety is shown on a single page), and the title of
the merge commit is longer than the width of the terminal, the pager
will wait until the end-user tells it to quit after showing the
single line.
Explicitly disable the pager with this "git show" invocation to fix
this.
The command uses the "--pretty=format:..." format, which adds LF in
between each pair of commits it outputs, which means that the label
for the merge being learned from will be followed by the next
message on the same line. "--pretty=tformat:..." is what we should
instead, which adds LF after each commit, or a more modern way to
spell it, i.e. "--format=...". This existing breakage becomes
easier to see, now we no longer use the pager.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users who are accustomed to doing `git checkout <tag>` assume that
`git switch <tag>` will do the same thing. Inform them of the --detach
option so they aren't left wondering why `git switch` doesn't work but
`git checkout` does.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the read_object_with_reference() function to take an "enum
object_type". It was not prepared to handle an arbitrary "const
char *type", as it was itself calling type_from_string().
Let's change the only caller that passes in user data to use
type_from_string(), and convert the rest to use e.g. "OBJ_TREE"
instead of "tree_type".
The "cat-file" caller is not on the codepath that
handles"--allow-unknown", so the type_from_string() there is safe. Its
use of type_from_string() doesn't functionally differ from that of the
pre-image.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split off a *_literally() variant of the write_object_file_prepare()
function. To do this create a new "hash_object_body()" static helper.
We now defer the type_name() call until the very last moment in
format_object_header() for those callers that aren't "hash-object
--literally".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the hash_object_file() function to take an "enum
object_type".
Since a preceding commit all of its callers are passing either
"{commit,tree,blob,tag}_type", or the result of a call to type_name(),
the parse_object() caller that would pass NULL is now using
stream_object_signature().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before 0c3db67cc8 (hash-object --literally: fix buffer overrun with
extra-long object type, 2015-05-04) the hash-object code being changed
here called write_sha1_file() to both hash and write a loose
object. Before that we'd use hash_sha1_file() to if "-w" wasn't
provided, and otherwise call write_sha1_file().
Now we'll always call the same function for both writing. Let's rename
it from hash_*_literally() to write_*_literally(). Even though the
write_*() might not actually write if HASH_WRITE_OBJECT isn't in
"flags", having it be more similar to write_object_file_flags() than
hash_object_file(), but carrying a name that would suggest that it's a
variant of the latter is confusing.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split up the check_object_signature() function into that non-streaming
version (it accepts an already filled "buf"), and a new
stream_object_signature() which will retrieve the object from storage,
and hash it on-the-fly.
All of the callers of check_object_signature() were effectively
calling two different functions, if we go by cyclomatic
complexity. I.e. they'd either take the early "if (map)" branch and
return early, or not. This has been the case since the "if (map)"
condition was added in 090ea12671 (parse_object: avoid putting whole
blob in core, 2012-03-07).
We can then further simplify the resulting check_object_signature()
function since only one caller wanted to pass a non-NULL "buf" and a
non-NULL "real_oidp". That "read_loose_object()" codepath used by "git
fsck" can instead use hash_object_file() followed by oideq().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change those users of the object API that misused
check_object_signature() by assuming it returned any non-zero when the
OID didn't match the expected value to check <0 instead. In practice
all of this code worked before, but it wasn't consistent with rest of
the users of the API.
Let's also clarify what the <0 return value means in API docs.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the API documentation for check_object_signature() to cache.h,
where its prototype is declared. This is in preparation for adding a
companion function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the name of the second argument to check_object_signature() to
be "buf" in object-file.c, making it consistent with the prototype in
cache.h
This fixes an inconsistency that's been with us since 2ade934026 (Add
"check_sha1_signature()" helper function, 2005-04-08), and makes a
subsequent commit's diff smaller, as we'll move these API docs to
cache.h.
While we're at it fix a small grammar error in the documentation,
dropping an "an" before "in-core object-data".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the write_object_file() function to take an "enum object_type"
instead of a "const char *type". Its callers either passed
{commit,tree,blob,tag}_type and can pass the corresponding OBJ_* type
instead, or were hardcoding strings like "blob".
This avoids the back & forth fragility where the callers of
write_object_file() would have the enum type, and convert it
themselves via type_name(). We do have to now do that conversion
ourselves before calling write_object_file_prepare(), but those
codepaths will be similarly adjusted in subsequent commits.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a convenience function to wrap the xsnprintf() command that
generates loose object headers. This code was copy/pasted in various
parts of the codebase, let's define it in one place and re-use it from
there.
All except one caller of it had a valid "enum object_type" for us,
it's only write_object_file_prepare() which might need to deal with
"git hash-object --literally" and a potential garbage type. Let's have
the primary API use an "enum object_type", and define a *_literally()
function that can take an arbitrary "const char *" for the type.
See [1] for the discussion that prompted this patch, i.e. new code in
object-file.c that wanted to copy/paste the xsnprintf() invocation.
In the case of fast-import.c the callers unfortunately need to cast
back & forth between "unsigned char *" and "char *", since
format_object_header() ad encode_in_pack_object_header() take
different signedness.
1. https://lore.kernel.org/git/211213.86bl1l9bfz.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hash_object_file() function added in abdc3fc842 (Add
hash_sha1_file(), 2006-10-14) did not have a meaningful return value,
and it never has.
One was seemingly added to avoid adding braces to the "ret = "
assignments being modified here. Let's instead assign "0" to the "ret"
variables at the beginning of the relevant functions, and have them
return "void".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split up the declaration of the "ret" and "re_allocated"
variables. It's not our usual style to group variable declarations
simply because they share a type, we'd only prefer to do so when the
two are closely related (e.g. "int i, j"). This change makes a
subsequent and meaningful change's diff smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Messages "ort" merge backend prepares while dealing with conflicted
paths were unnecessarily confusing since it did not differentiate
inner merges and outer merges.
* en/ort-inner-merge-conflict-report:
merge-ort: make informational messages from recursive merges clearer
Workaround we have for versions of PCRE2 before their version 10.36
were in effect only for their versions newer than 10.36 by mistake,
which has been corrected.
* rs/pcre-invalid-utf8-fix-fix:
grep: fix triggering PCRE2_NO_START_OPTIMIZE workaround
Setting core.untrackedCache to true failed to add the untracked
cache extension to the index.
* ds/core-untracked-cache-config:
dir: force untracked cache with core.untrackedCache
Plug (some) memory leaks around parse_date_format().
* ab/date-mode-release:
date API: add and use a date_mode_release()
date API: add basic API docs
date API: provide and use a DATE_MODE_INIT
date API: create a date.h, split from cache.h
cache.h: remove always unused show_date_human() declaration
Some code clean-up in the "git grep" machinery.
* ab/grep-patterntype:
grep: simplify config parsing and option parsing
grep.c: do "if (bool && memchr())" not "if (memchr() && bool)"
grep.h: make "grep_opt.pattern_type_option" use its enum
grep API: call grep_config() after grep_init()
grep.c: don't pass along NULL callback value
built-ins: trust the "prefix" from run_builtin()
grep tests: add missing "grep.patternType" config tests
grep tests: create a helper function for "BRE" or "ERE"
log tests: check if grep_config() is called by "log"-like cmds
grep.h: remove unused "regex_t regexp" from grep_opt
"git clone --filter=... --recurse-submodules" only makes the
top-level a partial clone, while submodules are fully cloned. This
behaviour is changed to pass the same filter down to the submodules.
* js/apply-partial-clone-filters-recursively:
clone, submodule: pass partial clone filters to submodules
Unify more messages to help l10n.
* ja/i18n-common-messages:
i18n: fix some misformated placeholders in command synopsis
i18n: remove from i18n strings that do not hold translatable parts
i18n: factorize "invalid value" messages
i18n: factorize more 'incompatible options' messages
Further tweaks on progress API.
* ab/only-single-progress-at-once:
pack-bitmap-write.c: don't return without stop_progress()
progress API: unify stop_progress{,_msg}(), fix trace2 bug
progress.c: refactor stop_progress{,_msg}() to use helpers
progress.c: use dereferenced "progress" variable, not "(*p_progress)"
progress.h: format and be consistent with progress.c naming
progress.c tests: test some invalid usage
progress.c tests: make start/stop commands on stdin
progress.c test helper: add missing braces
leak tests: fix a memory leak in "test-progress" helper
"git sparse-checkout" wants to work with per-worktree configuration,
but did not work well in a worktree attached to a bare repository.
* ds/sparse-checkout-requires-per-worktree-config:
config: make git_configset_get_string_tmp() private
worktree: copy sparse-checkout patterns and config on add
sparse-checkout: set worktree-config correctly
config: add repo_config_set_worktree_gently()
worktree: create init_worktree_config()
Documentation: add extensions.worktreeConfig details
Error output given in response to an ambiguous object name has been
improved.
* ab/ambiguous-object-name:
object-name: re-use "struct strbuf" in show_ambiguous_object()
object-name: iterate ambiguous objects before showing header
object-name: show date for ambiguous tag objects
object-name: make ambiguous object output translatable
object-name: explicitly handle bad tags in show_ambiguous_object()
object-name: explicitly handle OBJ_BAD in show_ambiguous_object()
object-name tests: add tests for ambiguous object blind spots
Change a few existing non-designated initializer assignments to use
"partial" designated initializer assignments. I.e. we're now omitting
the "NULL" or "0" fields and letting the initializer take care of them
for us.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "struct refspec_item" at the top of refspec.c to use
designated initializers. Let's keep the "= 0" assignments for
self-documentation purposes, even though they're now redundant.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "struct stream_filter_vtbl" and "struct stream_filter"
assignments in convert.c to use designated initializers.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "struct userdiff_driver" assignmentns to use designated
initializers, but let's keep the PATTERNS() and IPATTERN() convenience
macros to avoid churn, but have them defined in terms of designated
initializers.
For the "driver_true" and "driver_false" let's have the compiler
implicitly initialize most of the fields, but let's leave a redundant
".binary = 0" for "driver_true" to make it obvious that it's the
opposite of the the ".binary = 1" for "driver_false".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the preceding commit, change another file-level struct
assignment to use designated initializers.
Retain the ".name = NULL" etc. in the case of the first element of
"unknown hash algorithm", to make it explicit that we're intentionally
not setting those, it's not just that we forgot.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the "static struct tr2_dst" assignments in trace2/* to use
designated initializers. I don't think it improves readability to
include the explicit 0-ing out of the
fd/initialized/need_close/too_many_files members, so let's have those
be initialized implicitly by the compiler.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the preceding commit, change another set of file-level struct
assignments to use designated initializers.
As before the "= NULL" assignments are redundant, but we're keeping
them for self-documentation purposes. The comments left to explain the
pre-image can now be removed in favor of working code that relays the
same information to the reader.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cut down a lot on the verbosity of the "server" assignment in
imap-send.c using designated initializers, only the "ssl_verify"
member was being set to a non-NULL non-0 value.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When generating a message for a stash, "git stash" only records the
part of the branch name to the right of the last "/". e.g. if HEAD is at
"foo/bar/baz", "git stash" generates a message prefixed with "WIP on
baz:" instead of "WIP on foo/bar/baz:".
Fix this by using skip_prefix() to skip "refs/heads/" instead of looking
for the last instance of "/".
Reported-by: Kraymer <kraymer@gmail.com>
Reported-by: Daniel Hahler <git@thequod.de>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a small courtesy to users, report what limit was breached. This
is especially useful when a push exceeds a server-defined limit, since
the user is unlikely to have configured the limit (their host did).
Also demonstrate the human-readable message in a test.
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Matt Cooper <vtbassmatt@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log --graph --graph" used to leak a graph structure, and there
was no way to countermand "--graph" that appear earlier on the
command line. A "--no-graph" option has been added and resource
leakage has been plugged.
* ah/log-no-graph:
log: add a --no-graph option
log: fix memory leak if --graph is passed multiple times
Fix tests that are unnecessarily specific to ref-files backend.
* hw/t1410-adjust-test-for-reftable:
t1410: mark bufsize boundary test as REFFILES
t1410: use test-tool ref-store to inspect reflogs
A couple of optimization to "git fetch".
* ps/fetch-optim-with-commit-graph:
fetch: skip computing output width when not printing anything
fetch-pack: use commit-graph when computing cutoff
e77aa336f1 ("ls-files: optionally recurse into submodules", 2016-10-10)
taught ls-files the --recurse-submodules argument, but only in a limited
set of circumstances. In particular, --stage was unsupported, perhaps
because there was no repo_find_unique_abbrev(), which was only
introduced in 8bb95572b0 ("sha1-name.c: add
repo_find_unique_abbrev_r()", 2019-04-16). This function is needed for
using --recurse-submodules with --stage.
Now that we have repo_find_unique_abbrev(), teach support for this
combination of arguments.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add test_path_is_file_not_symlink(), test_path_is_dir_not_symlink()
and test_path_is_symlink(). Case of use for the first one
in test t/t3903-stash.sh to replace "test -f" because that function
explicitly want the file not to be a symlink.
Give more friendly error message.
Signed-off-by: COGONI Guillaume <cogoni.guillaume@gmail.com>
Co-authored-by: BRESSAT Jonathan <git.jonathan.bressat@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use test_path_is_* to replace test [-d|-f] because that give more
explicit debugging information. And it doesn't change the semantics.
Signed-off-by: COGONI Guillaume <cogoni.guillaume@gmail.com>
Co-authored-by: BRESSAT Jonathan <git.jonathan.bressat@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Usage strings for git (sub)command flags has a style guide that
suggests - first letter should not capitalized (unless required)
and it should skip full-stop at the end of line. But there are
some files where usage-strings do not follow the above mentioned
guide.
Amend the usage strings that don't follow the style convention/guide.
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a formatting regression in 1b81d8cb19 (help: use command-list.txt
for the source of guides, 2018-05-20). Adjust the output of "git help
--guides" and any other future single-section commands so that a
newline isn't inserted before the only section being printed.
This changes the output from:
$ git help --guides
The Git concept guides are:
[...]
To:
$ git help --guides
The Git concept guides are:
[...]
That we started printing an extra "\n" in 1b81d8cb19 wasn't intended,
but an emergent effect of moving all of the printing of "git help"
output to code that was ready to handle printing N sections.
With 1b81d8cb19 we started using the "print_cmd_by_category()"
function added earlier in the same series, or in cfb22a02ab (help:
use command-list.h for common command list, 2018-05-10).
Fixing this formatting nit is easy enough. Let's have all of the
output that would like to be "\n"-separated from other lines emit its
own "\n". We then adjust "print_cmd_by_category()" to only print a
"\n" to delimit the sections it's printing out.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the ability to only emit git's own usage information under
--all. This also allows us to extend the "test_section_spacing" tests
added in a preceding commit to test "git help --all"
output.
Previously we could not do that, as the tests might find a git-*
command in the "$PATH", which would make the output differ from one
setup to another.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add more sanity checking to "git help" usage by erroring out if these
man viewer options are combined with incompatible command-modes that
will never use these documentation viewers.
This continues the work started in d35d03cf93 (help: simplify by
moving to OPT_CMDMODE(), 2021-09-22) of adding more sanity checking to
"git help". Doing this allows us to clarify the "SYNOPSIS" in the
documentation, and the "git help -h" output.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Do the same for the "--all" option that I did for "--guides" in
9856ea6785 (help: correct usage & behavior of "git help --guides",
2021-09-22). I.e. we've documented it as ignoring non-option
arguments, let's have it error out instead.
As with other changes made in 62f035aee3 (Merge branch
'ab/help-config-vars', 2021-10-13) this is technically a change in
behavior, but in practice it's just a bug fix. We were ignoring this
before, but by erroring we can simplify our documentation and
synopsis, as well as avoid user confusion as they wonder what the
difference between e.g. "git help --all" and "git help --all status"
is (there wasn't any difference).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the errors added in d35d03cf93 (help: simplify by moving to
OPT_CMDMODE(), 2021-09-22) to quote the offending option at the user
when invoked as e.g.:
git help --guides garbage
Now instead of:
fatal: this option doesn't take any other arguments
We'll emit:
fatal: the '--guides' option doesn't take any non-option arguments
Let's also rename the function, as it will be extended to do other
checks that aren't "no extra argc" in a subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split up the listing of commands and aliases from
list_all_cmds_help(). This will make a subsequent functional change
smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's logic in "help.c"'s "print_cmd_by_category()" to emit "help"
output with particular spacing, which doesn't make much sense when
emitting only one section with "help -g".
Let's add tests for the current spacing in preparation for a
subsequent whitespace formatting fix, and make sure that that fix
doesn't cause regressions for the "git" and "git help" output.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code in "help.c" that used printf_ln() without format
specifiers to use puts() instead, as other existing code in the file
does. Let's also change related code to use puts() instead of the
equivalent of calling "printf" with a "%s\n" format.
This formatting-only change will make a subsequent functional change
easier to read, as it'll be changing code that's consistently using
the same functions to do the same things.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a missing "]" to documentation added in 63eae83f8f (help: add "-a
--verbose" to list all commands with synopsis, 2018-05-20). This made
it seem as though "--[no-]verbose" can only be provided with "--all",
not "-a". The corresponding usage information in the C
code ("builtin_help_usage") does not have the same problem.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is part of the reftable API, so it should use the
reftable_ prefix
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ID => ref map is trimming object IDs to a disambiguating prefix.
Check that we are computing their length correctly.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing the same hash many times, we might decide to use a
length-1 object ID prefix for the ObjectID => ref table, which is out
of spec.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The public interface (reftable_writer) already ensures that keys are
written in strictly increasing order, and an empty key by definition
fails this check.
However, by also enforcing this at the block layer, it is easier to
verify that records (which are written into blocks) never have to
consider the possibility of empty keys.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Empty keys can only be written as ref records with empty names. The
log record has a logical timestamp in the key, so the key is never
empty.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The spec says 2 <= object_id_len <= 31. We are lenient and allow 1,
but we forbid 0, so we can be sure that we never read a 0-length key.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The value is stored in a 5-bit field, so we can't support more without
a format version upgrade.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The builtin "add -p" reads the key "F2" as three separate keys "^[",
"O" and "Q". The "Q" causes it to quit which is probably not what the
user was expecting. This is because it uses poll() to check for
pending input when reading escape sequences but reads the input with
getchar() which is buffered by default and so hoovers up all the
pending input leading poll() think there isn't anything pending. Fix
this by calling setbuf() to disable input buffering if
interactive.singlekey is set.
Looking at the comment above mingw_getchar() in terminal.c I wonder if
that function is papering over this bug and could be removed.
Unfortunately I don't have access to windows to test that.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If VMIN and VTIME are both set to zero then the terminal performs
non-blocking reads which means that read_key_without_echo() returns
EOF if there is no key press pending. This results in the user being
unable to select anything when running "git add -p". Fix this by
explicitly setting VMIN and VTIME when enabling non-canonical mode.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When disable_bits() changes the terminal attributes it uses
sigchain_push_common() to restore the terminal if a signal is received
before restore_term() is called. However there is no corresponding
call to sigchain_pop_common() when the settings are restored so the
signal handler is left on the sigchain stack. This leaves the stack
unbalanced so code such as
sigchain_push_common(my_handler);
...
read_key_without_echo(...);
...
sigchain_pop_common();
pops the handler pushed by disable_bits() rather than the one it
intended to. Additionally "git add -p" changes the terminal settings
every time it reads a key press so the stack can grow significantly.
In order to fix this save_term() now sets up the signal handler so
restore_term() can unconditionally call sigchain_pop_common(). There
are no callers of save_term() outside of terminal.c as the only
external caller was removed by e3f7e01b50 ("Revert "editor: save and
reset terminal after calling EDITOR"", 2021-11-22). Any future callers
of save_term() should benefit from having the signal handler set up
for them.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Break out of the loop to ensure restore_term() is called before
returning.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pipes ignore error codes of LHS command and thus we should not use them
with Git in tests. As an alternative, use a 'tmp' file to write the Git
output so we can test the exit code.
Signed-off-by: Shubham Mishra <shivam828787@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is helpful to distinguish between a 'working tree' and a 'worktree'.
A worktree contains a working tree plus additional metadata. This
metadata includes per-worktree refs and worktree-specific config.
This is the last of multiple changes to git-worktree.txt, starting at
the LIST OUTPUT FORMAT section.
The EXAMPLES section has an instance of "working tree" that must stay as
it is, because it is not talking about a worktree, but an example of why
a user might want to create a worktree.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is helpful to distinguish between a 'working tree' and a 'worktree'.
A worktree contains a working tree plus additional metadata. This
metadata includes per-worktree refs and worktree-specific config.
This is the sixth of multiple changes to git-worktree.txt, restricted to
the DETAILS section.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is helpful to distinguish between a 'working tree' and a 'worktree'.
A worktree contains a working tree plus additional metadata. This
metadata includes per-worktree refs and worktree-specific config.
This is the fifth of multiple changes to git-worktree.txt, restricted to
the CONFIGURATION FILE section.
While here, clear up some language to improve readability.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is helpful to distinguish between a 'working tree' and a 'worktree'.
A worktree contains a working tree plus additional metadata. This
metadata includes per-worktree refs and worktree-specific config.
This is the fourth of multiple changes to git-worktree.txt, restricted
to the REFS section.
This section previously described "per working tree" refs but they are
now replaced with "per-worktree" refs, which matches the definition in
glossary-content.txt.
The first paragraph of this section was also a bit confusing, so it is
cleaned up to make it easier to understand.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is helpful to distinguish between a 'working tree' and a 'worktree'.
A worktree contains a working tree plus additional metadata. This
metadata includes per-worktree refs and worktree-specific config.
This is the third of multiple changes to git-worktree.txt, restricted to
the OPTIONS section.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is helpful to distinguish between a 'working tree' and a 'worktree'.
A worktree contains a working tree plus additional metadata. This
metadata includes per-worktree refs and worktree-specific config.
This is the second of multiple changes to git-worktree.txt, restricted
to the COMMANDS section.
There is some language around the movement of "the working tree of a
linked worktree" which is used once, but the remaining uses are left as
just moving "a linked worktree" for brevity.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is helpful to distinguish between a 'working tree' and a 'worktree'.
A worktree contains a working tree plus additional metadata. This
metadata includes per-worktree refs and worktree-specific config.
This is the first of multiple changes to git-worktree.txt, restricted to
the DESCRIPTION section.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ability to add the --no-checkout flag to 'git worktree' was added in
ef2a0ac9a0 (worktree: add: introduce --checkout option, 2016-03-29).
Recently, we noticed that add_worktree() is rather complicated, so
extract the logic for this checkout process to simplify the method.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This logic was introduced by 5325591 (worktree: copy sparse-checkout
patterns and config on add, 2022-02-07), but some feedback came in that
the add_worktree() method was already too complex. It is better to
extract this logic into a helper method to reduce this complexity.
Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This logic was introduced by 5325591 (worktree: copy sparse-checkout
patterns and config on add, 2022-02-07), but some feedback came in that
the add_worktree() method was already too complex. It is better to
extract this logic into a helper method to reduce this complexity.
Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two messages differ only by the config key name, which should not
be translated. Extract those keys so the messages can be translated from
the same string.
Reported-by: Jean-Noël AVILA <jn.avila@free.fr>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the "else" branches of the HAVE_VARIADIC_MACROS macro, which
have been unconditionally omitted since 765dc16888 (git-compat-util:
always enable variadic macros, 2021-01-28).
Since were always omitted, anyone trying to use a compiler without
variadic macro support to compile a git since version
git v2.31.0 or later would have had a compilation error. 10 months
across a few releases since then should have been enough time for
anyone who cared to run into that and report the issue.
In addition to that, for anyone unsetting HAVE_VARIADIC_MACROS we've
been emitting extremely verbose warnings since at least
ee4512ed48 (trace2: create new combined trace facility,
2019-02-22). That's because there is no such thing as a
"region_enter_printf" or "region_leave_printf" format, so at least
under GCC and Clang everything that includes trace.h (almost every
file) emits a couple of warnings about that.
There's a large benefit to being able to have a hard dependency rely
on variadic macros, the code surrounding usage.c is hard to maintain
if we need to write two implementations of everything, and by relying
on "__FILE__" and "__LINE__" along with "__VA_ARGS__" we can in the
future make error(), die() etc. log where they were called from. We've
also recently merged d67fc4bf0b (Merge branch 'bc/require-c99',
2021-12-10) which further cements our hard dependency on C99.
So let's delete the fallback code, and update our CodingGuidelines to
note that we depend on this. The added bullet-point starts with
lower-case for consistency with other bullet-points in that section.
The diff in "trace.h" is relatively hard to read, since we need to
retain the existing API docs, which were comments on the code used if
HAVE_VARIADIC_MACROS was not defined.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a comment added in e208f9cc75 (make error()'s constant return
value more visible, 2012-12-15). It's not correct that this is GCC-ism
anymore, it's code that uses standard C99 features.
The comment being changed here pre-dates the HAVE_VARIADIC_MACROS
define, which we got in e05bed960d (trace: add 'file:line' to all
trace output, 2014-07-12).
The original implementation of an error() macro) in e208f9cc75 used a
GCC-ism with the paste operator (see the commit message for mention of
it), but that was dropped later by 9798f7e5f9 (Use __VA_ARGS__ for all
of error's arguments, 2013-02-08), giving us the C99-portable version
we have now.
While we could remove the __GNUC__ define here, it might cause issues
for other compilers or static analysis systems, so let's not. See
87fe5df365 (inline constant return from error() function, 2014-05-06)
for one such issue.
See also e05bed960d (trace: add 'file:line' to all trace output,
2014-07-12) for another comment about GNUC's handling of __VA_ARGS__.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The struct strmap paths member of merge_options_internal is perhaps the
most central data structure to all of merge-ort. Because all the paths
involved in the merge need to be kept until the merge is complete, this
"paths" data structure traditionally took responsibility for owning all
the allocated paths. When the merge is over, those paths were free()d
as part of free()ing this strmap.
In commit 6697ee01b5 (merge-ort: switch our strmaps over to using
memory pools, 2021-07-30), we changed the allocations for pathnames to
come from a memory pool. That meant the ownership changed slightly;
there were no individual free() calls to make, instead the memory pool
owned all those paths and they were free()d all at once.
Unfortunately unique_path() was written presuming the pre-memory-pool
model, and allocated a path on the heap and left it in the strmap for
later free()ing. Modify it to return a path allocated from the memory
pool instead.
Note that there's one instance -- in record_conflicted_index_entries()
-- where the returned string from unique_path() was only used very
temporarily and thus had been immediately free()'d. This codepath was
associated with an ugly skip-worktree workaround that has since been
better fixed by the in-flight en/present-despite-skipped topic. This
workaround probably makes sense to excise once that topic merges down,
but for now, just remove the immediate free() and allow the returned
string to be free()d when the memory pool is released.
This fixes the following memory leak as reported by valgrind:
==PID== 65 bytes in 1 blocks are definitely lost in loss record 79 of 134
==PID== at 0xADDRESS: malloc
==PID== by 0xADDRESS: realloc
==PID== by 0xADDRESS: xrealloc (wrapper.c:126)
==PID== by 0xADDRESS: strbuf_grow (strbuf.c:98)
==PID== by 0xADDRESS: strbuf_vaddf (strbuf.c:394)
==PID== by 0xADDRESS: strbuf_addf (strbuf.c:335)
==PID== by 0xADDRESS: unique_path (merge-ort.c:733)
==PID== by 0xADDRESS: process_entry (merge-ort.c:3678)
==PID== by 0xADDRESS: process_entries (merge-ort.c:4037)
==PID== by 0xADDRESS: merge_ort_nonrecursive_internal (merge-ort.c:4621)
==PID== by 0xADDRESS: merge_ort_internal (merge-ort.c:4709)
==PID== by 0xADDRESS: merge_incore_recursive (merge-ort.c:4760)
==PID== by 0xADDRESS: merge_ort_recursive (merge-ort-wrappers.c:57)
==PID== by 0xADDRESS: try_merge_strategy (merge.c:753)
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
detect_and_process_renames() detects renames on both sides of history
and then combines these into a single diff_queue_struct. The combined
diff_queue_struct needs to be able to hold the renames found on either
side, and since it knows the (maximum) size it needs, it pre-emptively
grows the array to the appropriate size:
ALLOC_GROW(combined.queue,
renames->pairs[1].nr + renames->pairs[2].nr,
combined.alloc);
It then collects the items from each side:
collect_renames(opt, &combined, MERGE_SIDE1, ...)
collect_renames(opt, &combined, MERGE_SIDE2, ...)
Note, though, that collect_renames() sometimes determines that some
pairs are unnecessary and does not include them in the combined array.
When it is done, detect_and_process_renames() frees this memory:
if (combined.nr) {
...
free(combined.queue);
}
The problem is that sometimes even when there are pairs, none of them
are necessary. Instead of checking combined.nr, just remove the
if-check; free() knows to skip NULL pointers. This change fixes the
following memory leak, as reported by valgrind:
==PID== 192 bytes in 1 blocks are definitely lost in loss record 107 of 134
==PID== at 0xADDRESS: malloc
==PID== by 0xADDRESS: realloc
==PID== by 0xADDRESS: xrealloc (wrapper.c:126)
==PID== by 0xADDRESS: detect_and_process_renames (merge-ort.c:3134)
==PID== by 0xADDRESS: merge_ort_nonrecursive_internal (merge-ort.c:4610)
==PID== by 0xADDRESS: merge_ort_internal (merge-ort.c:4709)
==PID== by 0xADDRESS: merge_incore_recursive (merge-ort.c:4760)
==PID== by 0xADDRESS: merge_ort_recursive (merge-ort-wrappers.c:57)
==PID== by 0xADDRESS: try_merge_strategy (merge.c:753)
==PID== by 0xADDRESS: cmd_merge (merge.c:1676)
==PID== by 0xADDRESS: run_builtin (git.c:461)
==PID== by 0xADDRESS: handle_builtin (git.c:713)
==PID== by 0xADDRESS: run_argv (git.c:780)
==PID== by 0xADDRESS: cmd_main (git.c:911)
==PID== by 0xADDRESS: main (common-main.c:52)
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In sparse-checkout add/set under cone mode, the arguments passed are
supposed to be directories rather than gitignore-style patterns.
However, given the amount of effort spent in the manual discussing
patterns, it is easy for users to assume they need to pass patterns such
as
/foo/*
or
!/bar/*/
or perhaps they really do ignore the directory rule and specify a
random gitignore-style pattern like
*.c
To help catch such mistakes, throw an error if any of the positional
arguments:
* starts with any of '/!'
* contains any of '*?[]'
Inform users they can pass --skip-checks if they have a directory that
really does have such special characters in its name. (We exclude '\'
because of sparse-checkout's special handling of backslashes; see
the MINGW test in t1091.46.)
Reviewed-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The set and add subcommands accept multiple positional arguments.
The meaning of these arguments differs slightly in the two modes:
Cone mode only accepts directories. If given a file, it would
previously treat it as a directory, causing not just the file itself to
be included but all sibling files as well -- likely against users'
expectations. Throw an error if the specified path is a file in the
index. Provide a --skip-checks argument to allow users to override
(e.g. for the case when the given path IS a directory on another
branch).
Non-cone mode accepts general gitignore patterns. There are many
reasons to avoid this mode, but one possible reason to use it instead of
cone mode: to be able to select individual files within a directory.
However, if a file is passed to set/add in non-cone mode, you won't be
selecting a single file, you'll be selecting a file with the same name
in any directory. Thus users will likely want to prefix any paths they
specify with a leading '/' character; warn users if the patterns they
specify exactly name a file because it means they are likely missing
such a leading slash.
Reviewed-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cone mode, non-option arguments to set & add are clearly paths, and
as such, we should pay attention to prefix.
In non-cone mode, it is not clear that folks intend to provide paths
since the inputs are gitignore-style patterns. Paying attention to
prefix would prevent folks from doing things like
git sparse-checkout add /.gitattributes
git sparse-checkout add '/toplevel-dir/*'
In fact, the former will result in
fatal: '/.gitattributes' is outside repository...
while the later will result in
fatal: Invalid path '/toplevel-dir': No such file or directory
despite the fact that both are valid gitignore-style patterns that would
select real files if added to the sparse-checkout file. This might lead
people to just use the path without the leading slash, potentially
resulting in them grabbing files with the same name throughout the
directory hierarchy contrary to their expectations. See also [1] and
[2]. Adding prefix seems to just be fraught with error; so for now
simply throw an error in non-cone mode when sparse-checkout set/add are
run from a subdirectory.
[1] https://lore.kernel.org/git/e1934710-e228-adc4-d37c-f706883bd27c@gmail.com/
[2] https://lore.kernel.org/git/CABPp-BHXZ-XLxY0a3wCATfdq=6-EjW62RzbxKAoFPeXfJswD2w@mail.gmail.com/
Helped-by: Junio Hamano <gitster@pobox.com>
Reviewed-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit f2e3a218e8 ("sparse-checkout: enable `set` to initialize
sparse-checkout mode", 2021-12-14) made the `set` command able to
initialize sparse-checkout mode, but it also had to function when
sparse-checkout mode was already setup and the user just wanted to
change the sparsity paths. So, if the user passed --cone or --no-cone,
then we should override the current setting, but if they didn't pass
either, we should use whatever the current cone mode setting is.
Unfortunately, there was a small error in the logic in that it would not
set the in-memory cone mode value (core_sparse_checkout_one) when
--no-cone was specified, but since it did set the config setting on
disk, any subsequent git invocation would correctly get non-cone mode.
As such, the error did not previously matter. However, a subsequent
commit will add some logic that depends on core_sparse_checkout_cone
being set to the correct mode, so make sure it is set consistently with
the config values we will be writing to disk.
Reviewed-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 4e256731d6 ("sparse-checkout: enable reapply to take
--[no-]{cone,sparse-index}", 2021-12-14) made it so that reapply could
take additional options but added no tests. Tests would have shown that
the feature doesn't work because the initial values are set AFTER
parsing the command line options instead of before. Add a test and set
the initial value at the appropriate time.
Reviewed-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Indent the here-docs and use "test_cmp" instead of "diff" in tests
added in ec55559f93 (push: Add support for pre-push hooks,
2013-01-13). Let's also use the more typical "expect" instead of
"expected" to be consistent with the rest of the test file.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the tests added in ec55559f93 (push: Add support for pre-push
hooks, 2013-01-13) to exhaustively test for the exact input we're
expecting. This ensures that we e.g. don't miss a trailing newline.
Appending to a file called "actual" is the established convention in
this test for hooks, see the rest of the tests added in
ec55559f93 (push: Add support for pre-push hooks, 2013-01-13). Let's
follow that convention here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"working tree" and "per-worktree ref" were in glossary, but
"worktree" itself wasn't, which has been corrected.
* jc/glossary-worktree:
glossary: describe "worktree"
"git cmd -h" outside a repository should error out cleanly for many
commands, but instead it hit a BUG(), which has been corrected.
* js/short-help-outside-repo-fix:
t0012: verify that built-ins handle `-h` even without gitdir
checkout/fetch/pull/pack-objects: allow `-h` outside a repository
When there is no object to write .bitmap file for, "git
multi-pack-index" triggered an error, instead of just skipping,
which has been corrected.
* tb/midx-no-bitmap-for-no-objects:
midx: prevent writing a .bitmap without any objects
"git branch" learned the "--recurse-submodules" option.
* gc/branch-recurse-submodules:
branch.c: use 'goto cleanup' in setup_tracking() to fix memory leaks
branch: add --recurse-submodules option for branch creation
builtin/branch: consolidate action-picking logic in cmd_branch()
branch: add a dry_run parameter to create_branch()
branch: make create_branch() always create a branch
branch: move --set-upstream-to behavior to dwim_and_setup_tracking()
Because a deletion of ref would need to remove it from both the
loose ref store and the packed ref store, a delete-ref operation
that logically removes one ref may end up invoking ref-transaction
hook twice, which has been corrected.
* ps/avoid-unnecessary-hook-invocation-with-packed-refs:
refs: skip hooks when deleting uncovered packed refs
refs: do not execute reference-transaction hook on packing refs
refs: demonstrate excessive execution of the reference-transaction hook
refs: allow skipping the reference-transaction hook
refs: allow passing flags when beginning transactions
refs: extract packed_refs_delete_refs() to allow control of transaction
Use an internal call to reset_head() helper function instead of
spawning "git checkout" in "rebase", and update code paths that are
involved in the change.
* pw/use-in-process-checkout-in-rebase:
rebase -m: don't fork git checkout
rebase --apply: set ORIG_HEAD correctly
rebase --apply: fix reflog
reset_head(): take struct rebase_head_opts
rebase: cleanup reset_head() calls
create_autostash(): remove unneeded parameter
reset_head(): make default_reflog_action optional
reset_head(): factor out ref updates
reset_head(): remove action parameter
rebase --apply: don't run post-checkout hook if there is an error
rebase: do not remove untracked files on checkout
rebase: pass correct arguments to post-checkout hook
t5403: refactor rebase post-checkout hook tests
rebase: factor out checkout for up to date branch
"receive-pack" checks if it will do any ref updates (various
conditions could reject a push) before received objects are taken
out of the temporary directory used for quarantine purposes, so
that a push that is known-to-fail will not leave crufts that a
future "gc" needs to clean up.
* cb/clear-quarantine-early-on-all-ref-update-errors:
receive-pack: purge temporary data if no command is ready to run
As part of our code of conduct, we maintain a list of active members on
the Project Leadership Committee, which serves a couple of purposes. The
details are in 3f9ef874a7 (CODE_OF_CONDUCT: mention individual
project-leader emails, 2019-09-26), but the gist is as follows:
- It makes it clear that people with a CoC complaint may contact
members individually as opposed to the general PLC list (in case the
subject of their complaint has to do with one of the committee
members).
- It also serves as the de-facto list of people on the PLC, which
isn't committed anywhere else in the tree.
As of [1], Peff is no longer a member of Git's Project Leadership
Committee. Let's update the list of active members accordingly [2].
This also gives us a convenient opportunity to thank Peff for his many
years of service on the PLC, during which he helped the Git community in
more ways than we can easily list here.
[1]: https://lore.kernel.org/git/YboaAe4LWySOoAe7@coredump.intra.peff.net/
[2]: https://lore.kernel.org/git/CAP8UFD2XxP9r3PJ4GQjxUbV=E1ASDq1NDgB-h+S=v-bZQ7DYwQ@mail.gmail.com/
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new flag --batch-command that accepts commands and arguments
from stdin, similar to git-update-ref --stdin.
At GitLab, we use a pair of long running cat-file processes when
accessing object content. One for iterating over object metadata with
--batch-check, and the other to grab object contents with --batch.
However, if we had --batch-command, we wouldn't need to keep both
processes around, and instead just have one --batch-command process
where we can flip between getting object info, and getting object
contents. Since we have a pair of cat-file processes per repository,
this means we can get rid of roughly half of long lived git cat-file
processes. Given there are many repositories being accessed at any given
time, this can lead to huge savings.
git cat-file --batch-command
will enter an interactive command mode whereby the user can enter in
commands and their arguments that get queued in memory:
<command1> [arg1] [arg2] LF
<command2> [arg1] [arg2] LF
When --buffer mode is used, commands will be queued in memory until a
flush command is issued that execute them:
flush LF
The reason for a flush command is that when a consumer process (A)
talks to a git cat-file process (B) and interactively writes to and
reads from it in --buffer mode, (A) needs to be able to control when
the buffer is flushed to stdout.
Currently, from (A)'s perspective, the only way is to either
1. kill (B)'s process
2. send an invalid object to stdin.
1. is not ideal from a performance perspective as it will require
spawning a new cat-file process each time, and 2. is hacky and not a
good long term solution.
With this mechanism of queueing up commands and letting (A) issue a
flush command, process (A) can control when the buffer is flushed and
can guarantee it will receive all of the output when in --buffer mode.
--batch-command also will not allow (B) to flush to stdout until a flush
is received.
This patch adds the basic structure for adding command which can be
extended in the future to add more commands. It also adds the following
two commands (on top of the flush command):
contents <object> LF
info <object> LF
The contents command takes an <object> argument and prints out the object
contents.
The info command takes an <object> argument and prints out the object
metadata.
These can be used in the following way with --buffer:
info <object> LF
contents <object> LF
contents <object> LF
info <object> LF
flush LF
info <object> LF
flush LF
When used without --buffer:
info <object> LF
contents <object> LF
contents <object> LF
info <object> LF
info <object> LF
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
maybe_remove_timestamp() takes arguments, but it would be useful to have
a function that reads from stdin and strips the timestamp. This would
allow tests to pipe data into a function to remove timestamps, and
wouldn't have to always assign a variable. This is especially helpful
when the data is multiple lines.
Keep maybe_remove_timestamp() the same, but add a remove_timestamp
helper that reads from stdin.
The tests in the next patch will make use of this.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future patch introduces a new --batch-command flag. Including --batch
and --batch-check, we will have a total of three batch modes. print_contents
is the only boolean on the batch_options sturct used to distinguish
between the different modes. This makes the code harder to read.
To reduce potential confusion, replace print_contents with an enum to
help readability and clarity.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch, we will add an enum on the batch_options struct that
indicates which type of batch operation will be used: --batch,
--batch-check and the soon to be --batch-command that will read
commands from stdin. --batch-command mode might get confused with
the cmdmode flag.
There is value in renaming cmdmode in any case. cmdmode refers to how
the result output of the blob will be transformed, either according to
--filter or --textconv. So transform_mode is a more descriptive name
for the flag.
Rename cmdmode to transform_mode in cat-file.c
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command line completion script (in contrib/) learned to
complete all Git subcommands, including the ones that are normally
hidden, when GIT_COMPLETION_SHOW_ALL_COMMANDS is used.
* ab/complete-show-all-commands:
completion: add a GIT_COMPLETION_SHOW_ALL_COMMANDS
completion tests: re-source git-completion.bash in a subshell
Style updates on a test script helper.
* sy/modernize-t-lib-read-tree-m-3way:
t/lib-read-tree-m-3way: indent with tabs
t/lib-read-tree-m-3way: modernize style
"git update-index", "git checkout-index", and "git clean" are
taught to work better with the sparse checkout feature.
* vd/sparse-clean-etc:
update-index: reduce scope of index expansion in do_reupdate
update-index: integrate with sparse index
update-index: add tests for sparse-checkout compatibility
checkout-index: integrate with sparse index
checkout-index: add --ignore-skip-worktree-bits option
checkout-index: expand sparse checkout compatibility tests
clean: integrate with sparse index
reset: reorder wildcard pathspec conditions
reset: fix validation in sparse index test
"git log" and friends learned an option --exclude-first-parent-only
to propagate UNINTERESTING bit down only along the first-parent
chain, just like --first-parent option shows commits that lack the
UNINTERESTING bit only along the first-parent chain.
* jz/rev-list-exclude-first-parent-only:
git-rev-list: add --exclude-first-parent-only flag
Unlike "git apply", "git patch-id" did not handle patches with
hunks that has only 1 line in either preimage or postimage, which
has been corrected.
* jz/patch-id-hunk-header-parsing-fix:
patch-id: fix scan_hunk_header on diffs with 1 line of before/after
patch-id: fix antipatterns in tests
Prepare more test scripts for the introduction of reftable.
* hn/reftable-tests:
t5312: prepare for reftable
t1405: mark test that checks existence as REFFILES
t1405: explictly delete reflogs for reftable
When "git subtree" wants to create a merge, it used "git merge" and
let it be affected by end-user's "merge.ff" configuration, which
has been corrected.
* tk/subtree-merge-not-ff-only:
subtree: force merge commit
This is another simple change with a long explanation...
merge-recursive and merge-ort are both based on the same recursive idea:
if there is more than one merge base, merge the merge bases (which may
require first merging the merge bases of the merges bases, etc.). The
depth of the inner merge is recorded via a variable called "call_depth",
which we'll bring up again later. Naturally, the inner merges
themselves can have conflicts and various messages generated about those
files.
merge-recursive immediately prints to stdout as it goes, at the risk of
printing multiple conflict notices for the same path separated far apart
from each other with many intervenining conflict notices for other paths
between them. And this is true even if there are no inner merges
involved. An example of this was given in [1] and apparently caused
some confusion:
CONFLICT (rename/add): Rename A->B in HEAD. B added in otherbranch
...dozens of conflicts for OTHER paths...
CONFLICT (content): Merge conflicts in B
In contrast, merge-ort collects messages and stores them by path so that
it can print them grouped by path. Thus, the same case handled by
merge-ort would have output of the form:
CONFLICT (rename/add): Rename A->B in HEAD. B added in otherbranch
CONFLICT (content): Merge conflicts in B
...dozens of conflicts for OTHER paths...
This is generally helpful, but does make a separate bug more
problematic. In particular, while merge-recursive might report the
following for a recursive merge:
Auto-merging dir.c
Auto-merging midx.c
CONFLICT (content): Merge conflict in midx.c
Auto-merging diff.c
Auto-merging dir.c
CONFLICT (content): Merge conflict in dir.c
merge-ort would instead report:
Auto-merging diff.c
Auto-merging dir.c
Auto-merging dir.c
CONFLICT (content): Merge conflict in dir.c
Auto-merging midx.c
CONFLICT (content): Merge conflict in midx.c
The fact that messages for the same file are together is probably
helpful in general, but with the indentation missing for the inner
merge it unfortunately serves to confuse. This probably would lead
users to wonder:
* Why is Git reporting that "dir.c" is being merged twice?
* If midx.c has conflicts, why do I not see any when I open up the
file and why are no conflicts shown in the index?
Fix this output confusion by changing the output to clearly
differentiate the messages for outer merges from the ones for inner
merges, changing the above output from merge-ort to:
Auto-merging diff.c
From inner merge: Auto-merging dir.c
Auto-merging dir.c
CONFLICT (content): Merge conflict in dir.c
From inner merge: Auto-merging midx.c
From inner merge: CONFLICT (content): Merge conflict in midx.c
(Note: the number of spaces after the 'From inner merge:' is
2*call_depth).
One other thing to note here, that I didn't notice until typing up this
commit message, is that merge-recursive does not print any messages from
the inner merges by default; the extra verbosity has to be requested.
merge-ort currently has no verbosity controls and always prints these.
We may also want to change that, but for now, just make the output
clearer with these extra markings and indentation.
[1] https://lore.kernel.org/git/CAGyf7-He4in8JWUh9dpAwvoPkQz9hr8nCBpxOxhZEd8+jtqTpg@mail.gmail.com/
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
PCRE2 bug 2642 was fixed in version 10.36. Our 95ca1f987e (grep/pcre2:
better support invalid UTF-8 haystacks, 2021-01-24) worked around it on
older versions by setting the flag PCRE2_NO_START_OPTIMIZE. 797c359978
(grep/pcre2: use compile-time PCREv2 version test, 2021-02-18) switched
it around to set the flag on 10.36 and higher instead, while it claimed
to use "the same test done at compile-time".
Switch the condition back to apply the workaround on PCRE2 versions
_before_ 10.36.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_FORCE_UNTRACKED_CACHE environment variable writes the untracked
cache more frequently than the core.untrackedCache config variable. This
is due to how read_directory() handles the creation of an untracked
cache.
Before this change, Git would not create the untracked cache extension
for an index that did not already have one. Users would need to run a
command such as 'git update-index --untracked-cache' before the index
would actually contain an untracked cache.
In particular, users noticed that the untracked cache would not appear
even with core.untrackedCache=true. Some users reported setting
GIT_FORCE_UNTRACKED_CACHE=1 in their engineering system environment to
ensure the untracked cache would be created.
The decision to not write the untracked cache without an environment
variable tracks back to fc9ecbeb9 (dir.c: don't flag the index as dirty
for changes to the untracked cache, 2018-02-05). The motivation of that
change is that writing the index is expensive, and if the untracked
cache is the only thing that needs to be written, then it is more
expensive than the benefit of the cache. However, this also means that
the untracked cache never gets populated, so the user who enabled it via
config does not actually get the extension until running 'git
update-index --untracked-cache' manually or using the environment
variable.
We have had a version of this change in the microsoft/git fork for a few
major releases now. It has been working well to get users into a good
state. Yes, that first index write is slow, but the remaining index
writes are much faster than they would be without this change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching with the `--prune` flag we will delete any local
references matching the fetch refspec which have disappeared on the
remote. This step is not currently covered by the `--atomic` flag: we
delete branches even though updating of local references has failed,
which means that the fetch is not an all-or-nothing operation.
Fix this bug by passing in the global transaction into `prune_refs()`:
if one is given, then we'll only queue up deletions and not commit them
right away.
This change also improves performance when pruning many branches in a
repository with a big packed-refs file: every references is pruned in
its own transaction, which means that we potentially have to rewrite
the packed-refs files for every single reference we're about to prune.
The following benchmark demonstrates this: it performs a pruning fetch
from a repository with a single reference into a repository with 100k
references, which causes us to prune all but one reference. This is of
course a very artificial setup, but serves to demonstrate the impact of
only having to write the packed-refs file once:
Benchmark 1: git fetch --prune --atomic +refs/*:refs/* (HEAD~)
Time (mean ± σ): 2.366 s ± 0.021 s [User: 0.858 s, System: 1.508 s]
Range (min … max): 2.328 s … 2.407 s 10 runs
Benchmark 2: git fetch --prune --atomic +refs/*:refs/* (HEAD)
Time (mean ± σ): 1.369 s ± 0.017 s [User: 0.715 s, System: 0.641 s]
Range (min … max): 1.346 s … 1.400 s 10 runs
Summary
'git fetch --prune --atomic +refs/*:refs/* (HEAD)' ran
1.73 ± 0.03 times faster than 'git fetch --prune --atomic +refs/*:refs/* (HEAD~)'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching references from a remote we by default also fetch all tags
which point into the history we have fetched. This is a separate step
performed after updating local references because it requires us to walk
over the history on the client-side to determine whether the remote has
announced any tags which point to one of the fetched commits.
This backfilling of tags isn't covered by the `--atomic` flag: right
now, it only applies to the step where we update our local references.
This is an oversight at the time the flag was introduced: its purpose is
to either update all references or none, but right now we happily update
local references even in the case where backfilling failed.
Fix this by pulling up creation of the reference transaction such that
we can pass the same transaction to both the code which updates local
references and to the code which backfills tags. This allows us to only
commit the transaction in case both actions succeed.
Note that we also have to start passing the transaction into
`find_non_local_tags()`: this function is responsible for finding all
tags which we need to backfill. Right now, it will happily return tags
which have already been updated with our local references. But when we
use a single transaction for both local references and backfilling then
it may happen that we try to queue the same reference update twice to
the transaction, which consequently triggers a bug. We thus have to skip
over any tags which have already been queued.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no way for a caller to see whether a reference update has
already been queued up for a given reference transaction. There are
multiple alternatives to provide this functionality:
- We may add a function that simply tells us whether a specific
reference has already been queued. If implemented naively then
this would potentially be quadratic in runtime behaviour if this
question is asked repeatedly because we have to iterate over all
references every time. The alternative would be to add a hashmap
of all queued reference updates to speed up the lookup, but this
adds overhead to all callers.
- We may add a flag to `ref_transaction_add_update()` that causes it
to skip duplicates, but this has the same runtime concerns as the
first alternative.
- We may add an interface which lets callers collect all updates
which have already been queued such that he can avoid re-adding
them. This is the most flexible approach and puts the burden on
the caller, but also allows us to not impact any of the existing
callsites which don't need this information.
This commit implements the last approach: it allows us to compute the
map of already-queued updates once up front such that we can then skip
all subsequent references which are already part of this map.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the backfilling of tags fails we do not report this error to the
caller, but only report it implicitly at a later point when reporting
updated references. This leaves callers unable to act upon the
information of whether the backfilling succeeded or not.
Refactor the function to return an error code and pass it up the
callstack. This causes us to correctly propagate the error back to the
user of git-fetch(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two different locations where we're appending to FETCH_HEAD:
first when storing updated references, and second when backfilling tags.
Both times we open the file, append to it and then commit it into place,
which is essentially duplicate work.
Improve the lifecycle of updating FETCH_HEAD by opening and committing
it once in `do_fetch()`, where we pass the structure down to the code
which wants to append to it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fetch code flow is a bit hard to understand right now:
1. We optionally prune all references which have vanished on the
remote side.
2. We fetch and update all other references locally.
3. We update the upstream branch in the gitconfig.
4. We backfill tags pointing into the history we have just fetched.
It is quite confusing that we fetch objects and update references in
both (2) and (4), which is further stressed by the point that we use a
`skip` goto label to jump from (3) to (4) in case we fail to update the
gitconfig as expected.
Reorder the code to first update all local references, and only after we
have done so update the upstream branch information. This improves the
code flow and furthermore makes it easier to refactor the way we update
references together.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using git-fetch(1) with the `--atomic` flag the expectation is that
either all of the references are updated, or alternatively none are in
case the fetch fails. While we already have tests for this, we do not
have any tests which exercise atomicity either when pruning deleted refs
or when backfilling tags. This gap in test coverage hides that we indeed
don't handle atomicity correctly for both of these cases.
Add test cases which cover these testing gaps to demonstrate the broken
behaviour. Note that tests are not marked as `test_expect_failure`: this
is done to explicitly demonstrate the current known-wrong behaviour, and
they will be fixed up as soon as we fix the underlying bugs.
While at it this commit also adds another test case which demonstrates
that backfilling of tags does not return an error code in case the
backfill fails. This bug will also be fixed by a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Removal of unused code and doc.
* js/no-more-legacy-stash:
stash: stop warning about the obsolete `stash.useBuiltin` config setting
stash: remove documentation for `stash.useBuiltin`
add: remove support for `git-legacy-stash`
git-sh-setup: remove remnant bits referring to `git-legacy-stash`
"git diff --diff-filter=aR" is now parsed correctly.
* js/diff-filter-negation-fix:
diff-filter: be more careful when looking for negative bits
diff.c: move the diff filter bits definitions up a bit
docs(diff): lose incorrect claim about `diff-files --diff-filter=A`
Interaction between fetch.negotiationAlgorithm and
feature.experimental configuration variables has been corrected.
* en/fetch-negotiation-default-fix:
repo-settings: rename the traditional default fetch.negotiationAlgorithm
repo-settings: fix error handling for unknown values
repo-settings: fix checking for fetch.negotiationAlgorithm=default
The build procedure has been taught to notice older version of zlib
and enable our replacement uncompress2() automatically.
* ab/auto-detect-zlib-compress2:
compat: auto-detect if zlib has uncompress2()
A bug that made multi-pack bitmap and the object order out-of-sync,
making the .midx data corrupt, has been fixed.
* tb/midx-bitmap-corruption-fix:
pack-bitmap.c: gracefully fallback after opening pack/MIDX
midx: read `RIDX` chunk when present
t/lib-bitmap.sh: parameterize tests over reverse index source
t5326: move tests to t/lib-bitmap.sh
t5326: extract `test_rev_exists`
t5326: drop unnecessary setup
pack-revindex.c: instrument loading on-disk reverse index
midx.c: make changing the preferred pack safe
t5326: demonstrate bitmap corruption after permutation
"git log --remerge-diff" shows the difference from mechanical merge
result and the result that is actually recorded in a merge commit.
* en/remerge-diff:
diff-merges: avoid history simplifications when diffing merges
merge-ort: mark conflict/warning messages from inner merges as omittable
show, log: include conflict/warning messages in --remerge-diff headers
diff: add ability to insert additional headers for paths
merge-ort: format messages slightly different for use in headers
merge-ort: mark a few more conflict messages as omittable
merge-ort: capture and print ll-merge warnings in our preferred fashion
ll-merge: make callers responsible for showing warnings
log: clean unneeded objects during `log --remerge-diff`
show, log: provide a --remerge-diff capability
Problems identified by Coverity in the reftable code have been
corrected.
* hn/reftable-coverity-fixes:
reftable: add print functions to the record types
reftable: make reftable_record a tagged union
reftable: remove outdated file reftable.c
reftable: implement record equality generically
reftable: make reftable-record.h function signatures const correct
reftable: handle null refnames in reftable_ref_record_equal
reftable: drop stray printf in readwrite_test
reftable: order unittests by complexity
reftable: all xxx_free() functions accept NULL arguments
reftable: fix resource warning
reftable: ignore remove() return value in stack_test.c
reftable: check reftable_stack_auto_compact() return value
reftable: fix resource leak blocksource.c
reftable: fix resource leak in block.c error path
reftable: fix OOB stack write in print functions
The command line completion (in contrib/) learns to complete
arguments to give to "git sparse-checkout" command.
* ld/sparse-index-bash-completion:
completion: handle unusual characters for sparse-checkout
completion: improve sparse-checkout cone mode directory completion
completion: address sparse-checkout issues
The "struct option" added in 4a28847839 (diff.c: prepare to use
parse_options() for parsing, 2019-01-27) would be free'd in the case
of diff_setup_done() being called.
But not all codepaths that allocate it reach that,
e.g. "t6427-diff3-conflict-markers.sh" will now free memory that it
didn't free before. By using FREE_AND_NULL() here (which
diff_setup_done() also does) we ensure that we free the memory, and
that we won't have double-free's.
Before this running:
./t6427-diff3-conflict-markers.sh -vixd --run=7
Would report:
SUMMARY: LeakSanitizer: 7823 byte(s) leaked in 6 allocation(s).
But now we'll report:
SUMMARY: LeakSanitizer: 703 byte(s) leaked in 5 allocation(s).
I.e. the largest leak in that particular test has now been addressed.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Have the diff_free() function call clear_pathspec(). Since the
diff_flush() function calls this all its callers can be simplified to
rely on it instead.
When I added the diff_free() function in e900d494dc (diff: add an API
for deferred freeing, 2021-02-11) I simply missed this, or wasn't
interested in it. Let's consolidate this now. This means that any
future callers (and I've got revision.c in mind) that embed a "struct
diff_options" can simply call diff_free() instead of needing know that
it has an embedded pathspec.
This does fix a bunch of leaks, but I can't mark any test here as
passing under the SANITIZE=leak testing mode because in
886e1084d7 (builtin/: add UNLEAKs, 2017-10-01) an UNLEAK(rev) was
added, which plasters over the memory
leak. E.g. "t4011-diff-symlink.sh" would report fewer leaks with this
fix, but because of the UNLEAK() reports none.
I'll eventually loop around to removing that UNLEAK(rev) annotation as
I'll fix deeper issues with the revisions API leaking. This is one
small step on the way there, a new freeing function in revisions.c
will want to call this diff_free().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Other users of xdiff such as libgit2 need to be able to handle
allocation failures. These allocation failures were previously
ignored.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the standard "goto out" pattern rather than repeating very similar
code after checking for each error. This will simplify the next commit
that starts handling allocation failures that are currently ignored.
On error xdl_do_diff() frees the environment so we need to take care
to avoid a double free in that case. xdl_build_script() does not
assign a result unless it is successful so there is no possibility of
a double free if it fails.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Other users of libxdiff such as libgit2 need to be able to handle
allocation failures. As NULL is a valid return value the function
signature is changed to be able report allocation failures.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although the patience and histogram algorithms initialize the
environment they do not free it if there is an error. In contrast for
the Myers algorithm the environment is initalized in xdl_do_diff() and
it is freed if there is an error. Fix this by always initializing the
environment in xdl_do_diff() and freeing it there if there is an
error. Remove the comment in do_patience_diff() about the environment
being freed by xdl_diff() as it is not accurate because (a) xdl_diff()
does not do that if there is an error and (b) xdl_diff() is not the
only caller.
Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in the parse_date_format() function by providing a
new date_mode_release() companion function.
By using this in "t/helper/test-date.c" we can mark the
"t0006-date.sh" test as passing when git is compiled with
SANITIZE=leak, and whitelist it to run under
"GIT_TEST_PASSING_SANITIZE_LEAK=true" by adding
"TEST_PASSES_SANITIZE_LEAK=true" to the test itself.
The other tests that expose this memory leak (i.e. take the
"mode->type == DATE_STRFTIME" branch in parse_date_format()) are
"t6300-for-each-ref.sh" and "t7004-tag.sh". The former is due to an
easily fixed leak in "ref-filter.c", and brings the failures in
"t6300-for-each-ref.sh" down from 51 to 48.
Fixing the remaining leaks will have to wait until there's a
release_revisions() in "revision.c", as they have to do with leaks via
"struct rev_info".
There is also a leak in "builtin/blame.c" due to its call to
parse_date_format() to parse the "blame.date" configuration. However
as it declares a file-level "static struct date_mode blame_date_mode"
to track the data, LSAN will not report it as a leak. It's possible to
get valgrind(1) to complain about it with e.g.:
valgrind --leak-check=full --show-leak-kinds=all ./git -P -c blame.date=format:%Y blame README.md
But let's focus on things LSAN complains about, and are thus
observable with "TEST_PASSES_SANITIZE_LEAK=true". We should get to
fixing memory leaks in "builtin/blame.c", but as doing so would
require some re-arrangement of cmd_blame() let's leave it for some
other time.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add basic API doc comments to date.h, and while doing so move the the
parse_date_format() function adjacent to show_date(). This way all the
"struct date_mode" functions are grouped together. Documenting the
rest is one of our #leftoverbits.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Provide and use a DATE_MODE_INIT macro. Most of the users of struct
date_mode" use it via pretty.h's "struct pretty_print_context" which
doesn't have an initialization macro, so we're still bound to being
initialized to "{ 0 }" by default.
But we can change the couple of callers that directly declared a
variable on the stack to instead use the initializer, and thus do away
with the "mode.local = 0" added in add00ba2de (date: make "local"
orthogonal to date format, 2015-09-03).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the declaration of the date.c functions from cache.h, and adjust
the relevant users to include the new date.h header.
The show_ident_date() function belonged in pretty.h (it's defined in
pretty.c), its two users outside of pretty.c didn't strictly need to
include pretty.h, as they get it indirectly, but let's add it to them
anyway.
Similarly, the change to "builtin/{fast-import,show-branch,tag}.c"
isn't needed as far as the compiler is concerned, but since they all
use the "DATE_MODE()" macro we now define in date.h, let's have them
include it.
We could simply include this new header in "cache.h", but as this
change shows these functions weren't common enough to warrant
including in it in the first place. By moving them out of cache.h
changes to this API will no longer cause a (mostly) full re-build of
the project when "make" is run.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There has never been a show_date_human() function on the "master"
branch in git.git. This declaration was added in b841d4ff43 (Add
`human` format to test-tool, 2019-01-28).
A look at the ML history reveals that it was leftover cruft from an
earlier version of that commit[1].
1. https://lore.kernel.org/git/20190118061805.19086-5-ischis2@cox.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify the parsing of "grep.patternType" and
"grep.extendedRegexp". This changes no behavior, but gets rid of
complex parsing logic that isn't needed anymore.
When "grep.patternType" was introduced in 84befcd0a4 (grep: add a
grep.patternType configuration setting, 2012-08-03) we promised that:
1. You can set "grep.patternType", and "[setting it to] 'default'
will return to the default matching behavior".
In that context "the default" meant whatever the configuration
system specified before that change, i.e. via grep.extendedRegexp.
2. We'd support the existing "grep.extendedRegexp" option, but ignore
it when the new "grep.patternType" option is set. We said we'd
only ignore the older "grep.extendedRegexp" option "when the
`grep.patternType` option is set to a value other than
'default'".
In a preceding commit we changed grep_config() to be called after
grep_init(), which means that much of the complexity here can go
away.
As before both "grep.patternType" and "grep.extendedRegexp" are
last-one-wins variable, with "grep.extendedRegexp" yielding to
"grep.patternType", except when "grep.patternType=default".
Note that as the previously added tests indicate this cannot be done
on-the-fly as we see the config variables, without introducing more
state keeping. I.e. if we see:
-c grep.extendedRegexp=false
-c grep.patternType=default
-c extendedRegexp=true
We need to select ERE, since grep.patternType=default unselects that
variable, which normally has higher precedence, but we also need to
select BRE in cases of:
-c grep.extendedRegexp=true \
-c grep.extendedRegexp=false
Which would not be the case for this, which select ERE:
-c grep.patternType=extended \
-c grep.extendedRegexp=false
Therefore we cannot do this on-the-fly in grep_config without also
introducing tracking variables for not only the pattern type, but what
the source of that pattern type was.
So we need to decide on the pattern after our config was fully
parsed. Let's do that by deferring the decision on the pattern type
until it's time to compile it in compile_regexp().
By that time we've not only parsed the config, but also handled the
command-line options. Those will set "opt.pattern_type_option" (*not*
"opt.extended_regexp_option"!).
At that point all we need to do is see if "grep.patternType" was
UNSPECIFIED in the end (including an explicit "=default"), if so we'll
use the "grep.extendedRegexp" configuration, if any.
See my 07a3d41173 (grep: remove regflags from the public grep_opt
API, 2017-06-29) for addition of the two comments being removed here,
i.e. the complexity noted in that commit is now going away.
1. https://lore.kernel.org/git/patch-v8-09.10-c211bb0c69d-20220118T155211Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code in compile_regexp() to check the cheaper boolean
"!opt->pcre2" condition before the "memchr()" search.
This doesn't noticeably optimize anything, but makes the code more
obvious and conventional. The line wrapping being added here also
makes a subsequent commit smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "pattern_type_option" member of "struct grep_opt" to use
the enum type we use for it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The grep_init() function used the odd pattern of initializing the
passed-in "struct grep_opt" with a statically defined "grep_defaults"
struct, which would be modified in-place when we invoked
grep_config().
So we effectively (b) initialized config, (a) then defaults, (c)
followed by user options. Usually those are ordered as "a", "b" and
"c" instead.
As the comments being removed here show the previous behavior needed
to be carefully explained as we'd potentially share the populated
configuration among different instances of grep_init(). In practice we
didn't do that, but now that it can't be a concern anymore let's
remove those comments.
This does not change the behavior of any of the configuration
variables or options. That would have been the case if we didn't move
around the grep_config() call in "builtin/log.c". But now that we call
"grep_config" after "git_log_config" and "git_format_config" we'll
need to pass in the already initialized "struct grep_opt *".
See 6ba9bb76e0 (grep: copy struct in one fell swoop, 2020-11-29) and
7687a0541e (grep: move the configuration parsing logic to grep.[ch],
2012-10-09) for the commits that added the comments.
The memcpy() pattern here will be optimized away and follows the
convention of other *_init() functions. See 5726a6b401 (*.c *_init():
define in terms of corresponding *_INIT macro, 2021-07-01).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change grep_cmd_config() to stop passing around the always-NULL "cb"
value. When this code was added in 7e8f59d577 (grep: color patterns
in output, 2009-03-07) it was non-NULL, but when that changed in
15fabd1bbd (builtin/grep.c: make configuration callback more
reusable, 2012-10-09) this code was left behind.
In a subsequent change I'll start using the "cb" value, this will make
it clear which functions we call need it, and which don't.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code in "builtin/grep.c" and "builtin/ls-tree.c" to trust the
"prefix" passed from "run_builtin()". The "prefix" we get from setup.c
is either going to be NULL or a string of length >0, never "".
So we can drop the "prefix && *prefix" checks added for
"builtin/grep.c" in 0d042fecf2 (git-grep: show pathnames relative to
the current directory, 2006-08-11), and for "builtin/ls-tree.c" in
a69dd585fc (ls-tree: chomp leading directories when run from a
subdirectory, 2005-12-23).
As seen in code in revision.c that was added in cd676a5136 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12) we already have existing code that does away with this
assertion.
This makes it easier to reason about a subsequent change to the
"prefix_length" code in grep.c in a subsequent commit, and since we're
going to the trouble of doing that let's leave behind an assert() to
promise this to any future callers.
For "builtin/grep.c" it would be painful to pass the "prefix" down the
callchain of:
cmd_grep -> grep_tree -> grep_submodule -> grep_cache -> grep_oid ->
grep_source_name
So for the code that needs it in grep_source_name() let's add a
"grep_prefix" variable similar to the existing "ls_tree_prefix".
While at it let's move the code in cmd_ls_tree() around so that we
assign to the "ls_tree_prefix" right after declaring the variables,
and stop assigning to "prefix". We only subsequently used that
variable later in the function after clobbering it. Let's just use our
own "grep_prefix" instead.
Let's also add an assert() in git.c, so that we'll make this promise
about the "prefix" to any current and future callers, as well as to
any readers of the code.
Code history:
* The strlen() in "grep.c" hasn't been used since 493b7a08d8 (grep:
accept relative paths outside current working directory, 2009-09-05).
When that code was added in 0d042fecf2 (git-grep: show pathnames
relative to the current directory, 2006-08-11) we used the length.
But since 493b7a08d8 we haven't used it for anything except a
boolean check that we could have done on the "prefix" member
itself.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the grep tests to assert that setting
"grep.patternType=extended" followed by "grep.patternType=default"
will behave as if "--basic-regexp" was provided, and not as
"--extended-regexp". In a subsequent commit we'll need to treat
"grep.patternType=default" as a special-case, but let's make sure we
ignore it if it's being set to "default" following an earlier
non-"default" "grep.patternType" setting.
Let's also test what happens when we have a sequence of "extended"
followed by "default" and "fixed". In that case the "fixed" should
prevail, as well as tests to check that a "grep.extendedRegexp=true"
followed by a "grep.extendedRegexp=false" behaves as though
"grep.extendedRegexp" wasn't provided.
See [1] for the source of some of these tests, and their
initial (pseudocode) implementation, and [2] for a later discussion
about a breakage due to missing testing (which had been noted in [1]
all along).
1. https://lore.kernel.org/git/xmqqv8zf6j86.fsf@gitster.g/
2. https://lore.kernel.org/git/xmqqpmoczwtu.fsf@gitster.g/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the repeated test code for finding out whether a given set of
configuration will pick basic, extended or fixed into a new
"test_pattern_type" helper function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the tests added in my 9df46763ef (log: add exhaustive tests
for pattern style options & config, 2017-05-20) to check not only
whether "git log" handles "grep.patternType", but also "git show"
etc.
It's sufficient to check whether we match a "fixed" or a "basic" regex
here to see if these codepaths correctly invoked grep_config(). We
don't need to check the details of their regular expression matching
as the "log" test does.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This "regex_t" in grep_opt has not been used since
f9b9faf6f8 (builtin-grep: allow more than one patterns., 2006-05-02),
we still use a "regex_t" for compiling regexes, but that's in the
"grep_pat" struct".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
34ae3b70 (name-rev: deprecate --stdin in favor of --annotate-stdin,
2022-01-05) added --annotate-stdin to replace --stdin as a clearer
flag name. Since --stdin is to be deprecated, we should replace
--stdin in the output from "git name-rev -h".
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stolee transitioned from Microsoft to GitHub in July 2020, but continued
to use <dstolee@microsoft.com> because it was a valid address. He also
used <stolee@gmail.com> to communicate with the mailing list since
writing plaintext emails is difficult in Outlook. However, recent issues
with GMail delaying mailing list messages created a need to change his
primary email address.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `eol` takes effect on text files only when the index has the
contents in LF line endings. Paths with contents in CRLF line
endings in the index may become dirty unless text=auto.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git fetch --prune" failed to prune the refs it wanted to
prune, the command issued error messages but exited with exit
status 0, which has been corrected.
* tg/fetch-prune-exit-code-fix:
fetch --prune: exit with error if pruning fails
Update the contributor-facing documents on proposed log messages.
* jc/doc-log-messages:
SubmittingPatches: explain why we care about log messages
CodingGuidelines: hint why we value clearly written log messages
SubmittingPatches: write problem statement in the log in the present tense
Pick a better random number generator and use it when we prepare
temporary filenames.
* bc/csprng-mktemps:
wrapper: use a CSPRNG to generate random file names
wrapper: add a helper to generate numbers from a CSPRNG
Doc and test update around the eol attribute.
* bc/clarify-eol-attr:
docs: correct documentation about eol attribute
t0027: add tests for eol without text in .gitattributes
The protocol keyword 'ready' isn't meant for translation. Pass it as
parameter instead of spell it in die() message (and potentially confuse
translators).
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's useful to be able to countermand a previous --graph option, for
example if `git log --graph` is run via an alias.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When updating references via git-fetch(1), then by default we report to
the user which references have been changed. This output is formatted in
a nice table such that the different columns are aligned. Because the
first column contains abbreviated object IDs we thus need to iterate
over all refs which have changed and compute the minimum length for
their respective abbreviated hashes. While this effort makes sense in
most cases, it is wasteful when the user passes the `--quiet` flag: we
don't print the summary, but still compute the length.
Skip computing the summary width when the user asked for us to be quiet.
This gives us a speedup of nearly 10% when doing a mirror-fetch in a
repository with thousands of references being updated:
Benchmark 1: git fetch --quiet +refs/*:refs/* (HEAD~)
Time (mean ± σ): 96.078 s ± 0.508 s [User: 91.378 s, System: 10.870 s]
Range (min … max): 95.449 s … 96.760 s 5 runs
Benchmark 2: git fetch --quiet +refs/*:refs/* (HEAD)
Time (mean ± σ): 88.214 s ± 0.192 s [User: 83.274 s, System: 10.978 s]
Range (min … max): 87.998 s … 88.446 s 5 runs
Summary
'git fetch --quiet +refs/*:refs/* (HEAD)' ran
1.09 ± 0.01 times faster than 'git fetch --quiet +refs/*:refs/* (HEAD~)'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During packfile negotiation we iterate over all refs announced by the
remote side to check whether their IDs refer to commits already known to
us. If a commit is known to us already, then its date is a potential
cutoff point for commits we have in common with the remote side.
There is potentially a lot of commits announced by the remote depending
on how many refs there are in the remote repository, and for every one
of them we need to search for it in our object database and, if found,
parse the corresponding object to find out whether it is a candidate for
the cutoff date. This can be sped up by trying to look up commits via
the commit-graph first, which is a lot more efficient.
Benchmarks in a repository with about 2,1 million refs and an up-to-date
commit-graph show an almost 20% speedup when mirror-fetching:
Benchmark 1: git fetch +refs/*:refs/* (v2.35.0)
Time (mean ± σ): 115.587 s ± 2.009 s [User: 109.874 s, System: 11.305 s]
Range (min … max): 113.584 s … 118.820 s 5 runs
Benchmark 2: git fetch +refs/*:refs/* (HEAD)
Time (mean ± σ): 96.859 s ± 0.624 s [User: 91.948 s, System: 10.980 s]
Range (min … max): 96.180 s … 97.875 s 5 runs
Summary
'git fetch +refs/*:refs/* (HEAD)' ran
1.19 ± 0.02 times faster than 'git fetch +refs/*:refs/* (v2.35.0)'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test fiddles with files under .git/logs to recreate a condition
that is unlikely to warrant special attention under reftable, as
reflog blocks are zlib compressed.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes the test compatible with reftable (it doesn't pass yet for
other reasons, unfortunately)
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have description on "per worktree ref", but "worktree" is not
described in the glossary. We do have "working tree", though.
Casually put, a "working tree" is what your editor and compiler
interacts with. "worktree" is a mechanism to allow one or more
"working tree"s to be attached to a repository and used to check out
different commits and branches independently, which includes not
just a "working tree" but also repository metadata like HEAD, the
index to support simultaneous use of them. Historically, we used
these terms interchangeably but we have been trying to use "working
tree" when we mean it, instead of "worktree".
Most of the existing references to "working tree" in the glossary do
refer primarily to the working tree portion, except for one that
said refs like HEAD and refs/bisect/* are per "working tree", but it
is more precise to say they are per "worktree".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `|` at line end already imples that the statement is not over.
So a `\` after that is redundant.
Signed-off-by: Jaydeep P Das <jaydeepjd.8914@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning a repo with a --filter and with --recurse-submodules
enabled, the partial clone filter only applies to the top-level repo.
This can lead to unexpected bandwidth and disk usage for projects which
include large submodules. For example, a user might wish to make a
partial clone of Gerrit and would run:
`git clone --recurse-submodules --filter=blob:5k https://gerrit.googlesource.com/gerrit`.
However, only the superproject would be a partial clone; all the
submodules would have all blobs downloaded regardless of their size.
With this change, the same filter can also be applied to submodules,
meaning the expected bandwidth and disk savings apply consistently.
To avoid changing default behavior, add a new clone flag,
`--also-filter-submodules`. When this is set along with `--filter` and
`--recurse-submodules`, the filter spec is passed along to git-submodule
and git-submodule--helper, such that submodule clones also have the
filter applied.
This applies the same filter to the superproject and all submodules.
Users who need to customize the filter per-submodule would need to clone
with `--no-recurse-submodules` and then manually initialize each
submodule with the proper filter.
Applying filters to submodules should be safe thanks to Jonathan Tan's
recent work [1, 2, 3] eliminating the use of alternates as a method of
accessing submodule objects, so any submodule object access now triggers
a lazy fetch from the submodule's promisor remote if the accessed object
is missing. This patch is a reworked version of [4], which was created
prior to Jonathan Tan's work.
[1]: 8721e2e (Merge branch 'jt/partial-clone-submodule-1', 2021-07-16)
[2]: 11e5d0a (Merge branch 'jt/grep-wo-submodule-odb-as-alternate',
2021-09-20)
[3]: 162a13b (Merge branch 'jt/no-abuse-alternate-odb-for-submodules',
2021-10-25)
[4]: https://lore.kernel.org/git/52bf9d45b8e2b72ff32aa773f2415bf7b2b86da2.1563322192.git.steadmon@google.com/
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the logic to compute alignment requirement for our mem-pool.
* jc/mem-pool-alignment:
mem-pool: don't assume uintmax_t is aligned enough for all types
Mark in various places in the code that the sparse index and the
split index features are mutually incompatible.
* js/sparse-vs-split-index:
split-index: it really is incompatible with the sparse index
t1091: disable split index
sparse-index: sparse index is disallowed when split index is active
Cloning from a repository that does not yet have any branches or
tags but has other refs resulted in a "remote transport reported
error", which has been corrected.
* jt/clone-not-quite-empty:
clone: support unusual remote ref configurations
"git sparse-checkout init" failed to write into $GIT_DIR/info
directory when the repository was created without one, which has
been corrected to auto-create it.
* jt/sparse-checkout-leading-dir-fix:
sparse-checkout: create leading directory
More "config-based hooks".
* ab/config-based-hooks-2:
run-command: remove old run_hook_{le,ve}() hook API
receive-pack: convert push-to-checkout hook to hook.h
read-cache: convert post-index-change to use hook.h
commit: convert {pre-commit,prepare-commit-msg} hook to hook.h
git-p4: use 'git hook' to run hooks
send-email: use 'git hook run' for 'sendemail-validate'
git hook run: add an --ignore-missing flag
hooks: convert worktree 'post-checkout' hook to hook library
hooks: convert non-worktree 'post-checkout' hook to hook library
merge: convert post-merge to use hook.h
am: convert applypatch-msg to use hook.h
rebase: convert pre-rebase to use hook.h
hook API: add a run_hooks_l() wrapper
am: convert {pre,post}-applypatch to use hook.h
gc: use hook library for pre-auto-gc hook
hook API: add a run_hooks() wrapper
hook: add 'run' subcommand
The code path that verifies signatures made with ssh were made to
work better on a system with CRLF line endings.
* fs/ssh-signing-crlf:
gpg-interface: trim CR from ssh-keygen
"git name-rev --stdin" does not behave like usual "--stdin" at
all. Start the process of renaming it to "--annotate-stdin".
* jc/name-rev-stdin:
name-rev.c: use strbuf_getline instead of limited size buffer
name-rev: deprecate --stdin in favor of --annotate-stdin
"git fetch --negotiate-only" is an internal command used by "git
push" to figure out which part of our history is missing from the
other side. It should never recurse into submodules even when
fetch.recursesubmodules configuration variable is set, nor it
should trigger "gc". The code has been tightened up to ensure it
only does common ancestry discovery and nothing else.
* gc/fetch-negotiate-only-early-return:
fetch: help translators by reusing the same message template
fetch --negotiate-only: do not update submodules
fetch: skip tasks related to fetching objects
fetch: use goto cleanup in cmd_fetch()
"git add -p" rewritten in C regressed hunk splitting in some cases,
which has been corrected.
* pw/add-p-hunk-split-fix:
builtin add -p: fix hunk splitting
t3701: clean up hunk splitting tests
We explain that revs come first before the pathspec among command
line arguments, but did not spell out that dashed options come
before other args, which has been corrected.
* tl/doc-cli-options-first:
git-cli.txt: clarify "options first and then args"
The conditional inclusion mechanism of configuration files using
"[includeIf <condition>]" learns to base its decision on the
URL of the remote repository the repository interacts with.
* jt/conditional-config-on-remote-url:
config: include file if remote URL matches a glob
config: make git_config_include() static
The merge-ort misbehaved when merge.renameLimit configuration is
set too low and failed to find all renames.
* en/merge-ort-restart-optim-fix:
merge-ort: avoid assuming all renames detected
When trying to write a MIDX, we already prevent the case where there
weren't any packs present, and thus we would have written an empty MIDX.
But there is another "empty" case, which is more interesting, and we
don't yet handle. If we try to write a MIDX which has at least one pack,
but those packs together don't contain any objects, we will encounter a
BUG() when trying to use the bitmap corresponding to that MIDX, like so:
$ git rev-parse HEAD | git pack-objects --revs --use-bitmap-index --stdout >/dev/null
BUG: pack-revindex.c:394: pack_pos_to_midx: out-of-bounds object at 0
(note that in the above reproduction, both `--use-bitmap-index` and
`--stdout` are important, since without the former we won't even both to
load the .bitmap, and without the latter we wont attempt pack reuse).
The problem occurs when we try to discover the identity of the
preferred pack to determine which range if any of existing packs we can
reuse verbatim. This path is: `reuse_packfile_objects()` ->
`reuse_partial_packfile_from_bitmap()` -> `midx_preferred_pack()`.
#4 0x000055555575401f in pack_pos_to_midx (m=0x555555997160, pos=0) at pack-revindex.c:394
#5 0x00005555557502c8 in midx_preferred_pack (bitmap_git=0x55555599c280) at pack-bitmap.c:1431
#6 0x000055555575036c in reuse_partial_packfile_from_bitmap (bitmap_git=0x55555599c280, packfile_out=0x5555559666b0 <reuse_packfile>,
entries=0x5555559666b8 <reuse_packfile_objects>, reuse_out=0x5555559666c0 <reuse_packfile_bitmap>) at pack-bitmap.c:1452
#7 0x00005555556041f6 in get_object_list_from_bitmap (revs=0x7fffffffcbf0) at builtin/pack-objects.c:3658
#8 0x000055555560465c in get_object_list (ac=2, av=0x555555997050) at builtin/pack-objects.c:3765
#9 0x0000555555605e4e in cmd_pack_objects (argc=0, argv=0x7fffffffe920, prefix=0x0) at builtin/pack-objects.c:4154
Since neither the .bitmap or MIDX stores the identity of the
preferred pack, we infer it by trying to load the first object in
pseudo-pack order, and then asking the MIDX which pack was chosen to
represent that object.
But this fails our bounds check, since there are zero objects in the
MIDX to begin with, which results in the BUG().
We could catch this more carefully in `midx_preferred_pack()`, but
signaling the absence of a preferred pack out to all of its callers is
somewhat awkward.
Instead, let's avoid writing a MIDX .bitmap without any objects
altogether. We catch this case in `write_midx_internal()`, and emit a
warning if the caller indicated they wanted to write a bitmap before
clearing out the relevant flags. If we somehow got to
write_midx_bitmap(), then we will call BUG(), but this should now be an
unreachable path.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the __gitcomp_directories method to de-quote and handle unusual
characters in directory names. Although this initially involved an attempt
to re-use the logic in __git_index_files, this method removed
subdirectories (e.g. folder1/0/ became folder1/), so instead new custom
logic was placed directly in the __gitcomp_directories method.
Note there are two tests for this new functionality - one for spaces and
accents and one for backslashes and tabs. The backslashes and tabs test
uses FUNNYNAMES to avoid running on Windows. This is because:
1. Backslashes are explicitly not allowed in Windows file paths.
2. Although tabs appear to be allowed when creating a file in a Windows
bash shell, they actually are not renderable (and appear as empty boxes
in the shell).
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Co-authored-by: Lessley Dennington <lessleydennington@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use new __gitcomp_directories method to complete directory names in cone
mode sparse-checkouts. This method addresses the caveat of poor
performance in monorepos from the previous commit (by completing only one
level of directories).
The unusual character caveat from the previous commit will be fixed by the
final commit in this series.
Co-authored-by: Elijah Newren <newren@gmail.com>
Co-authored-by: Lessley Dennington <lessleydennington@gmail.com>
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct multiple issues with tab completion of the git sparse-checkout
command. These issues were:
1. git sparse-checkout <TAB> previously resulted in an incomplete list of
subcommands (it was missing reapply and add).
2. Subcommand options were not tab-completable.
3. git sparse-checkout set <TAB> and git sparse-checkout add <TAB> showed
both file names and directory names. While this may be a less surprising
behavior for non-cone mode, cone mode sparse checkouts should complete
only directory names.
Note that while the new strategy of just using git ls-tree to complete on
directory names is simple and a step in the right direction, it does have
some caveats. These are:
1. Likelihood of poor performance in large monorepos (as a result of
recursively completing directory names).
2. Inability to handle paths containing unusual characters.
These caveats will be fixed by subsequent commits in this series.
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We just fixed a class of recently introduced bugs where calling, say,
`git fetch -h` outside a repository would not show the usage but instead
show an ugly `BUG` message.
Let's verify that this does not regress anymore.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we taught these commands about the sparse index, we did not account
for the fact that the `cmd_*()` functions _can_ be called without a
gitdir, namely when `-h` is passed to show the usage.
A plausible approach to address this is to move the
`prepare_repo_settings()` calls right after the `parse_options()` calls:
The latter will never return when it handles `-h`, and therefore it is
safe to assume that we have a `gitdir` at that point, as long as the
built-in is marked with the `RUN_SETUP` flag.
However, it is unfortunately not that simple. In `cmd_pack_objects()`,
for example, the repo settings need to be fully populated so that the
command-line options `--sparse`/`--no-sparse` can override them, not the
other way round.
Therefore, we choose to imitate the strategy taken in `cmd_diff()`,
where we simply do not bother to prepare and initialize the repo
settings unless we have a `gitdir`.
This fixes https://github.com/git-for-windows/git/issues/3688
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This method was created in f1de981e8 (config: fix leaks from
git_config_get_string_const(), 2020-08-14) but its only use was in the
repo_config_get_string_tmp() method, also declared in config.h and
implemented in config.c. Since this is otherwise unused and is a very
similar implementation to git_configset_get_value(), let's remove this
declaration.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When adding a new worktree, it is reasonable to expect that we want to
use the current set of sparse-checkout settings for that new worktree.
This is particularly important for repositories where the worktree would
become too large to be useful. This is even more important when using
partial clone as well, since we want to avoid downloading the missing
blobs for files that should not be written to the new worktree.
The only way to create such a worktree without this intermediate step of
expanding the full worktree is to copy the sparse-checkout patterns and
config settings during 'git worktree add'. Each worktree has its own
sparse-checkout patterns, and the default behavior when the
sparse-checkout file is missing is to include all paths at HEAD. Thus,
we need to have patterns from somewhere, they might as well be the
current worktree's patterns. These are then modified independently in
the future.
In addition to the sparse-checkout file, copy the worktree config file
if worktree config is enabled and the file exists. This will copy over
any important settings to ensure the new worktree behaves the same as
the current one. The only exception we must continue to make is that
core.bare and core.worktree should become unset in the worktree's config
file.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git sparse-checkout set/init` enables worktree-specific
configuration[*] by setting extensions.worktreeConfig=true, but neglects
to perform the additional necessary bookkeeping of relocating
`core.bare=true` and `core.worktree` from $GIT_COMMON_DIR/config to
$GIT_COMMON_DIR/config.worktree, as documented in git-worktree.txt. As a
result of this oversight, these settings, which are nonsensical for
secondary worktrees, can cause Git commands to incorrectly consider a
worktree bare (in the case of `core.bare`) or operate on the wrong
worktree (in the case of `core.worktree`). Fix this problem by taking
advantage of the recently-added init_worktree_config() which enables
`extensions.worktreeConfig` and takes care of necessary bookkeeping.
While at it, for backward-compatibility reasons, also stop upgrading the
repository format to "1" since doing so is (unintentionally) not
required to take advantage of `extensions.worktreeConfig`, as explained
by 11664196ac ("Revert "check_repository_format_gently(): refuse
extensions for old repositories"", 2020-07-15).
[*] The main reason to use worktree-specific config for the
sparse-checkout builtin was to avoid enabling sparse-checkout patterns
in one and causing a loss of files in another. If a worktree does not
have a sparse-checkout patterns file, then the sparse-checkout logic
will not kick in on that worktree.
Reported-by: Sean Allred <allred.sean@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some config settings, such as those for sparse-checkout, are likely
intended to only apply to one worktree at a time. To make this write
easier, add a new config API method, repo_config_set_worktree_gently().
This method will attempt to write to the worktree-specific config, but
will instead write to the common config file if worktree config is not
enabled. The next change will introduce a consumer of this method.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Upgrading a repository to use extensions.worktreeConfig is non-trivial.
There are several steps involved, including moving some config settings
from the common config file to the main worktree's config.worktree file.
The previous change updated the documentation with all of these details.
Commands such as 'git sparse-checkout set' upgrade the repository to use
extensions.worktreeConfig without following these steps, causing some
user pain in some special cases.
Create a helper method, init_worktree_config(), that will be used in a
later change to fix this behavior within 'git sparse-checkout set'. The
method is carefully documented in worktree.h.
Note that we do _not_ upgrade the repository format version to 1 during
this process. The worktree config extension must be considered by Git
and third-party tools even if core.repositoryFormatVersion is 0 for
historical reasons documented in 11664196ac ("Revert
"check_repository_format_gently(): refuse extensions for old
repositories"", 2020-07-15). This is a special case for this extension,
and newer extensions (such as extensions.objectFormat) still need to
upgrade the repository format version.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The extensions.worktreeConfig extension was added in 58b284a (worktree:
add per-worktree config files, 2018-10-21) and was somewhat documented
in Documentation/git-config.txt. However, the extensions.worktreeConfig
value was not specified further in the list of possible config keys. The
location of the config.worktree file is not specified, and there are
some precautions that should be mentioned clearly, but are only
mentioned in git-worktree.txt.
Expand the documentation to help users discover the complexities of
extensions.worktreeConfig by adding details and cross links in these
locations (relative to Documentation/):
- config/extensions.txt
- git-config.txt
- git-worktree.txt
The updates focus on items such as
* $GIT_DIR/config.worktree takes precedence over $GIT_COMMON_DIR/config.
* The core.worktree and core.bare=true settings are incorrect to have in
the common config file when extensions.worktreeConfig is enabled.
* The sparse-checkout settings core.sparseCheckout[Cone] are recommended
to be set in the worktree config.
As documented in 11664196ac ("Revert "check_repository_format_gently():
refuse extensions for old repositories"", 2020-07-15), this extension
must be considered regardless of the repository format version for
historical reasons.
A future change will update references to extensions.worktreeConfig
within git-sparse-checkout.txt, but a behavior change is needed before
making those updates.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in codepaths that use the "struct
transport_ls_refs_options" API. Since the introduction of the struct
in 39835409d1 (connect, transport: encapsulate arg in struct,
2021-02-05) the caller has been responsible for freeing it.
That commit in turn migrated code originally added in
402c47d939 (clone: send ref-prefixes when using protocol v2,
2018-07-20) and b4be74105f (ls-remote: pass ref prefixes when
requesting a remote's refs, 2018-03-15). Only some of those codepaths
were releasing the allocated resources of the struct, now all of them
will.
Mark the "t/t5511-refspec.sh" test as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target). Previously 24/47 tests would fail.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak that happened when the --path option was
provided. This leak has been with us ever since the option was added
in 3970243150 (add --path option to git hash-object, 2008-08-03).
We can now mark "t1007-hash-object.sh" as passing when git is compiled
with SANITIZE=leak. It'll now run in the the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git update-index --refresh" has been taught to deal better with
racy timestamps (just like "git status" already does).
* ms/update-index-racy:
update-index: refresh should rewrite index in case of racy timestamps
t7508: add tests capturing racy timestamp handling
t7508: fix bogus mtime verification
test-lib: introduce API for verifying file mtime
Assorted updates to "git cat-file", especially "-h".
* ab/cat-file:
cat-file: s/_/-/ in typo'd usage_msg_optf() message
cat-file: don't whitespace-pad "(...)" in SYNOPSIS and usage output
cat-file: use GET_OID_ONLY_TO_DIE in --(textconv|filters)
object-name.c: don't have GET_OID_ONLY_TO_DIE imply *_QUIETLY
cat-file: correct and improve usage information
cat-file: fix remaining usage bugs
cat-file: make --batch-all-objects a CMDMODE
cat-file: move "usage" variable to cmd_cat_file()
cat-file docs: fix SYNOPSIS and "-h" output
parse-options API: add a usage_msg_optf()
cat-file tests: test messaging on bad objects/paths
cat-file tests: test bad usage
Fix a hand-rolled alloca() imitation that may have violated
alignment requirement of data being sorted in compatibility
implementation of qsort_s() and stable qsort().
* jc/qsort-s-alignment-fix:
stable-qsort: avoid using potentially unaligned access
compat/qsort_s.c: avoid using potentially unaligned access
"git apply" (ab)used the util pointer of the string-list to keep
track of how each symbolic link needs to be handled, which has been
simplified by using strset.
* rs/apply-symlinks-use-strset:
apply: use strsets to track symlinks
Code clean-up.
* rs/grep-expr-cleanup:
grep: use grep_and_expr() in compile_pattern_and()
grep: extract grep_binexp() from grep_or_expr()
grep: use grep_not_expr() in compile_pattern_not()
grep: use grep_or_expr() in compile_pattern_or()
* jh/p4-spawning-external-commands-cleanup:
git-p4: don't print shell commands as python lists
git-p4: pass command arguments as lists instead of using shell
git-p4: don't select shell mode using the type of the command argument
"git pull --rebase" ignored the rebase.autostash configuration
variable when the remote history is a descendant of our history,
which has been corrected.
* pb/pull-rebase-autostash-fix:
pull --rebase: honor rebase.autostash when fast-forwarding
* add '<>' around arguments where missing
* convert plurals into '...' forms
This applies the style guide for documentation.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Reviewed-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the same message when an invalid value is passed to a command line
option or a configuration variable.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Find more incompatible options to factorize.
When more than two options are mutually exclusive, print the ones
which are actually on the command line.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Have this file added in 06ba9d03e3 (t0051: test GIT_TRACE to a
windows named pipe, 2018-09-11) use the same "skip_all" pattern as an
existing Windows-only test added in 0e218f91c2 (mingw: unset PERL5LIB
by default, 2018-10-30) uses.
This way TAP consumers like "prove" will show a nice summary when the
test is skipped. Instead of:
$ prove t0051-windows-named-pipe.sh
[...]
t0051-windows-named-pipe.sh .. ok
[...]
We will prominently show a "skipped" notice:
$ prove t0051-windows-named-pipe.sh
[...]
t0051-windows-named-pipe.sh ... skipped: skipping Windows-specific tests
[...]
This is because we are now making use of the right TAP-y way to
communicate this to the consumer. I.e. skipping the whole test file,
v.s. skipping individual tests (in this case there's only one test).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To improve the submodules UX, we would like to teach Git to handle
branches in submodules. Start this process by teaching "git branch" the
--recurse-submodules option so that "git branch --recurse-submodules
topic" will create the `topic` branch in the superproject and its
submodules.
Although this commit does not introduce breaking changes, it does not
work well with existing --recurse-submodules commands because "git
branch --recurse-submodules" writes to the submodule ref store, but most
commands only consider the superproject gitlink and ignore the submodule
ref store. For example, "git checkout --recurse-submodules" will check
out the commits in the superproject gitlinks (and put the submodules in
detached HEAD) instead of checking out the submodule branches.
Because of this, this commit introduces a new configuration value,
`submodule.propagateBranches`. The plan is for Git commands to
prioritize submodule ref store information over superproject gitlinks if
this value is true. Because "git branch --recurse-submodules" writes to
submodule ref stores, for the sake of clarity, it will not function
unless this configuration value is set.
This commit also includes changes that support working with submodules
from a superproject commit because "branch --recurse-submodules" (and
future commands) need to read .gitmodules and gitlinks from the
superproject commit, but submodules are typically read from the
filesystem's .gitmodules and the index's gitlinks. These changes are:
* add a submodules_of_tree() helper that gives the relevant
information of an in-tree submodule (e.g. path and oid) and
initializes the repository
* add is_tree_submodule_active() by adding a treeish_name parameter to
is_submodule_active()
* add the "submoduleNotUpdated" advice to advise users to update the
submodules in their trees
Incidentally, fix an incorrect usage string that combined the 'list'
usage of git branch (-l) with the 'create' usage; this string has been
incorrect since its inception, a8dfd5eac4 (Make builtin-branch.c use
parse_options., 2007-10-07).
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug that's been here since 7cc8f97108 (pack-objects: implement
bitmap writing, 2013-12-21), we did not call stop_progress() if we
reached the early exit in this function.
We could call stop_progress() before we return, but better yet is to
defer calling start_progress() until we need it. For now this only
matters in practice because we'd previously omit the "region_leave"
for the progress trace2 event.
Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug that's been with us ever since 98a1364740 (trace2: log
progress time and throughput, 2020-05-12), when the
stop_progress_msg() API was used we didn't log a "region_leave" for
the "region_enter" we start in "start_progress_delay()".
The only user of the "stop_progress_msg()" function is
"index-pack". Let's add a previously failing test to check that we
have the same number of "region_enter" and "region_leave" events, with
"-v" we'll log progress even in the test environment.
In addition to that we've had a submarine bug here introduced with
9d81ecb52b (progress: add sparse mode to force 100% complete message,
2019-03-21). The "start_sparse_progress()" API would only do the right
thing if the progress was ended with "stop_progress()", not
"stop_progress_msg()".
The only user of that API uses "stop_progress()", but let's still fix
that along with the trace2 issue by making "stop_progress()" a trivial
wrapper for "stop_progress_msg()".
We can also drop the "if (progress)" test from
"finish_if_sparse()". It's now a helper for the small
"stop_progress_msg()" function. We'll already have returned from it if
"progress" is "NULL".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create two new static helpers for the stop_progress() and
stop_progress_msg() functions.
As we'll see in the subsequent commit having those two split up
doesn't make much sense, and results in a bug in how we log to
trace2. This narrow preparatory change makes the diff for that
subsequent change smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 98a1364740 (trace2: log progress time and throughput,
2020-05-12) stop_progress() dereferences a "struct progress **"
parameter in several places. Extract a dereferenced variable to reduce
clutter and make it clearer who needs to write to this parameter.
Now instead of using "*p_progress" several times in stop_progress() we
check it once for NULL and then use a dereferenced "progress" variable
thereafter. This uses the same pattern as the adjacent
stop_progress_msg() function, see ac900fddb7 (progress: don't
dereference before checking for NULL, 2020-08-10).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix an inconsistency introduced in dc6a0757c4 (make struct progress
an opaque type, 2007-10-30) and rename the "progress" parameters to
stop_progress{,_msg}() to "p_progress". Now these match the
corresponding parameters in the *.c code.
While we're at it let's move the definition of the former below the
latter, a subsequent change will start defining stop_progress() in
terms of stop_progress_msg(). Let's also remove the excess whitespace
at the end of the file.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test what happens when we "stop" without a "start", omit the "stop"
after a "start", or start two concurrent progress bars. This
extends the trace2 tests added in 98a1364740 (trace2: log progress
time and throughput, 2020-05-12).
These tests are not merely testing the helper, but invalid API usage
that can happen if the progress.c API is misused.
The "without stop" test will leak under SANITIZE=leak, since this
buggy use of the API will leak memory. But let's not skip it entirely,
or use the "!SANITIZE_LEAK" prerequisite check as we'd do with tests
that we're skipping due to leaks we haven't fixed yet. Instead
annotate the specific command that should skip leak checking with
custom $LSAN_OPTIONS[1].
1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the usage of the "test-tool progress" introduced in
2bb74b53a4 (Test the progress display, 2019-09-16) to take command
like "start" and "stop" on stdin, instead of running them implicitly.
This makes for tests that are easier to read, since the recipe will
mirror the API usage, and allows for easily testing invalid usage that
would yield (or should yield) a BUG(), e.g. providing two "start"
calls in a row. A subsequent commit will add such tests.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we have braces on one arm of an if/else all of them should have it,
per the CodingGuidelines's "When there are multiple arms to a
conditional[...]" advice. This formatting change makes a subsequent
commit smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in the test-progress helper, and mark the
corresponding "t0500-progress-display.sh" test as being leak-free
under SANITIZE=leak. This fixes a leak added in 2bb74b53a4 (Test the
progress display, 2019-09-16).
My 48f68715b1 (tr2: stop leaking "thread_name" memory, 2021-08-27)
had fixed another memory leak in this test (as it did some trace2
testing).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The plain quoted exclamation mark renders as italics in the
Windows pdf help manual.
Fix this with back-tick quoting and surrounding double quotes
as exemplified by the gitignore.txt guide.
While at it, fix the surrounding double quotes for the other
special characters usages.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a GIT_COMPLETION_SHOW_ALL_COMMANDS=1 configuration setting to go
with the existing GIT_COMPLETION_SHOW_ALL=1 added in
c099f579b9 (completion: add GIT_COMPLETION_SHOW_ALL env var,
2020-08-19).
This will include plumbing commands such as "cat-file" in "git <TAB>"
and "git c<TAB>" completion. Without/with this I have 134 and 243
completion with git <TAB>, respectively.
It was already possible to do this by tweaking
GIT_TESTING_PORCELAIN_COMMAND_LIST= from the outside, that testing
variable was added in 84a9713106 (completion: let git provide the
completable command list, 2018-05-20). Doing this before loading
git-completion.bash worked:
export GIT_TESTING_PORCELAIN_COMMAND_LIST="$(git --list-cmds=builtins,main,list-mainporcelain,others,nohelpers,alias,list-complete,config)"
But such testing variables are not meant to be used from the outside,
and we make no guarantees that those internal won't change. So let's
expose this as a dedicated configuration knob.
It would be better to teach --list-cmds=* a new category which would
include all of these groups, but that's a larger change that we can
leave for some other time.
1. https://lore.kernel.org/git/CAGP6POJ9gwp+t-eP3TPkivBLLbNb2+qj=61Mehcj=1BgrVOSLA@mail.gmail.com/
Reported-by: Hongyi Zhao <hongyi.zhao@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change tests of git-completion.bash that re-source it to do so inside
a subshell. Re-sourcing it will clobber variables it sets, and in the
case of the "GIT_COMPLETION_SHOW_ALL=1" test added in
ca2d62b787 (parse-options: don't complete option aliases by default,
2021-07-16) change the behavior of the completion persistently.
Aside from the addition of "(" and ")" on new lines this is an
indentation-only change, only the "(" and ")" lines are changed under
"git diff -w".
So let's change that test, and for good measure do the same for the
three tests that precede it, which were added in
8b0eaa41f2 (completion: clear cached --options when sourcing the
completion script, 2018-03-22). The may not be wrong, but doing this
establishes a more reliable pattern for future tests, which might use
these as a template to copy.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As Documentation/CodingGuidelines says, our shell scripts
(including tests) are to use HT for indentation, but this script
uses 4-column indent with SP. Fix this.
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many invocations of the test_expect_success command in this
file are written in old style where the command, an optional
prerequisite, and the test title are written on separate
lines, and the executable script string begins on its own
line, and these lines are pasted together with backslashes
as necessary.
An invocation of the test_expect_success command in modern
test scripts however writes the prerequisite and the title
on the same line as the test_expect_success command itself,
and ends the line with a single quote that begins the
executable script string.
Update the style for uniformity.
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove mistaken right square brackets from "git-diff"
usage string. Make the usage string conform to "git-diff"
documentation (Documentation/git-diff.txt).
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Normally diffs will contain a hunk header of the format
"@@ -2,2 +2,15 @@ code". However when there is only 1 line of
change, the unified diff format allows for the second comma
separated value to be omitted in either before or after
line counts.
This can produce hunk headers that look like
"@@ -2 +2,18 @@ code" or "@@ -2,2 +2 @@ code".
As a result, scan_hunk_header mistakenly returns the line
number as line count, which then results in unpredictable
parsing errors with the rest of the patch, including giving
multiple lines of output for a single commit.
Fix by explicitly setting line count to 1 when there is
no comma, and add a test.
apply.c contains this same logic except it is correct. A
worthwhile future project might be to unify these two diff
parsers so they both benefit from fixes.
Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clean up the tests for patch-id by moving file preparation
tasks inside the test body and redirecting files directly into
stdin instead of using 'cat'.
Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Doing diffs for merges are special; they should typically avoid history
simplification. For example, with
git log --diff-merges=first-parent -- path
the default history simplification would remove merge commits from
consideration if the file "path" matched the second parent. That is
counter to what the user wants when looking for first-parent diffs.
Similar comments can be made for --diff-merges=separate (which diffs
against both parents) and --diff-merges=remerge (which diffs against a
remerge of the merge commit).
However, history simplification still makes sense if not doing diffing
merges, and it also makes sense for the combined and dense-combined
forms of diffing merges (because both of those are defined to only show
a diff when the merge result at the relevant paths differs from *both*
parents).
So, for separate, first-parent, and remerge styles of diff-merges, turn
off history simplification.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recursive merge involves merging the merge bases of the two branches
being merged. Such an inner merge can itself generate conflict notices.
While such notices may be useful when initially trying to create a
merge, they seem to just be noise when investigating merges later with
--remerge-diff. (Especially when both sides of the outer merge resolved
the conflict the same way leading to no overall conflict.) Remove them.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Conflicts such as modify/delete, rename/rename, or file/directory are
not representable via content conflict markers, and the normal output
messages notifying users about these were dropped with --remerge-diff.
While we don't want these messages randomly shown before the commit
and diff headers, we do want them to still be shown; include them as
part of the diff headers instead.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When additional headers are provided, we need to
* add diff_filepairs to diff_queued_diff for each paths in the
additional headers map which, unless that path is part of
another diff_filepair already found in diff_queued_diff
* format the headers (colorization, line_prefix for --graph)
* make sure the various codepaths that attempt to return early
if there are "no changes" take into account the headers that
need to be shown.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When users run
git show --remerge-diff $MERGE_COMMIT
or
git log -p --remerge-diff ...
stdout is not an appropriate location to dump conflict messages, but we
do want to provide them to users. We will include them in the diff
headers instead...but for that to work, we need for any multiline
messages to replace newlines with both a newline and a space. Add a new
flag to signal when we want these messages modified in such a fashion,
and use it in path_msg() to modify these messages this way. Also, allow
a special prefix to be specified for these headers.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
path_msg() has the ability to mark messages as omittable, designed for
remerge-diff where we'll instead be showing conflict messages as diff
headers for a subsequent diff. While all these messages are very useful
when trying to create a merge initially, early use with the
--remerge-diff feature (the only user of this omittable conflict message
capability), suggests that the particular messages marked in this commit
are just noise when trying to see what changes users made to create a
merge commit. Mark them as omittable.
Note that there were already a few messages marked as omittable in
merge-ort when doing a remerge-diff, because the development of
--remerge-diff preceded the upstreaming of merge-ort and I was trying to
ensure merge-ort could handle all the necessary requirements. See
commit c5a6f65527 ("merge-ort: add modify/delete handling and delayed
output processing", 2020-12-03) for the initial details. For some
examples of already-marked-as-omittable messages, see either
"Auto-merging <path>" or some of the submodule update hints. This
commit just adds two more messages that should also be omittable.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of immediately printing ll-merge warnings to stderr, we save
them in our output strbuf. Besides allowing us to move these warnings
to a special file for --remerge-diff, this has two other benefits for
regular merges done by merge-ort:
* The deferral of messages ensures we can print all messages about
any given path together (merge-recursive was known to sometimes
intersperse messages about other paths, particularly when renames
were involved).
* The deferral of messages means we can avoid printing spurious
conflict messages when we just end up aborting due to local user
modifications in the way. (In contrast to merge-recursive.c which
prematurely checks for local modifications in the way via
unpack_trees() and gets the check wrong both in terms of false
positives and false negatives relative to renames, merge-ort does
not perform the local modifications in the way check until the
checkout() step after the full merge has been computed.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since some callers may want to send warning messages to somewhere other
than stdout/stderr, stop printing "warning: Cannot merge binary files"
from ll-merge and instead modify the return status of ll_merge() to
indicate when a merge of binary files has occurred. Message printing
probably does not belong in a "low-level merge" anyway.
This commit continues printing the message as-is, just from the callers
instead of within ll_merge(). Future changes will start handling the
message differently in the merge-ort codepath.
There was one special case here: the callers in rerere.c do NOT check
for and print such a message; since those code paths explicitly skip
over binary files, there is no reason to check for a return status of
LL_MERGE_BINARY_CONFLICT or print the related message.
Note that my methodology included first modifying ll_merge() to return
a struct, so that the compiler would catch all the callers for me and
ensure I had modified all of them. After modifying all of them, I then
changed the struct to an enum.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --remerge-diff option will need to create new blobs and trees
representing the "automatic merge" state. If one is traversing a
long project history, one can easily get hundreds of thousands of
loose objects generated during `log --remerge-diff`. However, none of
those loose objects are needed after we have completed our diff
operation; they can be summarily deleted.
Add a new helper function to tmp_objdir to discard all the contained
objects, and call it after each merge is handled.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When this option is specified, we remerge all (two parent) merge commits
and diff the actual merge commit to the automatically created version,
in order to show how users removed conflict markers, resolved the
different conflict versions, and potentially added new changes outside
of conflict regions in order to resolve semantic merge problems (or,
possibly, just to hide other random changes).
This capability works by creating a temporary object directory and
marking it as the primary object store. This makes it so that any blobs
or trees created during the automatic merge are easily removable
afterwards by just deleting all objects from the temporary object
directory.
There are a few ways that this implementation is suboptimal:
* `log --remerge-diff` becomes slow, because the temporary object
directory can fill with many loose objects while running
* the log output can be muddied with misplaced "warning: cannot merge
binary files" messages, since ll-merge.c unconditionally writes those
messages to stderr while running instead of allowing callers to
manage them.
* important conflict and warning messages are simply dropped; thus for
conflicts like modify/delete or rename/rename or file/directory which
are not representable with content conflict markers, there may be no
way for a user of --remerge-diff to know that there had been a
conflict which was resolved (and which possibly motivated other
changes in the merge commit).
* when fixing the previous issue, note that some unimportant conflict
and warning messages might start being included. We should instead
make sure these remain dropped.
Subsequent commits will address these issues.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Give the traditional default fetch.negotiationAlgorithm the name
'consecutive'. Also allow a choice of 'default' to have Git decide
between the choices (currently, picking 'skipping' if
feature.experimental is true and 'consecutive' otherwise). Update the
documentation accordingly.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit af3a67de01 ("negotiator: unknown fetch.negotiationAlgorithm
should error out", 2018-08-01), error handling for an unknown
fetch.negotiationAlgorithm was added with the code die()ing. This was
also added to the documentation for the fetch.negotiationAlgorithm
option, to make it explicit that the code would die on unknown values.
This behavior was lost with commit aaf633c2ad ("repo-settings: create
feature.experimental setting", 2019-08-13). Restore it so that the
behavior again matches the documentation.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 3050b6dfc7 (repo-settings.c: simplify the setup,
2021-09-21), the branch for handling fetch.negotiationAlgorithm=default
was deleted. Since this value is documented in
Documentation/config/fetch.txt, restore the check for this value.
Note that this change caused an observable bug: if someone sets
feature.experimental=true in config, and then passes "-c
fetch.negotiationAlgorithm=default" on the command line in an attempt to
override the config, then the override is ignored. Fix the bug by not
ignoring the value of "default".
Technically, before commit 3050b6dfc7, repo-settings would treat any
fetch.negotiationAlgorithm value other than "skipping" or "noop" as a
request for "default", but I think it probably makes more sense to
ignore such broken requests and leave fetch.negotiationAlgorithm with
the default value rather than the value of "default". (If that sounds
confusing, note that "default" is usually the default value, but when
feature.experimental=true, "skipping" is the default value.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix misbehavior in Git.pm that dates back to the very first version of
the library in git.git added in b1edc53d06 (Introduce Git.pm (v4),
2006-06-24). When we fail to execute a command we shouldn't ignore all
signals, those can happen e.g. if abort() is called, or if the command
segfaults.
Because of this we'd consider e.g. a command that died due to LSAN
exiting with abort() successful, as is the case with the tests listed
as running successfully with SANITIZE=leak in 9081a421a6 (checkout:
fix "branch info" memory leaks, 2021-11-16). We did run them
successfully, but only because we ignored these errors.
This was then made worse by the use of "abort_on_error=1" for LSAN
added in 85b81b35ff (test-lib: set LSAN_OPTIONS to abort by default,
2017-09-05). Doing that makes sense, but without providing that option
we'd have a "$? >> 8" of "23" on failure, with abort_on_error=1 we'll
get "0".
All of our tests pass even without the SIGPIPE exception being added
here, but as the code appears to have been trying to ignore it let's
keep ignoring it for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing a hidden ref, e.g.:
$ git push origin HEAD:refs/hidden/foo
"receive-pack" will reject our request with an error message like this:
! [remote rejected] HEAD -> refs/hidden/foo (deny updating a hidden ref)
The remote side ("git-receive-pack") will not create the hidden ref as
expected, but the pack file sent by "git-send-pack" is left inside the
remote repository. I.e. the quarantine directory is not purged as it
should be.
Add a checkpoint before calling "tmp_objdir_migrate()" and after calling
the "pre-receive" hook to purge that temporary data in the quarantine
area when there is no command ready to run.
The reason we do not add the checkpoint before the "pre-receive" hook,
but after it, is that the "pre-receive" hook is called with a switch-off
"skip_broken" flag, and all commands, even broken ones, should be fed
by calling "feed_receive_hook()".
Add a new test case in t5516 as well.
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Helped-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Chen Bojun <bojun.cbj@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Consolidate the logic for deciding when to create a new branch in
cmd_branch(), and save the result for reuse. Besides making the function
more explicit, this allows us to validate options that can only be used
when creating a branch. Such an option does not exist yet, but one will
be introduced in a subsequent commit.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a dry_run parameter to create_branch() such that dry_run = 1 will
validate a new branch without trying to create it. This will be used in
`git branch --recurse-submodules` to ensure that the new branch can be
created in all submodules.
Signed-off-by: Glen Choo <chooglen@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the previous commit, there are no more invocations of
create_branch() that do not create a branch because:
* BRANCH_TRACK_OVERRIDE is no longer passed
* clobber_head_ok = true and force = false is never passed
Assert these situations, delete dead code and ensure that we're handling
clobber_head_ok and force correctly by introducing tests for `git branch
--force`. As a result, create_branch() now always creates a branch.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is preparation for a future commit that will simplify
create_branch() so that it always creates a branch. This will allow
create_branch() to accept a dry_run parameter (which is needed for "git
branch --recurse-submodules").
create_branch() used to always create a branch, but 4fc5006676 (Add
branch --set-upstream, 2010-01-18) changed it to also be able to set
tracking information without creating a branch.
Refactor the code that sets tracking information into its own functions
dwim_branch_start() and dwim_and_setup_tracking(). Also change an
invocation of create_branch() in cmd_branch() in builtin/branch.c to use
dwim_and_setup_tracking(), since that invocation is only for setting
tracking information (in "git branch --set-upstream-to").
As of this commit, create_branch() is no longer invoked in a way that
does not create branches.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `merge.ff` is set to `only` in .gitconfig, `git subtree pull` will
fail with error `fatal: Not possible to fast-forward, aborting.`, but
the command does want to make merges in these places. Add `--no-ff`
argument to `git merge` to enforce this behaviour.
Signed-off-by: Thomas Koutcher <thomas.koutcher@online.fr>
Reviewed-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests as REFFILES if they rely on packed refs. Use ref-store
helper to create bogus refs.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable backend doesn't support mere existence of reflogs.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Deleting a ref in reftable just records a (ObjectID => ZeroID)
transaction in the reflog. To ensure 'for_each_reflog()' test below
works, explictly delete reflogs for deleted refs.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pruning refs fails, we print an error to stderr, but still
exit 0 from 'git fetch'. Since this is a genuine error, fetch
should be exiting with some non-zero exit code. Make it so.
The --prune option was introduced in f360d844de ("builtin-fetch: add
--prune option", 2009-11-10). Unfortunately it's unclear from that
commit whether ignoring the exit code was an oversight or
intentional, but it feels like an oversight.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in 2.35 that roke the use of "rebase" and "stash"
in a secondary worktree.
* en/keep-cwd:
sequencer, stash: fix running from worktree subdir
The `git` executable has these two very useful options:
-C <directory>:
switch to the specified directory before performing any actions
-c <key>=<value>:
temporarily configure this setting for the duration of the
specified scalar subcommand
With this commit, we teach the `scalar` executable the same trick.
Note: It might look like a good idea to try to reuse the
`handle_options()` function in `git.c` instead of replicating only the
`-c`/`-C` part. However, that function is not only not in `libgit.a`, it
is also intricately entangled with the rest of the code in `git.c` that
is necessary e.g. to handle `--paginate`. Besides, no other option
handled by that `handle_options()` function is relevant to Scalar,
therefore the cost of refactoring vastly would outweigh the benefit.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error message when invoking a negotiate-only fetch without providing
any tips incorrectly refers to a --negotiate-tip=* argument. Fix this to
use the actual argument, --negotiation-tip=*.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These were introduced in commit 55dfcf9591 ("sparse-checkout: clear
tracked sparse dirs", 2021-09-08) and missed in my review at the time.
Plug the leaks.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--diff-filter=<bits>` option allows to filter the diff by certain
criteria, for example `R` to only show renamed files. It also supports
negating a filter via a down-cased letter, i.e. `r` to show _everything
but_ renamed files.
However, the code is a bit overzealous when trying to figure out whether
`git diff` should start with all diff-filters turned on because the user
provided a lower-case letter: if the `--diff-filter` argument starts
with an upper-case letter, we must not start with all bits turned on.
Even worse, it is possible to specify the diff filters in multiple,
separate options, e.g. `--diff-filter=AM [...] --diff-filter=m`.
Let's accumulate the include/exclude filters independently, and only
special-case the "only exclude filters were specified" case after
parsing the options altogether.
Note: The code replaced by this commit took pains to avoid setting any
unused bits of `options->filter`. That was unnecessary, though, as all
accesses happen via the `filter_bit_tst()` function using specific bits,
and setting the unused bits has no effect. Therefore, we can simplify
the code by using `~0` (or in this instance, `~<unwanted-bit>`).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This prepares for a more careful handling of the `--diff-filter`
options over the next few commits.
This commit is best viewed with `--color-moved`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, before we had `--intent-to-add`, there was no way that `git
diff-files` could see added files: if a file did not exist in the index,
`git diff-files` would not show it because it looks only at worktree
files when there is an index entry at the same path.
We used this example in the documentation of the diff options to explain
that not every `--diff-filter=<option>` has an effect in all scenarios.
Even when we added `--intent-to-add`, the comment was still correct,
because initially we showed such files as modified instead of added.
However, when that bug was fixed in feea6946a5 (diff-files: treat
"i-t-a" files as "not-in-index", 2020-06-20), the comment in the
documentation became incorrect.
Let's just remove it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 8a2cd3f512 (stash: remove the stash.useBuiltin setting, 2020-03-03),
we removed support for `stash.useBuiltin`, but left a warning in its
place.
After almost two years, and several major versions, it is time to remove
even that warning.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 8a2cd3f512 (stash: remove the stash.useBuiltin setting, 2020-03-03),
we removed the setting, and for a couple of major versions, we still
documented the setting, telling users that it is gone.
We can now safely remove even the documentation.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 90a6bb98d1 (legacy stash -p: respect the add.interactive.usebuiltin
setting, 2019-12-21), we added support to use the built-in `add -p` from
the scripted `stash -p`.
In 8a2cd3f512 (stash: remove the stash.useBuiltin setting, 2020-03-03),
we retired the scripted `stash` (including the scripted `stash -p`).
Therefore this support is no longer necessary.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 8a2cd3f512 (stash: remove the stash.useBuiltin setting, 2020-03-03),
we removed `git-legacy-stash.sh`. But `git-sh-setup.sh` somehow still
thinks about it. Let's just not.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the "describe your changes well" section to cover whom we are
trying to help by doing so in the first place.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We give a guidance for proposed log message to write problem
statement first, followed by the reasoning behind, and recipe for,
the solution. Clarify that we describe the situation _before_ the
proposed patch is applied in the present tense (not in the past
tense e.g. "we used to do X, but thanks to this commit we now do Y")
for consistency.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reduce the allocations done by show_ambiguous_object() by moving the
"desc" strbuf into the "struct ambiguous_output" introduced in the
preceding commit.
This doesn't matter for optimization purposes, but since we're
accumulating a "struct strbuf advice" anyway let's follow that pattern
and add a "struct strbuf sb", we can then strbuf_reset() it rather
than calling strbuf_release() for each call to
show_ambiguous_object().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "The candidates are" header that's shown for ambiguous
objects to be shown after we've iterated over all of the objects.
If we get any errors while doing so we don't want to split up the the
header and the list as a result. The two will now be printed together,
as shown in the updated testcase.
As we're accumulating the lines into as "struct strbuf" before
emitting them we need to add a trailing newline to the call in
show_ambiguous_object(). This and the change from "The candidates
are:" to "The candidates are:\n%s" helps to give translators more
context.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the ambiguous tag object output nicer in the case of tag objects
such as ebf3c04b26 (Git 2.32, 2021-06-06) by including the date in
the "tagger" header. I.e.:
$ git rev-parse b7e68
error: short object ID b7e68 is ambiguous
hint: The candidates are:
hint: b7e68c41d9 tag 2021-06-06 - v2.32.0
hint: b7e68ae18e0 commit 2019-12-23 - bisect: use the standard 'if (!var)' way to check for 0
hint: b7e68f6b413 tree
hint: b7e68490b97 blob
b7e68
[...]
Before this we'd emit a "tag" line without a date, e.g.:
hint: b7e68c41d9 tag v2.32.0
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the output of show_ambiguous_object() added in [1] and last
tweaked in [2] and the preceding commit to be more friendly to
translators.
By being able to customize the "<SP><SP>%s\n" format we're even ready
for RTL languages, who'd presumably like to change that to
"%s<SP><SP>\n".
In the case of the existing "tag [tag could not be parsed]" output
we'll now instead emit "[bad tag, could not parse it]". This is
consistent with the "[bad object]" output. Rephrasing the message like
this is possible because we're not unconditionally adding the
type_name() at the beginning.
1. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
2016-09-26)
2. 5cc044e025 (get_short_oid: sort ambiguous objects by type,
then SHA-1, 2018-05-10)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow-up the handling of OBJ_BAD in the preceding commit and
explicitly handle those cases where parse_tag() fails, or we don't end
up with a non-NULL pointer in in tag->tag.
If we run into such a tag we'd previously be silent about it. We
really should also be handling these batter in parse_tag_buffer() by
being more eager to emit an error(), instead of silently aborting with
"return -1;".
One example of such a tag is the one that's tested for in
"t3800-mktag.sh", where the code takes the "size <
the_hash_algo->hexsz + 24" branch.
But in lieu of earlier missing "error" output let's show the user
something to indicate why we're not showing a tag message in these
cases, now instead of showing:
hint: deadbeef tag
We'll instead display:
hint: deadbeef tag [tag could not be parsed]
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend the "unknown type" handling in the code that displays the
ambiguous object list to assert() that we're either going to get the
"real" object types we can pass to type_name(), or a -1 (OBJ_BAD)
return value from oid_object_info().
See [1] for the current output, and [1] for the commit that added the
"unknown type" handling.
We are never going to get an "unknown type" in the sense of custom
types crafted with "hash-object --literally", since we're not using
the OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag.
If we manage to otherwise unpack such an object without errors we'll
die() in parse_loose_header_extended() called by sort_ambiguous()
before we get to show_ambiguous_object(), as is asserted by the test
added in the preceding commit.
So saying "unknown type" here was always misleading, we really meant
to say that we had a failure parsing the object at all, i.e. that we
had repository corruption. If the problem is only that it's type is
unknown we won't reach this code.
So let's emit a generic "[bad object]" instead. As our tests added in
the preceding commit show, we'll have emitted various "error" output
already in those cases.
We should do better in the truly "unknown type" cases, which we'd need
to handle if we were passing down the OBJECT_INFO_ALLOW_UNKNOWN_TYPE
flag. But let's leave that for some future improvement. In a
subsequent commit I'll improve the output we do show, and not having
to handle the "unknown type" (as in OBJECT_INFO_ALLOW_UNKNOWN_TYPE)
simplifies that change.
1. 5cc044e025 (get_short_oid: sort ambiguous objects by type,
then SHA-1, 2018-05-10)
2. 1ffa26c461 (get_short_sha1: list ambiguous objects on error,
2016-09-26)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the tests for ambiguous objects to check how we handle objects
where we return OBJ_BAD when trying to parse them. As noted in [1] we
have a blindspot when it comes to this behavior.
Since we need to add new test data here let's extend these tests to be
tested under SHA-256, in d7a2fc8249 (t1512: skip test if not using
SHA-1, 2018-05-13) all of the existing tests were skipped, as they
rely on specific SHA-1 object IDs.
For these tests it only matters that the first 4 characters of the OID
prefix are the same for both SHA-1 and SHA-256. This uses strings that
I mined, and have the same prefix when hashed with both.
We "test_cmp" the full output to guard against any future regressions,
and because a subsequent commit will tweak it. Showing a diff of how
the output changes is helpful to explain those subsequent commits.
The "sed" invocation in test_cmp_failed_rev_parse() doesn't need a
"/g" because under both SHA-1 and SHA-256 we'll wildcard match any
trailing part of the OID after our known starting prefix. We'd like to
convert all of that to just "..." for the "test_cmp" which follows.
1. https://lore.kernel.org/git/YZwbphPpfGk78w2f@coredump.intra.peff.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When opening a MIDX/pack-bitmap, we call open_midx_bitmap_1() or
open_pack_bitmap_1() respectively in a loop over the set of MIDXs/packs.
By design, these functions are supposed to be called over every pack and
MIDX, since only one of them should have a valid bitmap.
Ordinarily we return '0' from these two functions in order to indicate
that we successfully loaded a bitmap To signal that we couldn't load a
bitmap corresponding to the MIDX/pack (either because one doesn't exist,
or because there was an error with loading it), we can return '-1'. In
either case, the callers each enumerate all MIDXs/packs to ensure that
at most one bitmap per-kind is present.
But when we fail to load a bitmap that does exist (for example, loading
a MIDX bitmap without finding a corresponding reverse index), we'll
return -1 but leave the 'midx' field non-NULL. So when we fallback to
loading a pack bitmap, we'll complain that the bitmap we're trying to
populate already is "opened", even though it isn't.
Rectify this by setting the '->pack' and '->midx' field back to NULL as
appropriate. Two tests are added: one to ensure that the MIDX-to-pack
bitmap fallback works, and another to ensure we still complain when
there are multiple pack bitmaps in a repository.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a MIDX contains the new `RIDX` chunk, ensure that the reverse index
is read from it instead of the on-disk .rev file. Since we need to
encode the object order in the MIDX itself for correctness reasons,
there is no point in storing the same data again outside of the MIDX.
So, this patch stops writing separate .rev files, and reads it out of
the MIDX itself. This is possible to do with relatively little new code,
since the format of the RIDX chunk is identical to the data in the .rev
file. In other words, we can implement this by pointing the
`revindex_data` field at the reverse index chunk of the MIDX instead of
the .rev file without any other changes.
Note that we have two knobs that are adjusted for the new tests:
GIT_TEST_MIDX_WRITE_REV and GIT_TEST_MIDX_READ_RIDX. The former controls
whether the MIDX .rev is written at all, and the latter controls whether
we read the MIDX's RIDX chunk.
Both are necessary to ensure that the test added at the beginning of
this series continues to work. This is because we always need to write
the RIDX chunk in the MIDX in order to change its checksum, but we want
to make sure reading the existing .rev file still works (since the RIDX
chunk takes precedence by default).
Arguably this isn't a very interesting mode to test, because the
precedence rules mean that we'll always read the RIDX chunk over the
.rev file. But it makes it impossible for a user to induce corruption in
their repository by adjusting the test knobs (since if we had an
either/or knob they could stop writing the RIDX chunk, allowing them to
tweak the MIDX's object order without changing its checksum).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To prepare for reading the reverse index data out of the MIDX itself,
teach the `test_rev_exists` function to take an expected "source" for
the reverse index data.
When given "rev", it asserts that the MIDX's `.rev` file exists, and is
loaded when verifying the integrity of its bitmaps. Otherwise, it
ensures that trace2 reports the source of the reverse index data as the
same string which was given to test_rev_exists().
The following patch will implement reading the reverse index data from
the MIDX itself.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t5326, we have a handful of tests that we would like to run twice:
once using the MIDX's new `RIDX` chunk as the source of the
reverse-index cache, and once using the separate `.rev` file.
But because these tests mutate the state of the underlying repository,
and then make assumptions about those mutations occurring in a certain
sequence, simply running the tests twice in the same repository is
awkward.
Instead, extract the core of interesting tests into t/lib-bitmap.sh to
prepare for them to be run twice, each in a separate test script. This
means that they can each operate on a separate repository, removing any
concerns about mutating state.
For now, this patch is a strict cut-and-paste of some tests from t5326.
The tests which did not move are not interesting with respect to the
source of their reverse index data.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To determine which source of data is used for the MIDX's reverse index
cache, introduce a helper which forces loading the reverse index, and
then looks for the special trace2 event introduced in a previous commit.
For now, this helper just looks for when the legacy MIDX .rev file was
loaded, but in a subsequent commit will become parameterized over the
the reverse index's source.
This function replaces checking for the existence of the .rev file. We
could write a similar helper to ensure that the .rev file is cleaned up
after repacking, but it will make subsequent tests more difficult to
write, and provides marginal value since we already check that the MIDX
.bitmap file is removed.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core.multiPackIndex config became true by default back in 18e449f86b
(midx: enable core.multiPackIndex by default, 2020-09-25), so it is no
longer necessary to enable it explicitly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a subsequent commit, we'll use the MIDX's new 'RIDX' chunk as a
source for the reverse index's data. But it will be useful for tests to
be able to determine whether the reverse index was loaded from the
separate .rev file, or from a chunk within the MIDX.
To instrument this, add a trace2 event which the tests can look for in
order to determine the reverse index's source.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous patch demonstrates a bug where a MIDX's auxiliary object
order can become out of sync with a MIDX bitmap.
This is because of two confounding factors:
- First, the object order is stored in a file which is named according
to the multi-pack index's checksum, and the MIDX does not store the
object order. This means that the object order can change without
altering the checksum.
- But the .rev file is moved into place with finalize_object_file(),
which link(2)'s the file into place instead of renaming it. For us,
that means that a modified .rev file will not be moved into place if
MIDX's checksum was unchanged.
This fix is to force the MIDX's checksum to change when the preferred
pack changes but the set of packs contained in the MIDX does not. In
other words, when the object order changes, the MIDX's checksum needs to
change with it (regardless of whether the MIDX is tracking the same or
different packs).
This prevents a race whereby changing the object order (but not the
packs themselves) enables a reader to see the new .rev file with the old
MIDX, or similarly seeing the new bitmap with the old object order.
But why can't we just stop hardlinking the .rev into place instead
adding additional data to the MIDX? Suppose that's what we did. Then
when we go to generate the new bitmap, we'll load the old MIDX bitmap,
along with the MIDX that it references. That's fine, since the new MIDX
isn't moved into place until after the new bitmap is generated. But the
new object order *has* been moved into place. So we'll read the old
bitmaps in the new order when generating the new bitmap file, meaning
that without this secondary change, bitmap generation itself would
become a victim of the race described here.
This can all be prevented by forcing the MIDX's checksum to change when
the object order does. By embedding the entire object order into the
MIDX, we do just that. That is, the MIDX's checksum will change in
response to any perturbation of the underlying object order. In t5326,
this will cause the MIDX's checksum to update (even without changing the
set of packs in the MIDX), preventing the stale read problem.
Note that this makes it safe to continue to link(2) the MIDX .rev file
into place, since it is now impossible to have a .rev file that is
out-of-sync with the MIDX whose checksum it references. (But we will do
away with MIDX .rev files later in this series anyway, so this is
somewhat of a moot point).
In theory, it is possible to store a "fingerprint" of the full object
order here, so long as that fingerprint changes at least as often as the
full object order does. Some possibilities here include storing the
identity of the preferred pack, along with the mtimes of the
non-preferred packs in a consistent order. But storing a limited part of
the information makes it difficult to reason about whether or not there
are gaps between the two that would cause us to get bitten by this bug
again.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch demonstrates a cause of bitmap corruption that can occur when
the contents of the multi-pack index does not change, but the underlying
object order does.
In this example, we have a MIDX containing two packs, each with a
distinct set of objects (pack A corresponds to the tree, blob, and
commit from the first patch, and pack B corresponds to the second
patch).
First, a MIDX is written where the 'A' pack is preferred. As expected,
the bitmaps generated there are in-tact. But then, we generate an
identical MIDX with a different object order: this time preferring pack
'B'.
Due to a bug which will be explained and fixed in the following commit,
the MIDX is updated, but the .rev file is not, causing the .bitmap file
to be read incorrectly. Specifically, the .bitmap file will contain
correct data, but the auxiliary object order in the .rev file is stale,
causing readers to get confused by reading the new bitmaps using the old
object order.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in 2.35 that roke the use of "rebase" and "stash"
in a secondary worktree.
* en/keep-cwd:
sequencer, stash: fix running from worktree subdir
The tree is not open for new development yet, but let's mark the
beginning of the new cycle before we start merging down regression
fix topics.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an apostrophe to "signatures" to indicate the possessive
relationship in "the signature's creation".
Signed-off-by: Greg Hurrell <greg@hurrell.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Specifically, replace the tab between "the" and "first" with a space.
Signed-off-by: Greg Hurrell <greg@hurrell.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the now-unused "failure_errno" parameter from the
refs_resolve_ref_unsafe() signature. In my recent 96f6623ada (Merge
branch 'ab/refs-errno-cleanup', 2021-11-29) series we made all of its
callers explicitly request the errno via an output parameter.
As that series shows all but one caller ended up passing in a
boilerplate "ignore_errno", since they only cared about whether the
return value was NULL or not, i.e. if the ref could be resolved.
There was one small issue with that series fixed with a follow-up in
31e3912369 (Merge branch 'ab/refs-errno-cleanup', 2022-01-14) a small
bug in that series was fixed.
After those two there was one caller left in sequencer.c that used the
"failure_errno', but as of the preceding commit it uses a boilerplate
"ignore_errno" instead.
This leaves the public refs API without any use of "failure_errno" at
all. We could still do with a bit of cleanup and generalization
between refs.c and refs/files-backend.c before the "reftable"
integration lands, but that's all internal to the reference code
itself.
So let's remove this output parameter. Not only isn't it used now, but
it's unlikely that we'll want it again in the future. We'd like to
slowly move the refs API to a more file-backend independent way of
communicating error codes, having it use a "failure_errno" was only
the first step in that direction. If this or any other function needs
to communicate what specifically is wrong with the requested "refname"
it'll be better to have the function set some output enum of
well-defined error states than piggy-backend on "errno".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code that was faithfully migrated to the new "resolve_errno"
API in ed90f04155 (refs API: make resolve_ref_unsafe() not set errno,
2021-10-16) to stop caring about the errno at all.
When we fail to resolve "HEAD" after the sequencer runs it doesn't
really help to say what the "errno" value is, since the fake backend
errno may or may not reflect anything real about the state of the
".git/HEAD". With the upcoming reftable backend this fakery will
become even more pronounced.
So let's just die() instead of die_errno() here. This will also help
simplify the refs_resolve_ref_unsafe() API. This was the only user of
it that wasn't ignoring the "failure_errno" output parameter.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that reset_head() can handle the initial checkout of onto
correctly use it in the "merge" backend instead of forking "git
checkout". This opens the way for us to stop calling the
post-checkout hook in the future. Not running "git checkout" means
that "rebase -i/m" no longer recurse submodules when checking out
"onto" (thanks to Philippe Blain for pointing this out). As the rest
of rebase does not know what to do with submodules this is probably a
good thing. When using merge-ort rebase ought be able to handle
submodules correctly if it parsed the submodule config, such a change
is left for a future patch series.
The "apply" based rebase has avoided forking git checkout
since ac7f467fef ("builtin/rebase: support running "git rebase
<upstream>"", 2018-08-07). The code that handles the checkout was
moved into libgit by b309a97108 ("reset: extract reset_head() from
rebase", 2020-04-07).
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the start of a rebase, ORIG_HEAD is updated to the tip of the
branch being rebased. Unfortunately reset_head() always uses the
current value of HEAD for this which is incorrect if the rebase is
started with "git rebase <upstream> <branch>" as in that case
ORIG_HEAD should be updated to <branch>. This only affects the "apply"
backend as the "merge" backend does not yet use reset_head() for the
initial checkout. Fix this by passing in orig_head when calling
reset_head() and add some regression tests.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
move_to_original_branch() passes the message intended for the branch
reflog as `orig_head_msg`. Fix this by adding a `branch_msg` member to
struct reset_head_opts and add a regression test. Note that these
reflog messages do not respect GIT_REFLOG_ACTION. They are not alone
in that and will be fixed in a future series.
The "merge" backend already has tests that check both the branch and
HEAD reflogs.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function takes a confusingly large number of parameters which
makes it difficult to remember which order to pass them in. The
following commits will add a couple more parameters which makes the
problem worse. To address this change the function to take a struct of
options. Using a struct means that it is no longer necessary to
remember which order to pass the parameters in and anyone reading the
code can easily see which value is passed to each parameter.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If ORIG_HEAD is not set by passing RESET_ORIG_HEAD then there is no
need to pass anything for reflog_orig_head. In addition to the callers
fixed in this commit move_to_original_branch() also passes
reflog_orig_head without setting ORIG_HEAD. That caller is mistakenly
passing the message it wants to put in the branch reflog which is not
currently possible so we delay fixing that caller until we can pass
the message as the branch reflog.
A later commit will make it a BUG() to pass reflog_orig_head without
RESET_ORIG_HEAD, that changes cannot be done here as it needs to wait
for move_to_original_branch() to be fixed first.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default_reflog parameter of create_autostash() is passed to
reset_head(). However as creating a stash does not involve updating
any refs the parameter is not used by reset_head(). Removing the
parameter from create_autostash() simplifies the callers.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This parameter is only needed when a ref is going to be updated and
the caller does not pass an explicit reflog message. Callers that are
only discarding uncommitted changes in the working tree such as such
as "rebase --skip" or create_autostash() do not update any refs so
should not have to worry about passing this parameter.
This change is not intended to have any user visible changes. The
pointer comparison between `oid` and `&head_oid` checks that the
caller did not pass an oid to be checked out. As no callers pass
RESET_HEAD_RUN_POST_CHECKOUT_HOOK without passing an oid there are
no changes to when the post-checkout hook is run. As update_ref() only
updates the ref if the oid passed to it differs from the current ref
there are no changes to when HEAD is updated.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next commit we will stop trying to update HEAD when we are
removing uncommitted changes from the working tree. Move the code that
updates the refs to its own function in preparation for that.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only use of the action parameter is to setup the error messages
for unpack_trees(). All but two cases pass either "checkout" or
"reset". The case that passes "reset --hard" would be better passing
"reset" so that the error messages match the builtin reset command
like all the other callers that are doing a reset. The case that
passes "Fast-forwarded" is only updating HEAD and so the parameter is
unused in that case as it does not call unpack_trees(). The value to
pass to setup_unpack_trees_porcelain() can be determined by checking
whether flags contains RESET_HEAD_HARD without the caller having to
specify it.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hook should only be run if the worktree and refs were successfully
updated. This primarily affects "rebase --apply" but also "rebase
--merge" when it fast-forwards.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If "git rebase [--apply|--merge] <upstream> <branch>" detects that
<upstream> is an ancestor of <branch> then it will fast-forward and
checkout <branch>. Normally a checkout or picking a commit during a
rebase will refuse to overwrite untracked files, however rebase does
overwrite untracked files when checking out <branch>.
The fix is to only set reset in `unpack_tree_opts` if flags contains
`RESET_HEAD_HARD`. t5403 may seem like an odd home for the new test
but it will be extended in the next commit to check that the
post-checkout hook is not run when the checkout fails.
The test for `!detach_head` dates back to the
original implementation of reset_head() in
ac7f467fef ("builtin/rebase: support running "git rebase <upstream>"",
2018-08-07) and was correct until e65123a71d
("builtin rebase: support `git rebase <upstream> <switch-to>`",
2018-09-04) started using reset_head() to checkout <switch-to> when
fast-forwarding.
Note that 480d3d6bf9 ("Change unpack_trees' 'reset' flag into an
enum", 2021-09-27) also fixes this bug as it changes reset_head() to
never remove untracked files. I think this fix is still worthwhile as
it makes it clear that the same settings are used for detached and
non-detached checkouts.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a rebase started with "rebase [--apply|--merge] <upstream> <branch>"
detects that <upstream> is an ancestor of <branch> then it fast-forwards
and checks out <branch>. Unfortunately in that case it passed the null
oid as the first argument to the post-checkout hook rather than the oid
of HEAD.
A side effect of this change is that the call to update_ref() which
updates HEAD now always receives the old value of HEAD. This provides
protection against another process updating HEAD during the checkout.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests only test the default backend and do not check that the
arguments passed to the hook are correct. Fix this by running the
tests with both backends and adding checks for the hook arguments.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code is heavily indented and it will be convenient later in the
series to have it in its own function.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commits bc3ae46b42 ("rebase: do not attempt to remove
startup_info->original_cwd", 2021-12-09) and 0fce211ccc ("stash: do not
attempt to remove startup_info->original_cwd", 2021-12-09), we wanted to
allow the subprocess to know which directory the parent process was
running from, so that the subprocess could protect it. However...
When run from a non-main worktree, setup_git_directory() will note
that the discovered git directory
(/PATH/TO/.git/worktree/non-main-worktree) does not match
DEFAULT_GIT_DIR_ENVIRONMENT (see setup_discovered_git_dir()), and
decide to set GIT_DIR in the environment. This matters because...
Whenever git is run with the GIT_DIR environment variable set, and
GIT_WORK_TREE not set, it presumes that '.' is the working tree. So...
This combination results in the subcommand being very confused about
the working tree. Fix it by also setting the GIT_WORK_TREE environment
variable along with setting cmd.dir.
A possibly more involved fix we could consider for later would be to
make setup.c set GIT_WORK_TREE whenever (a) it discovers both the git
directory and the working tree and (b) it decides to set GIT_DIR in the
environment. I did not attempt that here as such would be too big of a
change for a 2.35.1 release.
Test-case-by: Glen Choo <chooglen@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning a branchless and tagless but not refless remote using
protocol v0 or v1, Git calls transport_fetch_refs() with an empty ref
list. This makes the clone fail with the message "remote transport
reported error".
Git should have refrained from calling transport_fetch_refs(), just like
it does in the case that the remote is refless. Therefore, teach Git to
do this.
In protocol v2, this does not happen because the client passes
ref-prefix arguments that filter out non-branches and non-tags in the
ref advertisement, making the remote appear empty.
Note that this bug concerns logic in builtin/clone.c and only affects
cloning, not fetching.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a copy of uncompress2() implementation in compat/ so that we
can build with an older version of zlib that lack the function, and
the build procedure selects if it is used via the NO_UNCOMPRESS2
$(MAKE) variable. This is yet another "annoying" knob the porters
need to tweak on platforms that are not common enough to have the
default set in the config.mak.uname file.
Attempt to instead ask the system header <zlib.h> to decide if we
need the compatibility implementation. This is a deviation from the
way we have been handling the "compatiblity" features so far, and if
it can be done cleanly enough, it could work as a model for features
that need compatibility definition we discover in the future. With
that goal in mind, avoid expedient but ugly hacks, like shoving the
code that is conditionally compiled into an unrelated .c file, which
may not work in future cases---instead, take an approach that uses a
file that is independently compiled and stands on its own.
Compile and link compat/zlib-uncompress2.c file unconditionally, but
conditionally hide the implementation behind #if/#endif when zlib
version is 1.2.9 or newer, and unconditionally archive the resulting
object file in the libgit.a to be picked up by the linker.
There are a few things to note in the shape of the code base after
this change:
- We no longer use NO_UNCOMPRESS2 knob; if the system header
<zlib.h> claims a version that is more cent than the library
actually is, this would break, but it is easy to add it back when
we find such a system.
- The object file compat/zlib-uncompress2.o is always compiled and
archived in libgit.a, just like a few other compat/ object files
already are.
- The inclusion of <zlib.h> is done in <git-compat-util.h>; we used
to do so from <cache.h> which includes <git-compat-util.h> as the
first thing it does, so from the *.c codes, there is no practical
change.
- Until objects in libgit.a that is already used gains a reference
to the function, the reftable code will be the only one that
wants it, so libgit.a on the linker command line needs to appear
once more at the end to satisify the mutual dependency.
- Beat found a trick used by OpenSSL to avoid making the
conditionally-compiled object truly empty (apparently because
they had to deal with compilers that do not want to see an
effectively empty input file). Our compat/zlib-uncompress2.c
file borrows the same trick for portabilty.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
mem_pool_alloc uses sizeof(uintmax_t) as a proxy for what should be
_Alignof(max_align_t) in C11. On most architectures this is sufficient
(though on m68k it is in fact overly strict, since the de-facto ABI,
which differs from the specified System V ABI, has the maximum alignment
of all types as 2 bytes), but on CHERI, and thus Arm's Morello
prototype, it is insufficient for any type that stores a pointer, which
must be aligned to 128 bits (on 64-bit architectures extended with
CHERI), whilst uintmax_t is a 64-bit integer.
Fix this by introducing our own approximation for max_align_t and a
means to compute _Alignof it without relying on C11. Currently this
union only contains uintmax_t and void *, but more types can be added as
needed.
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We added an unrelated sanity checking that leads to a BUG() while
plugging a leak, which triggered in a repository with symrefs in
the local branch namespace that point at a ref outside. Partially
revert the change to avoid triggering the BUG().
* ab/checkout-branch-info-leakfix:
checkout: avoid BUG() when hitting a broken repository
... at least for now. So let's error out if we are even trying to
initialize the split index when the index is sparse, or when trying to
write the split index extension for a sparse index.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 61feddcdf2 (tests: disable GIT_TEST_SPLIT_INDEX for sparse index
tests, 2021-08-26), it was already called out that the split index
feature is incompatible with the sparse index feature, and its commit
message wondered aloud whether more checks would be required to ensure
that the split index and sparse index features aren't enabled at the
same time.
We are about to introduce such additional checks, and indeed, t1091
would utterly fail with them. Therefore, let's preemptively disable the
split index for the entirety of t1091.
This partially reverts above-mentioned patch because it covered only one
test case whereas we want to cover the entire test script.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 6e773527b6 (sparse-index: convert from full to sparse, 2021-03-30),
we introduced initial support for a sparse index, and were careful to
avoid converting to a sparse index in the presence of a split index.
However, when we _just_ read a freshly-initialized index, it might not
contain a split index even if _writing_ it will add one by virtue of
being asked for via the `GIT_TEST_SPLIT_INDEX` variable.
We did not notice any problems with checking _only_ for `split_index`
(and not `GIT_TEST_SPLIT_INDEX`) right until both
`vd/sparse-sparsity-fix-on-read` _and_ `vd/sparse-reset` were merged.
Those two topics' interplay triggers a bug in conjunction with running
t1091.15 when `GIT_TEST_SPLIT_INDEX=true` in the following way:
`vd/sparse-sparsity-fix-on-read` ensures that the index is made sparse
right after reading, and `vd/sparse-reset` ensures that the index is
made non-sparse again unless running in the `--soft` mode. Since the
split index feature is incompatible with the sparse index feature, we
see a symptom like this:
fatal: position for replacement 4 exceeds base index size 4
Let's fix this by avoiding the conversion to a sparse index when
`GIT_TEST_SPLIT_INDEX=true`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 9081a421 (checkout: fix "branch info" memory leaks, 2021-11-16)
cleaned up existing memory leaks, we added an unrelated sanity check
to ensure that a local branch is truly local and not a symref to
elsewhere that dies with BUG() otherwise. This was misguided in two
ways. First of all, such a tightening did not belong to a leak-fix
patch. And the condition it detected was *not* a bug in our program
but a problem in user data, where warning() or die() would have been
more appropriate.
As the condition is not fatal (the result of computing the local
branch name in the code that is involved in the faulty check is only
used as a textual label for the commit), let's revert the code to
the original state, i.e. strip "refs/heads/" to compute the local
branch name if possible, and otherwise leave it NULL. The consumer
of the information in merge_working_tree() is prepared to see NULL
in there and act accordingly.
cf. https://bugzilla.redhat.com/show_bug.cgi?id=2042920
Reported-by: Petr Šplíchal <psplicha@redhat.com>
Reported-by: Todd Zullinger <tmz@pobox.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were two commit_lists created in cmd_merge() that were only
conditionally free()'d. Add a quick conditional call to
free_commit_list() for each of them at the end of the function.
Testing this commit against t6404 under valgrind shows that this patch
fixes the following two leaks:
16 bytes in 1 blocks are definitely lost in loss record 16 of 126
at 0x484086F: malloc (vg_replace_malloc.c:380)
by 0x69FFEB: do_xmalloc (wrapper.c:41)
by 0x6A0073: xmalloc (wrapper.c:62)
by 0x52A72D: commit_list_insert (commit.c:556)
by 0x47FC93: reduce_parents (merge.c:1114)
by 0x4801EE: collect_parents (merge.c:1214)
by 0x480B56: cmd_merge (merge.c:1465)
by 0x40686E: run_builtin (git.c:464)
by 0x406C51: handle_builtin (git.c:716)
by 0x406E96: run_argv (git.c:783)
by 0x40730A: cmd_main (git.c:914)
by 0x4E7DFA: main (common-main.c:56)
8 (16 direct, 32 indirect) bytes in 1 blocks are definitely lost in \
loss record 61 of 126
at 0x484086F: malloc (vg_replace_malloc.c:380)
by 0x69FFEB: do_xmalloc (wrapper.c:41)
by 0x6A0073: xmalloc (wrapper.c:62)
by 0x52A72D: commit_list_insert (commit.c:556)
by 0x52A8F2: commit_list_insert_by_date (commit.c:620)
by 0x5270AC: get_merge_bases_many_0 (commit-reach.c:413)
by 0x52716C: repo_get_merge_bases (commit-reach.c:438)
by 0x480E5A: cmd_merge (merge.c:1520)
by 0x40686E: run_builtin (git.c:464)
by 0x406C51: handle_builtin (git.c:716)
by 0x406E96: run_argv (git.c:783)
by 0x40730A: cmd_main (git.c:914)
There are still 3 leaks in chdir_notify_register() after this, but
chdir_notify_register() has been brought up on the list before and folks
were not a fan of fixing those, so I'm not touching them.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for merge_incore_recursive(), modelled after
merge_recursive(), notes that
merge_bases will be consumed (emptied) so make a copy if you need it
However, in merge_ort_internal() (which merge_incore_recursive() calls),
it runs
merged_merge_bases = pop_commit(&merge_bases);
...
for (iter = merge_bases; iter; iter = iter->next) {
...
}
In other words, it only consumes the *first* entry of merge_bases, and
the rest it iterates through. If it iterated through all of them, the
caller could be responsible for free'ing the memory. If it consumed all
of them, the current documentation would be correct and the callers
would need to do nothing. The current middle ground makes it impossible
for callers to avoid memory leaks, since any attempt to use the
merge_bases it passes in would result in a use-after-free.
It turns out this part of the code was copied from merge-recursive.c,
which has had the same bug for 15.5 years. However, since we are trying
to keep merge-recursive.c stable as we sunset it, let's just fix the
leak in in merge_ort_internal() by having it actually consume all the
elements of the merge_bases commit_list.
Testing this commit against t6404 (the first testcase specifically
about recursive merges) under valgrind shows that this patch fixes
the following leak:
32 (16 direct, 16 indirect) bytes in 1 blocks are definitely lost \
in loss record 49 of 126
at 0x484086F: malloc (vg_replace_malloc.c:380)
by 0x69FFEB: do_xmalloc (wrapper.c:41)
by 0x6A0073: xmalloc (wrapper.c:62)
by 0x52A72D: commit_list_insert (commit.c:556)
by 0x47EC86: try_merge_strategy (merge.c:751)
by 0x48143B: cmd_merge (merge.c:1679)
by 0x40686E: run_builtin (git.c:464)
by 0x406C51: handle_builtin (git.c:716)
by 0x406E96: run_argv (git.c:783)
by 0x40730A: cmd_main (git.c:914)
by 0x4E7DFA: main (common-main.c:56)
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating the sparse-checkout file, Git does not create the leading
directory, "$GIT_DIR/info", if it does not exist. This causes problems
if the repository does not have that directory. Therefore, ensure that
the leading directory is created.
This is the only "open" in builtin/sparse-checkout.c that does not have
a leading directory check. (The other one in write_patterns_and_update()
does.)
Note that the test needs to explicitly specify a template when running
"git init" because the default template used in the tests has the
"info/" directory included.
Helped-by: Jose Lopes <jabolopes@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow the example set by 12909b6b (i18n: turn "options are
incompatible" into "cannot be used together", 2022-01-05) and use
the same message string to reduce the need for translation.
Reported-by: Jiang Xin <worldhello.net@gmail.com>
Helped-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This isn't used per se, but it is useful for debugging, especially
Windows CI failures.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reduces the amount of glue code, because we don't need a void
pointer or vtable within the structure.
The only snag is that reftable_index_record contain a strbuf, so it
cannot be zero-initialized. To address this, use reftable_new_record()
to return fresh instance, given a record type. Since
reftable_new_record() doesn't cause heap allocation anymore, it should
be balanced with reftable_record_release() rather than
reftable_record_destroy().
Thanks to Peff for the suggestion.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was renamed to generic.c, but the origin was never removed
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This simplifies unittests a little, and provides further coverage for
reftable_record_copy().
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a more practical ordering when working on refactorings of the
reftable code.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes NULL derefs in error paths. Spotted by Coverity.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This would trigger in the unlikely event that we are compacting, and
the next available file handle is 0.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the cleanup fails, there is nothing we can do.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This would be triggered in the unlikely event of fstat() failing on an
opened file.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add test coverage for corrupt zlib data. Fix memory leaks demonstrated by
unittest.
This problem was discovered by a Coverity scan.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document that the accepted variants of the --track option are --track,
--track=direct, and --track=inherit. The equal sign in the latter two
cannot be replaced with whitespace; in general optional arguments need
to be attached firmly to their option.
Put "direct" consistently before "inherit", if only for the reasons
that the former is the default, explained first in the documentation,
and comes before the latter alphabetically.
Mention both modes in the short help so that readers don't have to look
them up in the full documentation. They are literal strings and thus
untranslatable. PARSE_OPT_LITERAL_ARGHELP is inferred due to the pipe
and parenthesis characters, so we don't have to provide that flag
explicitly.
Mention that -t has the same effect as --track and --track=direct.
There is no way to specify inherit mode using the short option, because
short options generally don't accept optional arguments.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The trace2 subsystem can inherit certain information from parent
processes via environment variables; e.g., the parent command name and
session ID. This allows trace2 to note when a command is the child
process of another Git process, and to adjust various pieces of output
accordingly.
This behavior breaks certain tests that examine trace2 output when the
tests run as a child of another git process, such as in `git rebase -x
"make test"`.
While we could fix this by unsetting the relevant variables in the
affected tests (currently t0210, t0211, t0212, and t6421), this would
leave other tests vulnerable to similar breakage if new test cases are
added which inspect trace2 output. So fix this in general by unsetting
GIT_TRACE2_PARENT_NAME and GIT_TRACE2_PARENT_SID in test-lib.sh.
Reported-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The German translation for "'%s' is aliased to '%s'" is incorrect. It
switches the order of alias name and alias definition.
A better translation would be "'%s' ist ein Alias für '%s'". (Full stop
removed intentionally, because the original does not use one either.)
Signed-off-by: Matthias Rüster <matthias.ruester@gmail.com>
A recent upstream topic introduced checks for certain Git commands that
prevent them from deleting the current working directory, introducing
also a regression test that ensures that commands such as `git version`
_can_ run without a current working directory.
While technically not possible on Windows via the regular Win32 API, we
do run the regression tests in an MSYS2 Bash which uses a POSIX
emulation layer (the MSYS2/Cygwin runtime) where a really evil hack
_does_ allow to delete a directory even if it is the current working
directory.
Therefore, Git needs to be prepared for a missing working directory,
even on Windows.
This issue was not noticed in upstream Git because there was no caller
that tried to discover a Git directory with a deleted current working
directory in the test suite. But in the microsoft/git fork, we do want
to run `pre-command`/`post-command` hooks for every command, even for
`git version`, which means that we make precisely such a call. The bug
is not in that `pre-command`/`post-command` feature, though, but in
`mingw_getcwd()` and needs to be addressed there.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a run command cannot be executed or found, shells return exit code
126 or 127, respectively. Valid run commands are allowed to return
these codes as well to indicate bad revisions, though, for historical
reasons. This means typos can cause bogus bisect runs that go over the
full distance and end up reporting invalid results.
The best solution would be to reserve exit codes 126 and 127, like
71b0251cdd (Bisect run: "skip" current commit if script exit code is
125., 2007-10-26) did for 125, and abort bisect run when we get them.
That might be inconvenient for those who relied on the documentation
stating that 126 and 127 can be used for bad revisions, though.
The workaround used by this patch is to run the command on a known-good
revision and abort if we still get the same error code. This adds one
step to runs with scripts that use exit codes 126 and 127, but keeps
them supported, with one exception: It won't work with commands that
cannot recognize the (manually marked) known-good revision as such.
Run commands that use low exit codes are unaffected. Typos are reported
after executing the missing command twice and three checkouts (the first
step, the known good revision and back to the revision of the first
step).
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shells report non-executable and missing commands with exit codes 126
and 127, respectively. For historical reasons "git bisect run"
interprets them as indicating a bad commit, though. Document the
current behavior by adding basic tests that cover these cases.
Reported-by: Ramkumar Ramachandra <r@artagnon.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the cleanup code out of the loop and make sure all execution paths
pass through it to avoid leaking memory.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strvec "args" in bisect_run() is initialized and cleared, but never
added to. Nevertheless its first member is printed when reporting a
bisect_state() error. That's not useful, since it's always NULL.
Before d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
function in C, 2021-09-13) the intended new state was reported if it
could not be set. Reinstate that behavior and remove the unused strvec.
Reported-by: Ramkumar Ramachandra <r@artagnon.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git fetch --negotiate-only` is an implementation detail of push
negotiation and, unlike most `git fetch` invocations, does not actually
update the main repository. Thus it should not update submodules even
if submodule recursion is enabled.
This is not just slow, it is wrong e.g. push negotiation with
"submodule.recurse=true" will cause submodules to be updated because it
invokes `git fetch --negotiate-only`.
Fix this by disabling submodule recursion if --negotiate-only was given.
Since this makes --negotiate-only and --recurse-submodules incompatible,
check for this invalid combination and die.
This does not use the "goto cleanup" introduced in the previous commit
because we want to recurse through submodules whenever a ref is fetched,
and this can happen without introducing new objects.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_fetch() does the following with the assumption that objects are
fetched:
* Run gc
* Write commit graphs (if enabled by fetch.writeCommitGraph=true)
However, neither of these tasks makes sense if objects are not fetched
e.g. `git fetch --negotiate-only` never fetches objects.
Speed up cmd_fetch() by bailing out early if we know for certain that
objects will not be fetched. cmd_fetch() can bail out early whenever
objects are not fetched, but for now this only considers
--negotiate-only.
The same optimization does not apply to `git fetch --dry-run` because
that actually fetches objects; the dry run refers to not updating refs.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace an early return with 'goto cleanup' in cmd_fetch() so that the
string_list is always cleared (the string_list_clear() call is purely
cleanup; the string_list is not reused). This makes cleanup consistent
so that a subsequent commit can use 'goto cleanup' to bail out early.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git branch -h" incorrectly said "--track[=direct|inherit]",
implying that "--trackinherit" is a valid option, which has been
corrected.
* js/branch-track-inherit:
branch,checkout: fix --track usage strings
FreeBSD 13.0 headers have unconditional dependency on C11 language
features, and adding -std=gnu99 to DEVELOPER_CFLAGS would just
break the developer build.
* jc/freebsd-without-c99-only-build:
Makefile: FreeBSD cannot do C99-or-below build
As Ævar pointed out in [1], the use of PARSE_OPT_LITERAL_ARGHELP with a
list of allowed parameters is not recommended. Both git-branch and
git-checkout were changed in d311566 (branch: add flags and config to
inherit tracking, 2021-12-20) to use this discouraged combination for
their --track flags.
Fix this by removing PARSE_OPT_LITERAL_ARGHELP, and changing the arghelp
to simply be "mode". Users may discover allowed values in the manual
pages.
[1]: https://lore.kernel.org/git/220111.86a6g3yqf9.gmgdl@evledraar.gmail.com/
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a feature that supports config file inclusion conditional on
whether the repo has a remote with a URL that matches a glob.
Similar to my previous work on remote-suggested hooks [1], the main
motivation is to allow remote repo administrators to provide recommended
configs in a way that can be consumed more easily (e.g. through a
package installable by a package manager - it could, for example,
contain a file to be included conditionally and a post-install script
that adds the include directive to the system-wide config file).
In order to do this, Git reruns the config parsing mechanism upon
noticing the first URL-conditional include in order to find all remote
URLs, and these remote URLs are then used to determine if that first and
all subsequent includes are executed. Remote URLs are not allowed to be
configued in any URL-conditionally-included file.
[1] https://lore.kernel.org/git/cover.1623881977.git.jonathantanmy@google.com/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In "make DEVELOPER=YesPlease" builds, we try to help developers to
catch as many potential issues as they can by using -Wall and
turning compilation warnings into errors. In the same spirit, we
recently started adding -std=gnu99 to their CFLAGS, so that they can
notice when they accidentally used language features beyond C99.
It however turns out that FreeBSD 13.0 mistakenly uses C11 extension
in its system header files regardless of what __STDC_VERSION__ says,
which means that the platform (unless we tweak their system headers)
cannot be used for this purpose.
It seems that -std=gnu99 is only added conditionally even in today's
config.mak.dev, so it is fine if we dropped -std=gnu99 from there.
Which means that developers on FreeBSD cannot participate in vetting
use of features beyond C99, but there are developers on other
platforms who will, so it's not too bad.
We might want a more "fundamental" fix to make the platform capable
of taking -std=gnu99, like working around the use of unconditional
C11 extension in its system header files by supplying a set of
"replacement" definitions in our header files. We chose not to
pursue such an approach for two reasons at this point:
(1) The fix belongs to the FreeBSD project, not this project, and
such an upstream fix may happen hopefully in a not-too-distant
future.
(2) Fixing such a bug in system header files and working it around
can lead to unexpected breakages (other parts of their system
header files may not be expecting to see and do not work well
with our "replacement" definitions). This close to the final
release of this cycle, we have no time for that.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust build on RHEL 7 to explicitly ask C99 support and use
the fallback implementation of uncompress2 we ship.
* da/rhel7-lacks-uncompress2-and-c99:
build: centos/RHEL 7 ships with an older gcc and zlib
In commit 8b09a900a1 ("merge-ort: restart merge with cached renames to
reduce process entry cost", 2021-07-16), we noted that in the merge-ort
steps of
collect_merge_info()
detect_and_process_renames()
process_entries()
that process_entries() was expensive, and we could often make it cheaper
by changing this to
collect_merge_info()
detect_and_process_renames()
<cache all the renames, and restart>
collect_merge_info()
detect_and_process_renames()
process_entries()
because the second collect_merge_info() would be cheaper (we could avoid
traversing into some directories), the second
detect_and_process_renames() would be free since we had already detected
all renames, and then process_entries() has far fewer entries to handle.
However, this was built on the assumption that the first
detect_and_process_renames() actually detected all potential renames.
If someone has merge.renameLimit set to some small value, that
assumption is violated which manifests later with the following message:
$ git -c merge.renameLimit=1 rebase upstream
...
git: merge-ort.c:546: clear_or_reinit_internal_opts: Assertion
`renames->cached_pairs_valid_side == 0' failed.
Turn off this cache-renames-and-restart whenever we cannot detect all
renames, and add a testcase that would have caught this problem.
Reported-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Tested-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current way we generate random file names is by taking the seconds
and microseconds, plus the PID, and mixing them together, then encoding
them. If this fails, we increment the value by 7777, and try again up
to TMP_MAX times.
Unfortunately, this is not the best idea from a security perspective.
If we're writing into TMPDIR, an attacker can guess these values easily
and prevent us from creating any temporary files at all by creating them
all first. Even though we set TMP_MAX to 16384, this may be achievable
in some contexts, even if unlikely to occur in practice.
Fortunately, we can simply solve this by using the system
cryptographically secure pseudorandom number generator (CSPRNG) to
generate a random 64-bit value, and use that as before. Note that there
is still a small bias here, but because a six-character sequence chosen
out of 62 characters provides about 36 bits of entropy, the bias here is
less than 2^-28, which is acceptable, especially considering we'll retry
several times.
Note that the use of a CSPRNG in generating temporary file names is also
used in many libcs. glibc recently changed from an approach similar to
ours to using a CSPRNG, and FreeBSD and OpenBSD also use a CSPRNG in
this case. Even if the likelihood of an attack is low, we should still
be at least as responsible in creating temporary files as libc is.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many situations in which having access to a cryptographically
secure pseudorandom number generator (CSPRNG) is helpful. In the
future, we'll encounter one of these when dealing with temporary files.
To make this possible, let's add a function which reads from a system
CSPRNG and returns some bytes.
We know that all systems will have such an interface. A CSPRNG is
required for a secure TLS or SSH implementation and a Git implementation
which provided neither would be of little practical use. In addition,
POSIX is set to standardize getentropy(2) in the next version, so in the
(potentially distant) future we can rely on that.
For systems which lack one of the other interfaces, we provide the
ability to use OpenSSL's CSPRNG. OpenSSL is highly portable and
functions on practically every known OS, and we know it will have access
to some source of cryptographically secure randomness. We also provide
support for the arc4random in libbsd for folks who would prefer to use
that.
Because this is a security sensitive interface, we take some
precautions. We either succeed by filling the buffer completely as we
requested, or we fail. We don't return partial data because the caller
will almost never find that to be a useful behavior.
Specify a makefile knob which users can use to specify one or more
suitable CSPRNGs, and turn the multiple string options into a set of
defines, since we cannot match on strings in the preprocessor. We allow
multiple options to make the job of handling this in autoconf easier.
The order of options is important here. On systems with arc4random,
which is most of the BSDs, we use that, since, except on MirBSD and
macOS, it uses ChaCha20, which is extremely fast, and sits entirely in
userspace, avoiding a system call. We then prefer getrandom over
getentropy, because the former has been available longer on Linux, and
then OpenSSL. Finally, if none of those are available, we use
/dev/urandom, because most Unix-like operating systems provide that API.
We prefer options that don't involve device files when possible because
those work in some restricted environments where device files may not be
available.
Set the configuration variables appropriately for Linux and the BSDs,
including macOS, as well as Windows and NonStop. We specifically only
consider versions which receive publicly available security support
here. For the same reason, we don't specify getrandom(2) on Linux,
because CentOS 7 doesn't support it in glibc (although its kernel does)
and we don't want to resort to making syscalls.
Finally, add a test helper to allow this to be tested by hand and in
tests. We don't add any tests, since invoking the CSPRNG is not likely
to produce interesting, reproducible results.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are some commands permit the user whether to provide options
first before args, or the reverse order. For example:
git push --dry-run <remote> <ref>
And:
git push <remote> <ref> --dry-run
Both of them is supported, but some commands do not, for instance:
git ls-remote --heads <remote>
And:
git ls-remote <remote> --heads
If <remote> only has one ref and it's name is "refs/heads/--heads", you
will get the same result, otherwise will not.This is because the former
in the second example will parse "--heads" as an "option" which means
to limit to only "refs/heads" when listing the remote references, the
latter treat "--heads" as an argument which means to filter the result
list with the given pattern.
Therefore, we want to specify a bit more in "gitcli.txt" about the way
we recommend and help to resolve the ambiguity around some git command
usage. The related disscussions locate at [1].
By the way, there are some issues with lowercase letters in the document,
which have been modified together.
[1] https://public-inbox.org/git/cover.1642129840.git.dyroneteng@gmail.com/
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When deleting refs from the loose-files refs backend, then we need to be
careful to also delete the same ref from the packed refs backend, if it
exists. If we don't, then deleting the loose ref would "uncover" the
packed ref. We thus always have to queue up deletions of refs for both
the loose and the packed refs backend. This is done in two separate
transactions, where the end result is that the reference-transaction
hook is executed twice for the deleted refs.
This behaviour is quite misleading: it's exposing implementation details
of how the files backend works to the user, in contrast to the logical
updates that we'd really want to expose via the hook. Worse yet, whether
the hook gets executed once or twice depends on how well-packed the
repository is: if the ref only exists as a loose ref, then we execute it
once, otherwise if it is also packed then we execute it twice.
Fix this behaviour and don't execute the reference-transaction hook at
all when refs in the packed-refs backend if it's driven by the files
backend. This works as expected even in case the refs to be deleted only
exist in the packed-refs backend because the loose-backend always queues
refs in its own transaction even if they don't exist such that they can
be locked for concurrent creation. And it also does the right thing in
case neither of the backends has the ref because that would cause the
transaction to fail completely.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reference-transaction hook is supposed to track logical changes to
references, but it currently also gets executed when packing refs in a
repository. This is unexpected and ultimately not all that useful:
packing refs is not supposed to result in any user-visible change to the
refs' state, and it ultimately is an implementation detail of how refs
stores work.
Fix this excessive execution of the hook when packing refs.
Reported-by: Waleed Khan <me@waleedkhan.name>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests which demonstate that we're executing the
reference-transaction hook too often in some cases, which thus leaks
implementation details about the reference store's implementation
itself. Behaviour will be fixed in follow-up commits.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reference-transaction hook is executing whenever we prepare, commit
or abort a reference transaction. While this is mostly intentional, in
case of the files backend we're leaking the implementation detail that
the store is in fact a composite store with one loose and one packed
backend to the caller. So while we want to execute the hook for all
logical updates, executing it for such implementation details is
unexpected.
Prepare for a fix by adding a new flag which allows to skip execution of
the hook.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not currently have any flags when creating reference transactions,
but we'll add one to disable execution of the reference transaction hook
in some cases.
Allow passing flags to `ref_store_transaction_begin()` to prepare for
this change.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When deleting loose refs, then we also have to delete the refs in the
packed backend. This is done by calling `refs_delete_refs()`, which
then uses the packed-backend's logic to delete refs. This doesn't allow
us to exercise any control over the reference transaction which is being
created in the packed backend, which is required in a subsequent commit.
Extract a new function `packed_refs_delete_refs()`, which hosts most of
the logic to delete refs except for creating the transaction itself.
Like this, we can easily create the transaction in the files backend
and thus exert more control over it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git 2.35 l10n round 1, a space between two words was missing in the
message from "branch.c", and it was fixed by commit 68d924e1de (branch:
missing space fix at line 313, 2022-01-11).
Do a batch update for teams (bg, fr, id, sv, tr and zh_CN) that have
already completed their works on l10n round 1.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Git 2.35-rc1
* tag 'v2.35.0-rc1':
Git 2.35-rc1
reftable tests: avoid "int" overflow, use "uint64_t"
reftable: avoid initializing structs from structs
t1450-fsck: exec-bit is not needed to make loose object writable
refs API: use "failure_errno", not "errno"
Last minute fixes before -rc1
build: NonStop ships with an older zlib
packfile: fix off-by-one error in decoding logic
t/gpg: simplify test for unknown key
branch: missing space fix at line 313
fmt-merge-msg: prevent use-after-free with signed tags
cache.h: drop duplicate `ensure_full_index()` declaration
lazyload: use correct calling conventions
fetch: fix deadlock when cleaning up lockfiles in async signals
GCC 4.8.5 is the default system compiler on centos7/RHEL7.
This version requires -std=c99 to enable c99 support.
zlib 1.2.7 on centos7/rhel7 lacks uncompress2().
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few portability tweaks.
* ab/reftable-build-fixes:
reftable tests: avoid "int" overflow, use "uint64_t"
reftable: avoid initializing structs from structs
Trying to clear the skip-worktree bit from files that are present does
present some computational overhead, for sparse-checkouts. (We do not
do the bit clearing in non-sparse-checkouts.) Optimize it as follows:
Rather than lstat()'ing every SKIP_WORKTREE path, take advantage of the
fact that entire directories will often be missing, especially for cone
mode and even more so ever since commit 55dfcf9591 ("sparse-checkout:
clear tracked sparse dirs", 2021-09-08). If we have already determined
that the parent directory of a file (or other previous ancestor) does
not exist, then the file cannot exist either so we do not need to
lstat() it separately.
Timings for p2000 included below, reformatted to fit in normal commit
message line lengths, which compare three things:
* Timings before this series
* Timings of the unoptimized version of
clear_skip_worktree_from_present_files() from a few commits ago
* Timings after the optimization in this commit
(NOTE: t/perf/ appears to have timing resolution only down to 0.01 s,
which presents significant measurement error when timings only differ by
0.01s. I don't trust any such timings below, and yet all the optimized
results differ by at most 0.01s.)
Test Before Series Unoptimized Optimized
-----------------------------------------------------------------------------
*git status*
full-v3 0.15(0.10+0.06) 0.32(0.16+0.17) +113.3% 0.16(0.10+0.07) +6.7%
full-v4 0.15(0.11+0.05) 0.32(0.17+0.16) +113.3% 0.16(0.11+0.05) +6.7%
sparse-v3 0.04(0.03+0.04) 0.04(0.02+0.05) +0.0% 0.04(0.02+0.05) +0.0%
sparse-v4 0.04(0.03+0.04) 0.04(0.02+0.05) +0.0% 0.04(0.03+0.05) +0.0%
*git add -A*
full-v3 0.40(0.30+0.07) 0.56(0.36+0.17) +40.0% 0.39(0.30+0.07) -2.5%
full-v4 0.37(0.28+0.07) 0.54(0.37+0.16) +45.9% 0.38(0.29+0.07) +2.7%
sparse-v3 0.06(0.04+0.05) 0.08(0.05+0.05) +33.3% 0.06(0.05+0.04) +0.0%
sparse-v4 0.05(0.03+0.05) 0.05(0.04+0.04) +0.0% 0.06(0.04+0.05) +20.0%
*git add .*
full-v3 0.40(0.31+0.07) 0.57(0.37+0.17) +42.5% 0.41(0.30+0.08) +2.5%
full-v4 0.38(0.30+0.06) 0.55(0.37+0.16) +44.7% 0.38(0.30+0.06) +0.0%
sparse-v3 0.06(0.04+0.05) 0.06(0.05+0.04) +0.0% 0.06(0.03+0.05) +0.0%
sparse-v4 0.06(0.05+0.05) 0.06(0.04+0.05) +0.0% 0.06(0.04+0.06) +0.0%
*git commit -a -m A*
full-v3 0.41(0.32+0.06) 0.58(0.39+0.17) +41.5% 0.42(0.32+0.07) +2.4%
full-v4 0.39(0.30+0.07) 0.56(0.38+0.17) +43.6% 0.40(0.31+0.07) +2.6%
sparse-v3 0.04(0.03+0.04) 0.04(0.03+0.04) +0.0% 0.04(0.03+0.04) +0.0%
sparse-v4 0.04(0.03+0.05) 0.04(0.03+0.05) +0.0% 0.04(0.03+0.04) +0.0%
*git checkout -f -*
full-v3 0.56(0.46+0.07) 0.73(0.55+0.16) +30.4% 0.57(0.47+0.08) +1.8%
full-v4 0.54(0.45+0.07) 0.71(0.53+0.17) +31.5% 0.55(0.45+0.07) +1.9%
sparse-v3 0.06(0.04+0.04) 0.06(0.04+0.05) +0.0% 0.06(0.04+0.05) +0.0%
sparse-v4 0.05(0.05+0.04) 0.05(0.04+0.05) +0.0% 0.06(0.04+0.05) +20.0%
*git reset*
full-v3 0.34(0.26+0.05) 0.51(0.34+0.15) +50.0% 0.34(0.26+0.06) +0.0%
full-v4 0.32(0.24+0.06) 0.49(0.32+0.15) +53.1% 0.33(0.25+0.06) +3.1%
sparse-v3 0.04(0.03+0.04) 0.04(0.03+0.04) +0.0% 0.04(0.03+0.04) +0.0%
sparse-v4 0.03(0.03+0.04) 0.03(0.02+0.04) +0.0% 0.03(0.03+0.04) +0.0%
*git reset --hard*
full-v3 0.57(0.46+0.07) 0.90(0.61+0.25) +57.9% 0.57(0.45+0.08) +0.0%
full-v4 0.54(0.46+0.05) 0.88(0.59+0.26) +63.0% 0.55(0.45+0.07) +1.9%
sparse-v3 0.07(0.03+0.03) 0.07(0.04+0.03) +0.0% 0.07(0.03+0.03) +0.0%
sparse-v4 0.06(0.03+0.03) 0.06(0.04+0.02) +0.0% 0.06(0.03+0.03) +0.0%
*git reset -- does-not-exist*
full-v3 0.35(0.27+0.06) 0.52(0.32+0.17) +48.6% 0.35(0.27+0.06) +0.0%
full-v4 0.33(0.26+0.05) 0.50(0.33+0.15) +51.5% 0.33(0.26+0.06) +0.0%
sparse-v3 0.04(0.03+0.04) 0.04(0.03+0.04) +0.0% 0.04(0.03+0.04) +0.0%
sparse-v4 0.04(0.02+0.04) 0.03(0.02+0.04) -25.0% 0.03(0.02+0.04) -25.0%
*git diff*
full-v3 0.07(0.04+0.04) 0.24(0.11+0.14) +242.9% 0.07(0.04+0.04) +0.0%
full-v4 0.07(0.03+0.05) 0.24(0.13+0.12) +242.9% 0.08(0.04+0.05) +14.3%
sparse-v3 0.02(0.01+0.04) 0.02(0.01+0.04) +0.0% 0.02(0.01+0.05) +0.0%
sparse-v4 0.02(0.02+0.03) 0.02(0.01+0.04) +0.0% 0.02(0.01+0.04) +0.0%
*git diff --cached*
full-v3 0.05(0.03+0.02) 0.22(0.12+0.09) +340.0% 0.05(0.03+0.01) +0.0%
full-v4 0.05(0.03+0.01) 0.23(0.12+0.11) +360.0% 0.05(0.03+0.02) +0.0%
sparse-v3 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0% 0.01(0.00+0.00) +0.0%
sparse-v4 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0% 0.01(0.00+0.00) +0.0%
*git blame f2/f4/a*
full-v3 0.18(0.13+0.05) 0.52(0.29+0.23) +188.9% 0.19(0.15+0.04) +5.6%
full-v4 0.19(0.15+0.04) 0.52(0.28+0.23) +173.7% 0.19(0.14+0.04) +0.0%
sparse-v3 0.10(0.08+0.02) 0.10(0.09+0.01) +0.0% 0.10(0.09+0.01) +0.0%
sparse-v4 0.10(0.08+0.02) 0.10(0.08+0.02) +0.0% 0.10(0.08+0.02) +0.0%
*git blame f2/f4/f3/a*
full-v3 0.45(0.36+0.08) 0.78(0.51+0.27) +73.3% 0.45(0.37+0.08) +0.0%
full-v4 0.45(0.37+0.08) 0.78(0.51+0.26) +73.3% 0.45(0.37+0.08) +0.0%
sparse-v3 0.36(0.32+0.04) 0.36(0.31+0.05) +0.0% 0.36(0.31+0.04) +0.0%
sparse-v4 0.36(0.31+0.05) 0.36(0.31+0.05) +0.0% 0.36(0.31+0.04) +0.0%
*git checkout-index -f --all*
full-v3 0.07(0.02+0.05) 0.24(0.12+0.12) +242.9% 0.08(0.04+0.04) +14.3%
full-v4 0.07(0.03+0.04) 0.24(0.11+0.13) +242.9% 0.08(0.03+0.04) +14.3%
sparse-v3 0.04(0.01+0.03) 0.04(0.00+0.03) +0.0% 0.04(0.01+0.03) +0.0%
sparse-v4 0.04(0.01+0.02) 0.04(0.01+0.03) +0.0% 0.04(0.01+0.02) +0.0%
*git update-index --add --remove f2/f4/a*
full-v3 0.29(0.23+0.02) 0.46(0.30+0.12) +58.6% 0.30(0.24+0.02) +3.4%
full-v4 0.27(0.22+0.02) 0.45(0.29+0.12) +66.7% 0.28(0.22+0.03) +3.7%
sparse-v3 0.02(0.02+0.00) 0.02(0.01+0.00) +0.0% 0.02(0.01+0.00) +0.0%
sparse-v4 0.02(0.02+0.00) 0.02(0.02+0.00) +0.0% 0.02(0.02+0.00) +0.0%
So, with the optimization, the extra work appears to be essentially 0
for sparse-checkouts that are also using sparse-indexes (even before my
optimization), and the extra work appears to be just marginally more
than 0 for sparse-checkouts that are using full indexes.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make several small updates, to address a few documentation issues
I spotted:
* sparse-checkout focused on "patterns" even though the inputs (and
outputs in the case of `list`) are directories in cone-mode
* The description section of the sparse-checkout documentation
was a bit sparse (no pun intended), and focused more on internal
mechanics rather than end user usage. This made sense in the
early days when the command was even more experimental, but let's
adjust a bit to try to make it more approachable to end users who
may want to consider using it. Keep the scary backward
compatibility warning, though; we're still hard at work trying to
fix up commands to behave reasonably in sparse checkouts.
* both read-tree and update-index tried to describe how to use the
skip-worktree bit, but both predated the sparse-checkout command.
The sparse-checkout command is a far easier mechanism to use and
for users trying to reduce the size of their working tree, we
should recommend users to look at it instead.
* The update-index documentation pointed out that assume-unchanged
and skip-worktree sounded similar but had different purposes.
However, it made no attempt to explain the differences, only to
point out that they were different. Explain the differences.
* The update-index documentation focused much more on (internal?)
implementation details than on end-user usage. Try to explain
its purpose better for users of update-index, rather than
fellow developers trying to work with the SKIP_WORKTREE bit.
* Clarify that when core.sparseCheckout=true, we treat a file's
presence in the working tree as being an override to the
SKIP_WORKTREE bit (i.e. in sparse checkouts when the file is
present we ignore the SKIP_WORKTREE bit).
Note that this commit, like many touching documentation, is best viewed
with the `--color-words` option to diff/log.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fix is short (~30 lines), but the description is not. Sorry.
There is a set of problems caused by files in what I'll refer to as the
"present-despite-SKIP_WORKTREE" state. This commit aims to not just fix
these problems, but remove the entire class as a possibility -- for
those using sparse checkouts. But first, we need to understand the
problems this class presents. A quick outline:
* Problems
* User facing issues
* Problem space complexity
* Maintenance and code correctness challenges
* SKIP_WORKTREE expectations in Git
* Suggested solution
* Pros/Cons of suggested solution
* Notes on testcase modifications
=== User facing issues ===
There are various ways for users to get files to be present in the
working copy despite having the SKIP_WORKTREE bit set for that file in
the index. This may come from:
* various git commands not really supporting the SKIP_WORKTREE bit[1,2]
* users grabbing files from elsewhere and writing them to the worktree
(perhaps even cached in their editor)
* users attempting to "abort" a sparse-checkout operation with a
not-so-early Ctrl+C (updating $GIT_DIR/info/sparse-checkout and the
working tree is not atomic)[3].
Once users have present-despite-SKIP_WORKTREE files, any modifications
users make to these files will be ignored, possibly to users' confusion.
Further:
* these files will degrade performance for the sparse-index case due
to requiring the index to be expanded (see commit 55dfcf9591
("sparse-checkout: clear tracked sparse dirs", 2021-09-08) for why
we try to delete entire directories outside the sparse cone).
* these files will not be updated by by standard commands
(switch/checkout/pull/merge/rebase will leave them alone unless
conflicts happen -- and even then, the conflicted file may be
written somewhere else to avoid overwriting the SKIP_WORKTREE file
that is present and in the way)
* there is nothing in Git that users can use to discover such
files (status, diff, grep, etc. all ignore it)
* there is no reasonable mechanism to "recover" from such a condition
(neither `git sparse-checkout reapply` nor `git reset --hard` will
correct it).
So, not only are users modifications ignored, but the files get
progressively more stale over time. At some point in the future, they
may change their sparseness specification or disable sparse-checkouts.
At that time, all present-despite-SKIP_WORKTREE files will show up as
having lots of modifications because they represent a version from a
different branch or commit. These might include user-made local changes
from days before, but the only way to tell is to have users look through
them all closely.
If these users come to others for help, there will be no logs that
explain the issue; it's just a mysterious list of changes. Users might
adamantly claim (correctly, as it turns out) that they didn't modify
these files, while others presume they did.
[1] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
[2] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFnFpzwGC11TLoLs8YK5yiisA5D5-fFjXnJsbESVDwZsA@mail.gmail.com/
=== Problem space complexity ===
SKIP_WORKTREE has been part of Git for over a decade. Duy did lots of
work on it initially, and several others have since come along and put
lots of work into it. Stolee spent most of 2021 on the sparse-index,
with lots of bugfixes along the way including to non-sparse-index cases
as we are still trying to get sparse checkouts to behave reasonably.
Basically every codepath throughout the treat needs to be aware of an
additional type of file: tracked-but-not-present. The extra type
results in lots of extra testcases and lots of extra code everywhere.
But, the sad thing is that we actually have more than one extra type.
We have tracked, tracked-but-not-present (SKIP_WORKTREE), and
tracked-but-promised-to-not-be-present-but-is-present-anyway
(present-despite-SKIP_WORKTREE). Two types is a monumental amount of
effort to support, and adding a third feels a bit like insanity[4].
[4] Some examples of which can be seen at
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
=== Maintenance and code correctness challenges ===
Matheus' patches to grep stalled for nearly a year, in part because of
complications of how to handle sparse-checkouts appropriately in all
cases[5][6] (with trying to sanely figure out how to sanely handle
present-despite-SKIP_WORKTREE files being one of the complications).
His rm/add follow-ups also took months because of those kinds of
issues[7]. The corner cases with things like submodules and
SKIP_WORKTREE with the addition of present-despite-SKIP_WORKTREE start
becoming really complex[8].
We've had to add ugly logic to merge-ort to attempt to handle
present-despite-SKIP_WORKTREE files[9], and basically just been forced
to give up in merge-recursive knowing full well that we'll sometimes
silently discard user modifications. Despite stash essentially being a
merge, it needed extra code (beyond what was in merge-ort and
merge-recursive) to manually tweak SKIP_WORKTREE bits in order to avoid
a few different bugs that'd result in an early abort with a partial
stash application[10].
[5] See https://lore.kernel.org/git/5f3f7ac77039d41d1692ceae4b0c5df3bb45b74a.1612901326.git.matheus.bernardino@usp.br/#t
and the dates on the thread; also Matheus and I had several
conversations off-list trying to resolve the issues over that time
[6] ...it finally kind of got unstuck after
https://lore.kernel.org/git/CABPp-BGJ_Nvi5TmgriD9Bh6eNXE2EDq2f8e8QKXAeYG3BxZafA@mail.gmail.com/
[7] See for example
https://lore.kernel.org/git/CABPp-BHwNoVnooqDFPAsZxBT9aR5Dwk5D9sDRCvYSb8akxAJgA@mail.gmail.com/#t
and quotes like "The core functionality of sparse-checkout has always
been only partially implemented", a statement I still believe is true
today.
[8] https://lore.kernel.org/git/pull.809.git.git.1592356884310.gitgitgadget@gmail.com/
[9] See commit 66b209b86a ("merge-ort: implement CE_SKIP_WORKTREE
handling with conflicted entries", 2021-03-20)
[10] See commit ba359fd507 ("stash: fix stash application in
sparse-checkouts", 2020-12-01)
=== SKIP_WORKTREE expectations in Git ===
A couple quotes:
* From [11] (before the "sparse-checkout" command existed):
If it needs too many special cases, hacks, and conditionals, then it
is not worth the complexity---if it is easier to write a correct code
by allowing Git to populate working tree files, it is perfectly fine
to do so.
In a sense, the sparse checkout "feature" itself is a hack by itself,
and that is why I think this part should be "best effort" as well.
* From the git-sparse-checkout manual (still present today):
THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
THE FUTURE.
[11] https://lore.kernel.org/git/xmqqbmb1a7ga.fsf@gitster-ct.c.googlers.com/
=== Suggested solution ===
SKIP_WORKTREE was written to allow sparse-checkouts, in particular, as
the name of the option implies, to allow the file to NOT be in the
worktree but consider it to be unchanged rather than deleted.
The suggests a simple solution: present-despite-SKIP_WORKTREE files
should not exist, for those using sparse-checkouts.
Enforce this at index loading time by checking if core.sparseCheckout is
true; if so, check files in the index with the SKIP_WORKTREE bit set to
verify that they are absent from the working tree. If they are present,
unset the bit (in memory, though any commands that write to the index
will record the update).
Users can, of course, can get the SKIP_WORKTREE bit back such as by
running `git sparse-checkout reapply` (if they have ensured the file is
unmodified and doesn't match the specified sparsity patterns).
=== Pros/Cons of suggested solution ===
Pros:
* Solves the user visible problems reported above, which I've been
complaining about for nearly a year but couldn't find a solution to.
* Helps prevent slow performance degradation with a sparse-index.
* Much easier behavior in sparse-checkouts for users to reason about
* Very simple, ~30 lines of code.
* Significantly simplifies some ugly testcases, and obviates the need
to test an entire class of potential issues.
* Reduces code complexity, reasoning, and maintenance. Avoids
disagreements about weird corner cases[12].
* It has been reported that some users might be (ab)using
SKIP_WORKTREE as a let-me-modify-but-keep-the-file-in-the-worktree
mechanism[13, and a few other similar references]. These users know
of multiple caveats and shortcomings in doing so; perhaps not
surprising given the "SKIP_WORKTREE expecations" section above.
However, these users use `git update-index --skip-worktree`, and not
`git sparse-checkout` or core.sparseCheckout=true. As such, these
users would be unaffected by this change and can continue abusing
the system as before.
[12] https://lore.kernel.org/git/CABPp-BH9tju7WVm=QZDOvaMDdZbpNXrVWQdN-jmfN8wC6YVhmw@mail.gmail.com/
[13] https://stackoverflow.com/questions/13630849/git-difference-between-assume-unchanged-and-skip-worktree
Cons:
* When core.sparseCheckout is enabled, this adds a performance cost to
reading the index. I'll defer discussion of this cost to a subsequent
patch, since I have some optimizations to add.
=== Notes on testcase modifications ===
The good:
* t1011: Compare to two cases above it ('read-tree will not throw away
dirty changes, non-sparse'); since the file is present, it should
match the non-sparse case now
* t1092: sparse-index & sparse-checkout now match full-worktree
behavior in more cases! Yaay for consistency!
* t6428, t7012: look at how much simpler the tests become! Merge and
stash can just fail early telling the user there's a file in the
way, instead of not noticing until it's about to write a file and
then have to implement sudden crash avoidance. Hurray for sanity!
* t7817: sparse behavior better matches full tree behavior. Hurray
for sanity!
The confusing:
* t3705: These changes were ONLY needed on Windows, but they don't
hurt other platforms. Let's discuss each individually:
* core.sparseCheckout should be false by default. Nothing in this
testcase toggles that until many, many tests later. However,
early tests (#5 in particular) were testing `update-index
--skip-worktree` behavior in a non-sparse-checkout, but the
Windows tests in CI were behaving as if core.sparseCheckout=true
had been specified somewhere. I do not have access to a Windows
machine. But I just manually did what should have been a no-op
and turned the config off. And it fixed the test.
* I have no idea why the leftover .gitattributes file from this
test was causing failures for test #18 on Windows, but only with
these changes of mine. Test #18 was checking for empty stderr,
and specifically wanted to know that some error completely
unrelated to file endings did not appear. The leftover
.gitattributes file thus caused some spurious stderr unrelated to
the thing being checked. Since other tests did not intend to
test normalization, just proactively remove the .gitattributes
file. I'm certain this is cleaner and better, I'm just unsure
why/how this didn't trigger problems before.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For sparse-checkouts, we don't want unpack-trees to error out on files
that are missing from the worktree, so there has traditionally been
logic to make it skip the verify_uptodate() check for these.
Unfortunately, it was skipping the verify_uptodate() check for files
that were expected to *become* SKIP_WORKTREE. For files that were not
already SKIP_WORKTREE, that can cause us to later delete the file in
apply_sparse_checkout(). Only skip the check for files that were
already SKIP_WORKTREE as well to avoid lightly discarding important
changes users may have made to files.
Note 1: unpack-trees.c is already a bit complex, and the logic around
CE_SKIP_WORKTREE and CE_NEW_SKIP_WORKTREE in that file are no exception.
I also tried just replacing CE_NEW_SKIP_WORKTREE with CE_SKIP_WORKTREE
in the verify_uptodate() check instead of checking for both flags, and
found that it also fixed this bug and passed all the tests. I also
attempted to devise a few testcases that might trip either variant of my
fix and was unable to find any problems. It may be that just checking
CE_SKIP_WORKTREE is a better fix, but I'm not sure. I thought it
was a bit safer to strictly reduce the number of cases where we skip the
up-to-date check rather than just toggling which kind of cases skip it,
and thus went with the current variant of the fix.
Note 2: I also wondered if verify_absent() might have a similar bug, but
despite my attempts to try to devise a testcase that would trigger such
a thing, I couldn't find any problematic testcases. Thus, this patch
makes no attempt to apply similar changes to verify_absent() and
verify_absent_if_directory().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a user has a file with local modifications that is not marked as
SKIP_WORKTREE, but the sparsity patterns are such that it should be
marked that way, and the user then invokes a command like
* git checkout -q HEAD^
or
* git read-tree -mu HEAD^
Then the file will be deleted along with all the users' modifications.
Add a testcase demonstrating this problem.
Note: This bug only triggers if something other than 'HEAD' is given;
if the commands above had specified 'HEAD', then the users' file would
be left alone.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"pull --rebase" internally uses the merge machinery when the other
history is a descendant of ours (i.e. perform fast-forward). This
came from [1], where the discussion was started from a feature
request to do so. It is a bit hard to read the rationale behind it
in the discussion, but it seems that it was an established fact for
everybody involved that does not even need to be mentioned that
fast-forwarding done with "rebase" was much undesirable than done
with "merge", and more importantly, the result left by "merge" is as
good as (or better than) that by "rebase".
Except for one thing. Because "git merge" does not (and should not)
honor rebase.autostash, "git pull" needs to read it and forward it
when we use "git merge" as a (hopefully better) substitute for "git
rebase" during the fast-forwarding. But we forgot to do so (we only
add "--[no-]autostash" to the "git merge" command when "git pull" itself
was invoked with "--[no-]autostash" command line option.
Make sure "git merge" is run with "--autostash" when
rebase.autostash is set and used to fast-forward the history on
behalf of "git rebase". Incidentally this change also takes care of
the case where
- "git pull --rebase" (without other command line options) is run
- "rebase.autostash" is not set
- The history fast-forwards
In such a case, "git merge" is run with an explicit "--no-autostash"
to prevent it from honoring merge.autostash configuration, which is
what we want. After all, we want the "git merge" to pretend as if
it is "git rebase" while being used for this purpose.
[1] https://lore.kernel.org/git/xmqqa8cfbkeq.fsf_-_@gitster.mtv.corp.google.com/
Reported-by: Tilman Vogel <tilman.vogel@web.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* vd/sparse-clean-etc:
update-index: reduce scope of index expansion in do_reupdate
update-index: integrate with sparse index
update-index: add tests for sparse-checkout compatibility
checkout-index: integrate with sparse index
checkout-index: add --ignore-skip-worktree-bits option
checkout-index: expand sparse checkout compatibility tests
clean: integrate with sparse index
reset: reorder wildcard pathspec conditions
reset: fix validation in sparse index test
Replace unconditional index expansion in 'do_reupdate()' with one scoped to
only where a full index is needed. A full index is only required in
'do_reupdate()' when a sparse directory in the index differs from HEAD; in
that case, the index is expanded and the operation restarted.
Because the index should only be expanded if a sparse directory is modified,
add a test ensuring the index is not expanded when differences only exist
within the sparse cone.
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable use of the sparse index with `update-index`. Most variations of
`update-index` work without explicitly expanding the index or making any
other updates in or outside of `update-index.c`.
The one usage requiring additional changes is `--cacheinfo`; if a file
inside a sparse directory was specified, the index would not be expanded
until after the cache tree is invalidated, leading to a mismatch between the
index and cache tree. This scenario is handled by rearranging
`add_index_entry_with_check`, allowing `index_name_stage_pos` to expand the
index *before* attempting to invalidate the relevant cache tree path,
avoiding cache tree/index corruption.
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce tests for a variety of `git update-index` use cases, including
performance scenarios. Tests are intended to exercise `update-index` with
options that change the commands interaction with the index (e.g.,
`--again`) and with files/directories inside and outside a sparse checkout
cone.
Of note is that these tests clearly establish the behavior of `git
update-index --add` with untracked, outside-of-cone files. Unlike `git add`,
which fails with an error when provided with such files, `update-index`
succeeds in adding them to the index. Additionally, the `skip-worktree` flag
is *not* automatically added to the new entry. Although this is pre-existing
behavior, there are a couple of reasons to avoid changing it in favor of
consistency with e.g. `git add`:
* `update-index` is low-level command for modifying the index; while it can
perform operations similar to those of `add`, it traditionally has fewer
"guardrails" preventing a user from doing something they may not want to
do (in this case, adding an outside-of-cone, non-`skip-worktree` file to
the index)
* `update-index` typically only exits with an error code if it is incapable
of performing an operation (e.g., if an internal function call fails);
adding a new file outside the sparse checkout definition is still a valid
operation, albeit an inadvisable one
* `update-index` does not implicitly set flags (e.g., `skip-worktree`) when
creating new index entries with `--add`; if flags need to be updated,
options like `--[no-]skip-worktree` allow a user to intentionally set them
All this to say that, while there are valid reasons to consider changing the
treatment of outside-of-cone files in `update-index`, there are also
sufficient reasons for leaving it as-is.
Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add repository settings to allow usage of the sparse index.
When using the `--all` option, sparse directories are ignored by default due
to the `skip-worktree` flag, so there is no need to expand the index. If
`--ignore-skip-worktree-bits` is specified, the index is expanded in order
to check out all files.
When checking out individual files, existing behavior in a full index is to
exit with an error if a directory is specified (as the directory name will
not match an index entry). However, it is possible in a sparse index to
match a directory name to a sparse directory index entry, but checking out
that sparse directory still results in an error on checkout. To reduce some
potential confusion for users, `checkout_file(...)` explicitly exits with an
informative error if provided with a sparse directory name. The test
corresponding to this scenario verifies the error message, which now differs
between sparse index and non-sparse index checkouts.
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update `checkout-index` to no longer refresh files that have the
`skip-worktree` bit set, exiting with an error if `skip-worktree` filenames
are directly provided to `checkout-index`. The newly-added
`--ignore-skip-worktree-bits` option provides a mechanism to replicate the
old behavior, checking out *all* files specified (even those with
`skip-worktree` enabled).
The ability to toggle whether files should be checked-out based on
`skip-worktree` already exists in `git checkout` and `git restore` (both of
which have an `--ignore-skip-worktree-bits` option). The change to, by
default, ignore `skip-worktree` files is especially helpful for
sparse-checkout; it prevents inadvertent creation of files outside the
sparse definition on disk and eliminates the need to expand a sparse index
when using the `--all` option.
Internal usage of `checkout-index` in `git stash` and `git filter-branch` do
not make explicit use of files with `skip-worktree` enabled, so
`--ignore-skip-worktree-bits` is not added to them.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests to cover `checkout-index`, with a focus on cases interesting in a
sparse checkout (e.g., files specified outside sparse checkout definition).
New tests are intended to serve as a baseline for existing and/or expected
behavior and performance when integrating `checkout-index` with the sparse
index. Note that the test 'checkout-index --all' is marked as
'test_expect_failure', indicating that `update-index --all` will be modified
in a subsequent patch to behave as the test expects.
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove full index requirement for `git clean` and test to ensure the index
is not expanded in `git clean`. Add to existing test for `git clean` to
verify cleanup of untracked files in sparse directories is consistent
between sparse index and non-sparse index checkouts.
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rearrange conditions in method determining whether index expansion is
necessary when a pathspec is specified for `git reset`, placing less
expensive condition first. Additionally, add details & examples to related
code comments to help with readability.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update t1092 test 'reset with pathspecs outside sparse definition' to verify
index contents. The use of `rev-parse` verifies the contents of HEAD, not
the index, providing no real validation of the reset results. Conversely,
`ls-files` reports the contents of the index (OIDs, flags, filenames), which
are then compared across checkouts to ensure compatible index states.
Fixes 741a2c9ffa (reset: expand test coverage for sparse checkouts,
2021-09-27).
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code added in 1ae2b8cda8 (reftable: add merged table view,
2021-10-07) to consistently use the "uint64_t" type. These "min" and
"max" variables get passed in the body of this function to a function
whose prototype is:
[...] reftable_writer_set_limits([...], uint64_t min, uint64_t max
This avoids the following warning on SunCC 12.5 on
gcc211.fsffrance.org:
"reftable/merged_test.c", line 27: warning: initializer does not fit or is out of range: 0xffffffff
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apparently, the IBM xlc compiler doesn't like this.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A test case wants to append stuff to a loose object file to ensure
that this kind of corruption is detected. To make a read-only loose
object file writable with chmod, it is not necessary to also make
it executable. Replace the bitmask 755 with the instruction +w to
request only the write bit and to also heed the umask. And get rid
of a POSIXPERM prerequisite, which is unnecessary for the test.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a logic error in refs_resolve_ref_unsafe() introduced in a recent
series of mine to abstract the refs API away from errno. See
96f6623ada (Merge branch 'ab/refs-errno-cleanup', 2021-11-29)for that
series.
In that series introduction of "failure_errno" to
refs_resolve_ref_unsafe came in ef18119dec (refs API: add a version
of refs_resolve_ref_unsafe() with "errno", 2021-10-16). There we'd set
"errno = 0" immediately before refs_read_raw_ref(), and then set
"failure_errno" to "errno" if errno was non-zero afterwards.
Then in the next commit 8b72fea7e9 (refs API: make
refs_read_raw_ref() not set errno, 2021-10-16) we started expecting
"refs_read_raw_ref()" to set "failure_errno". It would do that if
refs_read_raw_ref() failed, but it wouldn't be the same errno.
So we might set the "errno" here to any arbitrary bad value, and end
up e.g. returning NULL when we meant to return the refname from
refs_resolve_ref_unsafe(), or the other way around. Instrumenting this
code will reveal cases where refs_read_raw_ref() will fail, and
"errno" and "failure_errno" will be set to different values.
In practice I haven't found a case where this scary bug changed
anything in practice. The reason for that is that we'll not care about
the actual value of "errno" here per-se, but only whether:
1. We have an errno
2. If it's one of ENOENT, EISDIR or ENOTDIR. See the adjacent code
added in a1c1d8170d (refs_resolve_ref_unsafe: handle d/f
conflicts for writes, 2017-10-06)
I.e. if we clobber "failure_errno" with "errno", but it happened to be
one of those three, and we'll clobber it with another one of the three
we were OK.
Perhaps there are cases where the difference ended up mattering, but I
haven't found them. Instrumenting the test suite to fail if "errno"
and "failure_errno" are different shows a lot of failures, checking if
they're different *and* one is but not the other is outside that list
of three "errno" values yields no failures.
But let's fix the obvious bug. We should just stop paying attention to
"errno" in refs_resolve_ref_unsafe(). In addition let's change the
partial resetting of "errno" in files_read_raw_ref() to happen just
before the "return", to ensure that any such bug will be more easily
spotted in the future.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Translate new messages
- Translate the word 'cone' instead of leaving it verbatim
(in the context of sparse checkout)
- Make translations of 'failed to' consistent
Signed-off-by: Fangyi Zhou <me@fangyi.io>
Reviewed-by: Teng Long <dyroneteng@gmail.com>
Reviewed-by: 依云 <lilydjwg@gmail.com>
Reviewed-by: Jiang Xin <worldhello.net@gmail.com>
Some lockfile code called free() in signal-death code path, which
has been corrected.
* ps/lockfile-cleanup-fix:
fetch: fix deadlock when cleaning up lockfiles in async signals
"git merge $signed_tag" started to drop the tag message from the
default merge message it uses by accident, which has been corrected.
* fs/ssh-signing-key-lifetime:
fmt-merge-msg: prevent use-after-free with signed tags
Notably, it lacks uncompress2(); use the fallback we ship in our
tree instead.
Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
shift count being exactly at 7-bit smaller than the long is OK; on
32-bit architecture, shift count starts at 4 and goes through 11, 18
and 25, at which point the guard triggers one iteration too early.
Reported-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To test for a key that is completely unknown to the keyring we need one
to sign the commit with. This was done by generating a new key and not
add it into the keyring. To avoid the key generation overhead and
problems where GPG did hang in CI during it, switch GNUPGHOME to the
empty $GNUPGHOME_NOT_USED instead, therefore making all used keys unknown
for this single `verify-commit` call.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is useful to know when a branch first diverged in history
from some integration branch in order to be able to enumerate
the user's local changes. However, these local changes can
include arbitrary merges, so it is necessary to ignore this
merge structure when finding the divergence point.
In order to do this, teach the "rev-list" family to accept
"--exclude-first-parent-only", which restricts the traversal
of excluded commits to only follow first parent links.
-A-----E-F-G--main
\ / /
B-C-D--topic
In this example, the goal is to return the set {B, C, D} which
represents a topic branch that has been merged into main branch.
`git rev-list topic ^main` will end up returning no commits
since excluding main will end up traversing the commits on topic
as well. `git rev-list --exclude-first-parent-only topic ^main`
however will return {B, C, D} as desired.
Add docs for the new flag, and clarify the doc for --first-parent
to indicate that it applies to traversing the set of included
commits only.
Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The message introduced by commit 593a2a5d06 (branch: protect branches
checked out in all worktrees, 2021-12-01) is missing a space in the
first line, add it.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The C reimplementation of "add -p" fails to split the last hunk in a
file if hunk ends with an addition or deletion without any post context
line unless it is the last file to be processed.
To determine whether a hunk can be split a counter is incremented each
time a context line follows an insertion or deletion. If at the end of
the hunk the value of this counter is greater than one then the hunk
can be split into that number of smaller hunks. If the last hunk in a
file ends with an insertion or deletion then there is no following
context line and the counter will not be incremented. This case is
already handled at the end of the loop where counter is incremented if
the last hunk ended with an insertion or deletion. Unfortunately there
is no similar check between files (likely because the perl version
only ever parses one diff at a time). Fix this by checking if the last
hunk ended with an insertion or deletion when we see the diff header
of a new file and extend the existing regression test.
Reproted-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clean up some test constructs in preparation for extending the tests
in the next commit. There are three small changes, I've grouped them
together as they're so small it didn't seem worth creating three
separate commits.
1 - "cat file | sed expression" is better written as
"sed expression file".
2 - Follow our usual practice of redirecting the output of git
commands to a file rather than piping it into another command.
3 - Use test_write_lines rather than 'printf "%s\n"'.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for the eol attribute states that it is "effectively
setting the text attribute". However, this implies that it forces the
text attribute to always be set, which has not been the case since
6523728499 ("convert: unify the "auto" handling of CRLF", 2016-06-28).
Let's avoid confusing users (and the present author when trying to
describe Git's behavior to others) by clearly documenting in which
cases the "eol" attribute has effect.
Specifically, the attribute always has an effect unless the file is
explicitly set as -text, or the file is set as text=auto and the file is
detected as binary.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now, it isn't clear what the behavior is when the eol attribute is
set in .gitattributes but the text attribute is not. Let's add some
tests to document this behavior in our code, which happens to be that
the behavior is as if we set the text attribute implicitly. This will
make sure we don't accidentally change the behavior, which somebody is
probably relying on, and serve as documentation to developers.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a typo in my recent 03dc51fe849 (cat-file: fix remaining usage
bugs, 2021-10-09).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix up whitespace issues around "(... | ...)" in the SYNOPSIS and
usage. These were introduced in ab/cat-file series. See
e145efa6059 (Merge branch 'ab/cat-file' into next, 2022-01-05). In
particular 57d6a1cf96, 5a40417876 and 97fe725075 in that series.
We'll now correctly emit this usage output:
$ git cat-file -h
usage: git cat-file <type> <object>
or: git cat-file (-e | -p) <object>
or: git cat-file (-t | -s) [--allow-unknown-type] <object>
[...]
Before this the last line of that would be inconsistent with the
preceding "(-e | -p)":
or: git cat-file ( -t | -s ) [--allow-unknown-type] <object>
Reported-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Includes extra translation for branch.c:313 which has an
upstream whitespace error that is expected to be fixed.
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
Switching out manual arg parsing for the parse-options API for the
expire and delete subcommands.
Move explicit_expiry flag into cmd_reflog_expire_cb struct so callbacks
can set both the value of the timestamp as well as the explicit_expiry
flag.
Signed-off-by: "John Cai" <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When merging a signed tag, fmt_merge_msg_sigs() is responsible for
populating the body of the merge message with the names of the signed
tags, their signatures, and the validity of those signatures.
In 02769437e1 (ssh signing: use sigc struct to pass payload,
2021-12-09), check_signature() was taught to pass the object payload via
the sigc struct instead of passing the payload buffer separately.
In effect, 02769437e1 causes buf, and sigc.payload to point at the same
region in memory. This causes a problem for fmt_tag_signature(), which
wants to read from this location, since it is freed beforehand by
signature_check_clear() (which frees it via sigc's `payload` member).
That makes the subsequent use in fmt_tag_signature() a use-after-free.
As a result, merge messages did not contain the body of any signed tags.
Luckily, they tend not to contain garbage, either, since the result of
strstr()-ing the object buffer in fmt_tag_signature() is guarded:
const char *tag_body = strstr(buf, "\n\n");
if (tag_body) {
tag_body += 2;
strbuf_add(tagbuf, tag_body, buf + len - tag_body);
}
Unfortunately, the tests in t6200 did not catch this at the time because
they do not search for the body of signed tags in fmt-merge-msg's
output.
Resolve this by waiting to call signature_check_clear() until after its
contents can be safely discarded. Harden ourselves against any future
regressions in this area by making sure we can find signed tag messages
in the output of fmt-merge-msg, too.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git stash apply" forgot to attempt restoring untracked files when
it failed to restore changes to tracked ones.
* en/stash-df-fix:
stash: do not return before restoring untracked files
Similar message templates have been consolidated so that
translators need to work on fewer number of messages.
* ja/i18n-similar-messages:
i18n: turn even more messages into "cannot be used together" ones
i18n: ref-filter: factorize "%(foo) atom used without %(bar) atom"
i18n: factorize "--foo outside a repository"
i18n: refactor "unrecognized %(foo) argument" strings
i18n: factorize "no directory given for --foo"
i18n: factorize "--foo requires --bar" and the like
i18n: tag.c factorize i18n strings
i18n: standardize "cannot open" and "cannot read"
i18n: turn "options are incompatible" into "cannot be used together"
i18n: refactor "%s, %s and %s are mutually exclusive"
i18n: refactor "foo and bar are mutually exclusive"
A corner case bug in the ort merge strategy has been corrected.
* en/merge-ort-renorm-with-rename-delete-conflict-fix:
merge-ort: fix bug with renormalization and rename/delete conflicts
Extend the guidance to choose the base commit to build your work
on, and hint/nudge contributors to read others' changes.
* jc/doc-submitting-patches-choice-of-base:
SubmittingPatchs: clarify choice of base and testing
The color palette used by "git grep" has been updated to match that
of GNU grep.
* lh/use-gnu-color-in-grep:
grep: align default colors with GNU grep ones
"git -c branch.autosetupmerge=inherit branch new old" makes "new"
to have the same upstream as the "old" branch, instead of marking
"old" itself as its upstream.
* js/branch-track-inherit:
config: require lowercase for branch.*.autosetupmerge
branch: add flags and config to inherit tracking
branch: accept multiple upstream branches for tracking
Code clean-up to hide vreportf() from public API.
* ab/usage-die-message:
config API: use get_error_routine(), not vreportf()
usage.c + gc: add and use a die_message_errno()
gc: return from cmd_gc(), don't call exit()
usage.c API users: use die_message() for error() + exit 128
usage.c API users: use die_message() for "fatal :" + exit 128
usage.c: add a die_message() routine
"git apply --3way" bypasses the attempt to do a three-way
application in more cases to address the regression caused by the
recent change to use direct application as a fallback.
* jz/apply-3-corner-cases:
git-apply: skip threeway in add / rename cases
Assorted fixlets in reftable code.
* hn/reftable-fixes:
reftable: support preset file mode for writing
reftable: signal overflow
reftable: fix typo in header
Code refactoring in the reflog part of refs API.
* ab/reflog-prep:
reflog + refs-backend: move "verbose" out of the backend
refs files-backend: assume cb->newlog if !EXPIRE_REFLOGS_DRY_RUN
reflog: reduce scope of "struct rev_info"
reflog expire: don't use lookup_commit_reference_gently()
reflog expire: refactor & use "tip_commit" only for UE_NORMAL
reflog expire: use "switch" over enum values
reflog: change one->many worktree->refnames to use a string_list
reflog expire: narrow scope of "cb" in cmd_reflog_expire()
reflog delete: narrow scope of "cmd" passed to count_reflog_ent()
"git stash" by default triggers its "push" action, but its
implementation also made "git stash -h" to show short help only for
"git stash push", which has been corrected.
* ab/do-not-limit-stash-help-to-push:
stash: don't show "git stash push" usage on bad "git stash" usage
Debugging support for refs API.
* hn/refs-debug-update:
refs: centralize initialization of the base ref_store.
refs: print error message in debug output
refs: pass gitdir to packed_ref_store_create
"git fetch" and "git pull" are now declared sparse-index clean.
Also "git ls-files" learns the "--sparse" option to help debugging.
* ds/fetch-pull-with-sparse-index:
test-read-cache: remove --table, --expand options
t1091/t3705: remove 'test-tool read-cache --table'
t1092: replace 'read-cache --table' with 'ls-files --sparse'
ls-files: add --sparse option
fetch/pull: use the sparse index
Test updates.
* hn/ref-api-tests-update:
t7004: use "test-tool ref-store" for reflog inspection
t7004: create separate tags for different tests
t5550: require REFFILES
t5540: require REFFILES
Perf tests were run with end-user's shell, but it has been
corrected to use the shell specified by $TEST_SHELL_PATH.
* ja/perf-use-specified-shell:
t/perf: do not run tests in user's $SHELL
Use of certain "git rev-list" options with "git fast-export"
created nonsense results (the worst two of which being "--reverse"
and "--invert-grep --grep=<foo>"). The use of "--first-parent" is
made to behave a bit more sensible than before.
* ws/fast-export-with-revision-options:
fast-export: fix surprising behavior with --first-parent
The way "git p4" shows file sizes in its output has been updated to
use human-readable units.
* jh/p4-human-unit-numbers:
git-p4: show progress as an integer
git-p4: print size values in appropriate units
Certain sparse-checkout patterns that are valid in non-cone mode
led to segfault in cone mode, which has been corrected.
* ds/sparse-checkout-malformed-pattern-fix:
sparse-checkout: refuse to add to bad patterns
sparse-checkout: fix OOM error with mixed patterns
sparse-checkout: fix segfault on malformed patterns
There are two identical declarations of `ensure_full_index()` in
cache.h.
Commit 3964fc2aae ("sparse-index: add guard to ensure full index",
2021-03-30) provided an empty implementation of `ensure_full_index()`,
declaring it in a new file sparse-index.h. When commit 4300f8442a
("sparse-index: implement ensure_full_index()", 2021-03-30) fleshed out
the implementation, it added an identical declaration to cache.h.
Then 118a2e8bde ("cache: move ensure_full_index() to cache.h",
2021-04-01) favored having the declaration in cache.h. Because of the
double declaration, at that point we could have just dropped the one in
sparse-index.h, but instead it got moved to cache.h.
As a result, cache.h contains the exact same function declaration twice.
Drop the one under "/* Name hashing */", in favor of the one under
"/* Initialize and use the cache information */".
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using a buffer limited to 2048 is unnecessarily limiting. Switch to
using a string buffer to read in stdin for annotation.
Signed-off-by: "John Cai" <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a --annotate-stdin that is functionally equivalent of --stdin.
--stdin does not behave as --stdin in other subcommands, such as
pack-objects whereby it takes one argument per line. Since --stdin can
be a confusing and misleading name, rename it to --annotate-stdin.
This change adds a warning to --stdin warning that it will be removed in
the future.
Signed-off-by: "John Cai" <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Christoph Reiter reported on the Git for Windows issue tracker[1], that
mingw_strftime() imports strftime() from ucrtbase.dll with the wrong
calling convention. It should be __cdecl instead of WINAPI, which we
always use in DECLARE_PROC_ADDR().
The MSYS2 project encountered cmake sefaults on x86 Windows caused by
the same issue in the cmake source. [2] There are no known git crashes
that where caused by this, yet, but we should try to prevent them.
We import two other non-WINAPI functions via DECLARE_PROC_ADDR(), too.
* NtSetSystemInformation() (NTAPI)
* GetUserNameExW() (SEC_ENTRY)
NTAPI, SEC_ENTRY and WINAPI are all ususally defined as __stdcall,
but there are circumstances where they're defined differently.
Teach DECLARE_PROC_ADDR() about calling conventions and be explicit
about when we want to use which calling convention.
Import winnt.h for the definition of NTAPI and sspi.h for SEC_ENTRY
near their respective only users.
[1] https://github.com/git-for-windows/git/issues/3560
[2] https://github.com/msys2/MINGW-packages/issues/10152
Reported-By: Christoph Reiter <reiter.christoph@gmail.com>
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Like in the previous patch for compat/qsort_s.c, remove the optimization
of using an on-stack buffer to avoid small allocations. This ensures
maximum alignment for the array elements and simplifies the code a bit.
The performance impact for the current callers is unlikely to be
noticeable:
* compat/mingw.c::make_environment_block() uses ALLOC_ARRAY and
ALLOC_GROW several times already, so another allocation of up to 1KB
should not matter much.
* diffcore-rename.c::diffcore_rename_extended() is called once per diff
or twice per merge, and those require allocations for each object and
more already.
* merge-ort.c::detect_and_process_renames() is called once per merge.
It's responsible for the two per-merge diffcore_rename_extended()
calls mentioned above as well, though. So this is possibly the most
impacted caller. Per-object allocations are likely to dwarf the
additional small allocations in git_stable_qsort(), though.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new hook.h library has replaced all run-command.h hook-related
functionality. So let's delete this dead code.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the push-to-checkout hook away from run-command.h to and over to
the new hook.h library.
This removes the last direct user of run_hook_le(), so we could remove
that function now, but let's leave that to a follow-up cleanup commit.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the post-index-change hook away from run-command.h to and over to
the new hook.h library.
This removes the last direct user of "run_hook_ve()" outside of
run-command.c ("run_hook_le()" still uses it). So we can make the
function static now. A subsequent commit will remove this code
entirely when "run_hook_le()" itself goes away.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of duplicating the behavior of run-command.h:run_hook_le() in
Python, we can directly call 'git hook run'. We emulate the existence
check with the --ignore-missing flag.
We're dropping the "verbose" handling added in 9f59ca4d6a (git-p4:
create new function run_git_hook, 2020-02-11), those who want
diagnostic output about how hooks are run are now able to get that via
e.g. the trace2 facility and GIT_TRACE=1.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "sendmail-validate" hook to be run via the "git hook run"
wrapper instead of via a direct invocation.
This is the smallest possibly change to get "send-email" using "git
hook run". We still check the hook itself with "-x", and set a
"GIT_DIR" variable, both of which are asserted by our tests. We'll
need to get rid of this special behavior if we start running N hooks,
but for now let's be as close to bug-for-bug compatible as possible.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For certain one-shot hooks we'd like to optimistically run them, and
not complain if they don't exist.
This was already supported by the underlying hook.c library, but had
not been exposed via "git hook run". The command version of this will
be used by send-email in a subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the running of the 'post-checkout' hook away from run-command.h
to the new hook.h library in builtin/worktree.c. For this special case
we need a change to the hook API to teach it to run the hook from a
given directory.
We cannot skip the "absolute_path" flag and just check if "dir" is
specified as we'd then fail to find our hook in the new dir we'd
chdir() to. We currently don't have a use-case for running a hook not
in our "base" repository at a given absolute path, so let's have "dir"
imply absolute_path(find_hook(hook_name)).
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the running of the 'post-checkout' hook away from run-command.h
to the new hook.h library, except in the case of
builtin/worktree.c. That special-case will be handled in a subsequent
commit.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the pre-rebase hook away from run-command.h to and over to the
new hook.h library.
Since this hook needs arguments introduce a run_hooksl() wrapper, like
run_hooks(), but it takes varargs.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a run_hooks_l() wrapper, we'll use it in subsequent commits for
the simple cases of wanting to run a single hook under a given name
along with a list of arguments.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach pre-applypatch and post-applypatch to use the hook.h library
instead of the run-command.h library.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the pre-auto-gc hook away from run-command.h to and over to the
new hook.h library. This uses the new run_hooks() wrapper.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a run_hooks() wrapper, we'll use it in subsequent commits for the
simple cases of wanting to run a single hook under a given name,
without providing options such as "env" or "args".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to enable hooks to be run as an external process, by a
standalone Git command, or by tools which wrap Git, provide an external
means to run all configured hook commands for a given hook event.
Most of our hooks require more complex functionality than this, but
let's start with the bare minimum required to support our simplest
hooks.
In terms of implementation the usage_with_options() and "goto usage"
pattern here mirrors that of
builtin/{commit-graph,multi-pack-index}.c.
Some of the implementation here, such as a function being named
run_hooks_opt() when it's tasked with running one hook, to using the
run_processes_parallel_tr2() API to run with jobs=1 is somewhere
between a bit odd and and an overkill for the current features of this
"hook run" command and the hook.[ch] API.
This code will eventually be able to run multiple hooks declared in
config in parallel, by starting out with these names and APIs we
reduce the later churn of renaming functions, switching from the
run_command() to run_processes_parallel_tr2() API etc.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The compatibility definition for qsort_s() uses "char buffer[1024]"
on the stack to avoid making malloc() calls for small temporary
space, which essentially hand-rolls alloca().
But the elements of the array being sorted may have alignment needs
more strict than what an array of bytes may have. &buf[0] may be
word aligned, but using the address as if it stores the first
element of an array of a struct, whose first member may need to be
aligned on double-word boundary, would be a no-no.
We could use xalloca() from git-compat-util.h, or alloca() directly
on platforms with HAVE_ALLOCA_H, but let's try using unconditionally
xmalloc() before we know the performance characteristics of the
callers.
It may not make much of an argument to inspect the current callers
and say "it shouldn't matter to any of them", but anyway:
* The one in object-name.c is used to sort potential matches to a
given ambiguous object name prefix in the error path;
* The one in pack-write.c is done once per a pack .idx file being
written to create the reverse index, so (1) the cost of malloc()
overhead is dwarfed by the cost of the packing operation, and (2)
the number of entries being sorted is the number of objects in a
pack;
* The one in ref-filter.c is used by "branch --list", "tag --list",
and "for-each-ref", only once per operation. We sort an array of
pointers with entries, each corresponding to a ref that is shown.
* The one in string-list.c is used by sort_string_list(), which is
way too generic to assume any access patterns, so it may or may
not matter, but I do not care too much ;-)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching packfiles, we write a bunch of lockfiles for the packfiles
we're writing into the repository. In order to not leave behind any
cruft in case we exit or receive a signal, we register both an exit
handler as well as signal handlers for common signals like SIGINT. These
handlers will then unlink the locks and free the data structure tracking
them. We have observed a deadlock in this logic though:
(gdb) bt
#0 __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1 0x00007f4932bea2cd in _int_free (av=0x7f4932f2eb20 <main_arena>, p=0x3e3e4200, have_lock=0) at malloc.c:3969
#2 0x00007f4932bee58c in __GI___libc_free (mem=<optimized out>) at malloc.c:2975
#3 0x0000000000662ab1 in string_list_clear ()
#4 0x000000000044f5bc in unlock_pack_on_signal ()
#5 <signal handler called>
#6 _int_free (av=0x7f4932f2eb20 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:4024
#7 0x00007f4932bee58c in __GI___libc_free (mem=<optimized out>) at malloc.c:2975
#8 0x000000000065afd5 in strbuf_release ()
#9 0x000000000066ddb9 in delete_tempfile ()
#10 0x0000000000610d0b in files_transaction_cleanup.isra ()
#11 0x0000000000611718 in files_transaction_abort ()
#12 0x000000000060d2ef in ref_transaction_abort ()
#13 0x000000000060d441 in ref_transaction_prepare ()
#14 0x000000000060e0b5 in ref_transaction_commit ()
#15 0x00000000004511c2 in fetch_and_consume_refs ()
#16 0x000000000045279a in cmd_fetch ()
#17 0x0000000000407c48 in handle_builtin ()
#18 0x0000000000408df2 in cmd_main ()
#19 0x00000000004078b5 in main ()
The process was killed with a signal, which caused the signal handler to
kick in and try free the data structures after we have unlinked the
locks. It then deadlocks while calling free(3P).
The root cause of this is that it is not allowed to call certain
functions in async-signal handlers, as specified by signal-safety(7).
Next to most I/O functions, this list of disallowed functions also
includes memory-handling functions like malloc(3P) and free(3P) because
they may not be reentrant. As a result, if we execute such functions in
the signal handler, then they may operate on inconistent state and fail
in unexpected ways.
Fix this bug by not calling non-async-signal-safe functions when running
in the signal handler. We're about to re-raise the signal anyway and
will thus exit, so it's not much of a problem to keep the string list of
lockfiles untouched. Note that it's fine though to call unlink(2), so
we'll still clean up the lockfiles correctly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We need to trim \r from the output of 'ssh-keygen -Y find-principals' on
Windows, or we end up calling 'ssh-keygen -Y verify' with a bogus signer
identity. ssh-keygen.c:2841 contains a call to puts(3), which confirms
this hypothesis. Signature verification passes with the fix.
Helped-by: Pedro Martelletto <pedro@yubico.com>
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git update-index --refresh' and '--really-refresh' should force writing
of the index file if racy timestamps have been encountered, as
'git status' already does [1].
Note that calling 'git update-index --refresh' still does not guarantee
that there will be no more racy timestamps afterwards (the same holds
true for 'git status'):
- calling 'git update-index --refresh' immediately after touching and
adding a file may still leave racy timestamps if all three operations
occur within the racy-tolerance (usually 1 second unless USE_NSEC has
been defined)
- calling 'git update-index --refresh' for timestamps which are set into
the future will leave them racy
To guarantee that such racy timestamps will be resolved would require to
wait until the system clock has passed beyond these timestamps and only
then write the index file. Especially for future timestamps, this does
not seem feasible because of possibly long delays/hangs.
[1] https://lore.kernel.org/git/d3dd805c-7c1d-30a9-6574-a7bfcb7fc013@syntevo.com/
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git status" fixes racy timestamps regardless of the worktree being
dirty or not. The new test cases capture this behavior.
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add functions `test_set_magic_mtime` and `test_is_magic_mtime` which can
be used to (re)set the mtime of a file to a predefined ("magic")
timestamp, then perform some operations and finally check for mtime
changes of the file.
The core implementation follows the suggestion from the
mailing list [1].
[1] https://lore.kernel.org/git/xmqqczl5hpaq.fsf@gitster.g/T/#u
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Symlink changes are tracked in a string_list, with the util pointer
value indicating whether a symlink is kept or removed. Using fake
pointer values requires awkward casts. Use one strset for each type of
change instead to simplify and shorten the code.
Original-patch-by: Jessica Clarke <jrtc27@jrtc27.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CalledProcessError is an exception class from the subprocess namespace.
When raising this exception, git-p4 would instantiate CalledProcessError
objects without properly referencing the subprocess namespace causing
the script to fail.
Resolves the issue by replacing CalledProcessError with
subprocess.CalledProcessError.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously the git-p4 script would log commands as stringified
representations of the command parameter, leading to output such as
this:
Reading pipe: ['git', 'config', '--bool', 'git-p4.useclientspec']
Now that all commands are list objects, this patch instead joins the
elements of the list into a single string so the output now looks more
readable:
Reading pipe: git config --bool git-p4.useclientspec
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the majority of the subprocess calls where shell=True was used, it
was only needed to parse command arguments by spaces. In each of these
cases, the commands are now being passed in as lists instead of strings.
This change aids the comprehensibility of the code. Constucting commands
and arguments using strings risks bugs from unsanitized inputs, and the
attendant complexity of properly quoting and escaping command arguments.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, the script would invoke subprocess functions setting the
shell argument True if the command argument was a string, setting it
False otherwise.
This patch replaces this implicit type-driven behaviour with explicit
shell arguments specified by the caller.
The apparent motive for the implict behaviour is that the subprocess
functions do not divide command strings into args. Invoking
subprocess.call("echo hello") will attempt to execute a program by the
name "echo hello". With subprocess.call("echo hello", shell=True), sh
-c "echo hello" will be executed instead, which will cause the command
and args to be divided by spaces.
Eventually, all usage of shell=True, that is not necessary for some
purpose beyond parsing command strings, should be removed. For now,
this patch makes the usage of shells explicit.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as a previous commit, use the new `grep_and_expr()`
to construct the AND node in `compile_pattern_and()`.
Unlike the aforementioned previous commit, this is not about code
duplication, since this is the only spot in grep.c where an AND node is
constructed. Rather, this is about visual consistency with the other
`compile_pattern_xyz()` functions.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two functions that have very similar logic of finding a header
value. find_commit_header, and find_header. We can conslidate the logic
by introducing a new function find_header_mem, which is equivalent to
find_commit_header except it takes a len parameter that determines how
many bytes will be read. find_commit_header and find_header can then both
call find_header_mem.
This reduces duplicate logic, as the logic for finding header values
can now all live in one place.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When constructing an OR node, the grep.c code uses `grep_or_expr()` to
make a node, assign its kind, and set its left and right children. The
same is not done for AND nodes.
Prepare to introduce a new `grep_and_expr()` function which will share
code with the existing implementation of `grep_or_expr()` by introducing
a new function which compiles either kind of binary expression, and
reimplement `grep_or_expr()` in terms of it.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the definition of grep_not_expr() up and use this function in
compile_pattern_not() to simplify the code and reduce duplication.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the definition of grep_or_expr() up and use this function in
compile_pattern_or() to reduce code duplication.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git grep --perl-regexp" failed to match UTF-8 characters with
wildcard when the pattern consists only of ASCII letters, which has
been corrected.
* rs/pcre2-utf:
grep/pcre2: factor out literal variable
grep/pcre2: use PCRE2_UTF even with ASCII patterns
"git log --invert-grep --author=<name>" used to exclude commits
written by the given author, but now "--invert-grep" only affects
the matches made by the "--grep=<pattern>" option.
* rs/log-invert-grep-with-headers:
log: let --invert-grep only invert --grep
The default merge message prepared by "git merge" records the name
of the current branch; the name can be overridden with a new option
to allow users to pretend a merge is made on a different branch.
* jc/merge-detached-head-name:
merge: allow to pretend a merge is made into a different branch
Among some code paths that ask an yes/no question, only one place
gave a prompt that looked different from the others, which has been
updated to match what the others create.
* km/help-prompt-fix:
help: make auto-correction prompt more consistent
"git upload-pack" (the other side of "git fetch") used a 8kB buffer
but most of its payload came on 64kB "packets". The buffer size
has been enlarged so that such a packet fits.
* jv/use-larger-buffer-in-upload-pack:
upload-pack.c: increase output buffer size
Correctness and performance update to "diff --color-moved" feature.
* pw/diff-color-moved-fix:
diff --color-moved: intern strings
diff: use designated initializers for emitted_diff_symbol
diff --color-moved-ws=allow-indentation-change: improve hash lookups
diff --color-moved: stop clearing potential moved blocks
diff --color-moved: shrink potential moved blocks as we go
diff --color-moved: unify moved block growth functions
diff --color-moved: call comparison function directly
diff --color-moved-ws=allow-indentation-change: simplify and optimize
diff: simplify allow-indentation-change delta calculation
diff --color-moved: avoid false short line matches and bad zebra coloring
diff --color-moved=zebra: fix alternate coloring
diff --color-moved: rewind when discarding pmb
diff --color-moved: factor out function
diff --color-moved: clear all flags on blocks that are too short
diff --color-moved: add perf tests
"git am" learns "--empty=(stop|drop|keep)" option to tweak what is
done to a piece of e-mail without a patch in it.
* xw/am-empty:
am: support --allow-empty to record specific empty patches
am: support --empty=<option> to handle empty patches
doc: git-format-patch: describe the option --always
Many git commands that deal with working tree files try to remove a
directory that becomes empty (i.e. "git switch" from a branch that
has the directory to another branch that does not would attempt
remove all files in the directory and the directory itself). This
drops users into an unfamiliar situation if the command was run in
a subdirectory that becomes subject to removal due to the command.
The commands have been taught to keep an empty directory if it is
the directory they were started in to avoid surprising users.
* en/keep-cwd:
t2501: simplify the tests since we can now assume desired behavior
dir: new flag to remove_dir_recurse() to spare the original_cwd
dir: avoid incidentally removing the original_cwd in remove_path()
stash: do not attempt to remove startup_info->original_cwd
rebase: do not attempt to remove startup_info->original_cwd
clean: do not attempt to remove startup_info->original_cwd
symlinks: do not include startup_info->original_cwd in dir removal
unpack-trees: add special cwd handling
unpack-trees: refuse to remove startup_info->original_cwd
setup: introduce startup_info->original_cwd
t2501: add various tests for removing the current working directory
The conditions to choose different definitions of the FLEX_ARRAY
macro for vendor compilers has been simplified to make it easier to
maintain.
* jc/flex-array-definition:
flex-array: simplify compiler-specific workaround
The RCS keyword substitution in "git p4" used to be done assuming
that the contents are UTF-8 text, which can trigger decoding
errors. We now treat the contents as a bytestring for robustness
and correctness.
* jh/p4-rcs-expansion-in-bytestring:
git-p4: resolve RCS keywords in bytes not utf-8
git-p4: open temporary patch file for write only
git-p4: add raw option to read_pipelines
git-p4: pre-compile RCS keyword regexes
git-p4: use with statements to close files after use in patchRCSKeywords
Even if some of these messages are not subject to gettext i18n, this
helps bring a single style of message for a given error type.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
They are all replaced by "the option '%s' requires '%s'", which is a
new string but replaces 17 previous unique strings.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use placeholders for constant tokens. The strings are turned into
"cannot be used together"
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use static strings for constant parts of the sentences. They are all
turned into "cannot be used together".
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-grep shares a lot of options with the standard grep tool.
Like GNU grep, it has coloring options to highlight the matching text.
And like it, it has options to customize the various colored parts.
This patch updates the default git-grep colors to make them match the
GNU grep default ones [1].
It was possible to get the same result by setting the various `color.grep.<slot>`
options, but this patch makes `git grep --color` share the same color scheme as
`grep --color` by default without any user configuration.
[1] https://www.man7.org/linux/man-pages/man1/grep.1.html#ENVIRONMENT
Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit bee8691f19 ("stash: restore untracked files AFTER restoring
tracked files", 2021-09-10), we correctly identified that we should
restore changes to tracked files before attempting to restore untracked
files, and accordingly moved the code for restoring untracked files a
few lines down in do_apply_stash(). Unfortunately, the intervening
lines had some early return statements meaning that we suddenly stopped
restoring untracked files in some cases.
Even before the previous commit, there was another possible issue with
the current code -- a post-stash-apply 'git status' that was intended
to be run after restoring the stash was skipped when we hit a conflict
(or other error condition), which seems slightly inconsistent.
Fix both issues by saving the return status, and letting other
functionality run before returning.
Reported-by: AJ Henderson
Test-case-by: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ab/reflog-prep:
reflog + refs-backend: move "verbose" out of the backend
refs files-backend: assume cb->newlog if !EXPIRE_REFLOGS_DRY_RUN
reflog: reduce scope of "struct rev_info"
reflog expire: don't use lookup_commit_reference_gently()
reflog expire: refactor & use "tip_commit" only for UE_NORMAL
reflog expire: use "switch" over enum values
reflog: change one->many worktree->refnames to use a string_list
reflog expire: narrow scope of "cb" in cmd_reflog_expire()
reflog delete: narrow scope of "cmd" passed to count_reflog_ent()
315a84f9aa (subtree: use commits before rejoins for splits, 2018-09-28)
changed the signature of check_parents from 'check_parents [REV...]'
to 'check_parents PARENTS_EXPR INDENT'. In other words the variable list
of parent revisions became a list embedded in a string. However it
neglected to unpack the list again before sending it to cache_miss,
leading to incorrect calls whenever more than one parent was present.
This is the case whenever a merge commit is processed, with the end
result being a loss of performance from unecessary rechecks.
The indent parameter was subsequently removed in e9525a8a02 (subtree:
have $indent actually affect indentation, 2021-04-27), but the argument
handling bug remained.
For consistency, take multiple arguments in check_parents,
and pass all of them to cache_miss separately.
Signed-off-by: James Limbouris <james@digitalmatter.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "init" and "set" subcommands in "git sparse-checkout" have been
unified for a better user experience and performance.
* en/sparse-checkout-set:
sparse-checkout: remove stray trailing space
clone: avoid using deprecated `sparse-checkout init`
Documentation: clarify/correct a few sparsity related statements
git-sparse-checkout.txt: update to document init/set/reapply changes
sparse-checkout: enable reapply to take --[no-]{cone,sparse-index}
sparse-checkout: enable `set` to initialize sparse-checkout mode
sparse-checkout: split out code for tweaking settings config
sparse-checkout: disallow --no-stdin as an argument to set
sparse-checkout: add sanity-checks on initial sparsity state
sparse-checkout: break apart functions for sparse_checkout_(set|add)
sparse-checkout: pass use_stdin as a parameter instead of as a global
Broken &&-chains in the test scripts have been corrected.
* es/test-chain-lint:
t6000-t9999: detect and signal failure within loop
t5000-t5999: detect and signal failure within loop
t4000-t4999: detect and signal failure within loop
t0000-t3999: detect and signal failure within loop
tests: simplify by dropping unnecessary `for` loops
tests: apply modern idiom for exiting loop upon failure
tests: apply modern idiom for signaling test failure
tests: fix broken &&-chains in `{...}` groups
tests: fix broken &&-chains in `$(...)` command substitutions
tests: fix broken &&-chains in compound statements
tests: use test_write_lines() to generate line-oriented output
tests: simplify construction of large blocks of text
t9107: use shell parameter expansion to avoid breaking &&-chain
t6300: make `%(raw:size) --shell` test more robust
t5516: drop unnecessary subshell and command invocation
t4202: clarify intent by creating expected content less cleverly
t1020: avoid aborting entire test script when one test fails
t1010: fix unnoticed failure on Windows
t/lib-pager: use sane_unset() to avoid breaking &&-chain
New interface into the tmp-objdir API to help in-core use of the
quarantine feature.
* ns/tmp-objdir:
tmp-objdir: disable ref updates when replacing the primary odb
tmp-objdir: new API for creating temporary writable databases
"git format-patch" uses a single rev_info instance and then exits.
Mark the structure with UNLEAK() macro to squelch leak sanitizer.
* jc/unleak-log:
format-patch: mark rev_info with UNLEAK
When in cone mode sparse-checkout, it is unclear how 'git
sparse-checkout add <dir1> ...' should behave if the existing
sparse-checkout file does not match the cone mode patterns. Change the
behavior to fail with an error message about the existing patterns.
Also, all cone mode patterns start with a '/' character, so add that
restriction. This is necessary for our example test 'cone mode: warn on
bad pattern', but also requires modifying the example sparse-checkout
file we use to test the warnings related to recognizing cone mode
patterns.
This error checking would cause a failure further down the test script
because of a test that adds non-cone mode patterns without cleaning them
up. Perform that cleanup as part of the test now.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test to t1091-sparse-checkout-builtin.sh that would result in an
infinite loop and out-of-memory error before this change. The issue
relies on having non-cone-mode patterns while trying to modify the
patterns in cone-mode.
The fix is simple, allowing us to break from the loop when the input
path does not contain a slash, as the "dir" pattern we added does not.
This is only a fix to the critical out-of-memory error. A better
response to such a strange state will follow in a later change.
Reported-by: Calbabreaker <calbabreaker@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Then core.sparseCheckoutCone is enabled, the sparse-checkout patterns are
used to populate two hashsets that accelerate pattern matching. If the user
modifies the sparse-checkout file outside of the 'sparse-checkout' builtin,
then strange patterns can happen, triggering some error checks.
One of these error checks is possible to hit when some special characters
exist in a line. A warning message is correctly written to stderr, but then
there is additional logic that attempts to remove the line from the hashset
and free the data. This leads to a segfault in the 'git sparse-checkout
list' command because it iterates over the contents of the hashset, which is
now invalid.
The fix here is to stop trying to remove from the hashset. In addition,
we disable cone mode sparse-checkout because of the malformed data. This
results in the pattern-matching working with a possibly-slower
algorithm, but using the patterns as they are in the sparse-checkout
file.
This also changes the behavior of commands such as 'git sparse-checkout
list' because the output patterns will be the contents of the
sparse-checkout file instead of the list of directories. This is an
existing behavior for other types of bad patterns.
Add a test that triggers the segfault without the code change.
Reported-by: John Burnett <johnburnett@johnburnett.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We encourage identifying what, among many topics on `next`, exact
topics a new work depends on, instead of building directly on
`next`. Let's clarify this in the documentation.
Developers should know what they are building on top of, and be
aware of which part of the system is currently being worked on.
Encouraging them to make trial merges to `next` and `seen`
themselves will incentivize them to read others' changes and
understand them, eventually helping the developers to coordinate
among themselves and reviewing each others' changes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the cat_one_file() logic that calls get_oid_with_context()
under --textconv and --filters to use the GET_OID_ONLY_TO_DIE flag,
thus improving the error messaging emitted when e.g. <path> is missing
but <rev> is not.
To service the "cat-file" use-case we need to introduce a new
"GET_OID_REQUIRE_PATH" flag, otherwise it would exit early as soon as
a valid "HEAD" was resolved, but in the "cat-file" case being changed
we always need a valid revision and path.
This arguably makes the "<bad rev>:<bad path>" and "<bad
rev>:<good (in HEAD) path>" use cases worse, as we won't quote the
<path> component at the user anymore, but let's just use the existing
logic "git log" et al use for now. We can improve the messaging for
those cases as a follow-up for all callers.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop having GET_OID_ONLY_TO_DIE imply GET_OID_QUIETLY in
get_oid_with_context_1().
The *_DIE flag was added in 33bd598c39 (sha1_name.c: teach lookup
context to get_sha1_with_context(), 2012-07-02), and then later
tweaked in 7243ffdd78 (get_sha1: avoid repeating ourselves via
ONLY_TO_DIE, 2016-09-26).
Everything in that commit makes sense, but only for callers that
expect to fail in an initial call to get_oid_with_context_1(), e.g. as
"git show 0017" does via handle_revision_arg(), and then would like to
call get_oid_with_context_1() again via this
maybe_die_on_misspelt_object_name() function.
In the subsequent commit we'll add a new caller that expects to call
this only once, but who would still like to have all the error
messaging that GET_OID_ONLY_TO_DIE gives it, in addition to any
regular errors.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the usage output emitted on "git cat-file -h" to group related
options, making it clear to users which options go with which other
ones.
The new output is:
Check object existence or emit object contents
-e check if <object> exists
-p pretty-print <object> content
Emit [broken] object attributes
-t show object type (one of 'blob', 'tree', 'commit', 'tag', ...)
-s show object size
--allow-unknown-type allow -s and -t to work with broken/corrupt objects
Batch objects requested on stdin (or --batch-all-objects)
--batch[=<format>] show full <object> or <rev> contents
--batch-check[=<format>]
like --batch, but don't emit <contents>
--batch-all-objects with --batch[-check]: ignores stdin, batches all known objects
Change or optimize batch output
--buffer buffer --batch output
--follow-symlinks follow in-tree symlinks
--unordered do not order objects before emitting them
Emit object (blob or tree) with conversion or filter (stand-alone, or with batch)
--textconv run textconv on object's content
--filters run filters on object's content
--path blob|tree use a <path> for (--textconv | --filters ); Not with 'batch'
The old usage was:
<type> can be one of: blob, tree, commit, tag
-t show object type
-s show object size
-e exit with zero when there's no error
-p pretty-print object's content
--textconv for blob objects, run textconv on object's content
--filters for blob objects, run filters on object's content
--batch-all-objects show all objects with --batch or --batch-check
--path <blob> use a specific path for --textconv/--filters
--allow-unknown-type allow -s and -t to work with broken/corrupt objects
--buffer buffer --batch output
--batch[=<format>] show info and content of objects fed from the standard input
--batch-check[=<format>]
show info about objects fed from the standard input
--follow-symlinks follow in-tree symlinks (used with --batch or --batch-check)
--unordered do not order --batch-all-objects output
While shorter, I think the new one is easier to understand, as
e.g. "--allow-unknown-type" is grouped with "-t" and "-s", as it can
only be combined with those options. The same goes for "--buffer",
"--unordered" etc.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the migration of --batch-all-objects to OPT_CMDMODE() in the
preceding commit one bug with combining it and other OPT_CMDMODE()
options was solved, but we were still left with e.g. --buffer silently
being discarded when not in batch mode.
Fix all those bugs, and in addition emit errors telling the user
specifically what options can't be combined with what other options,
before this we'd usually just emit the cryptic usage text and leave
the users to work it out by themselves.
This change is rather large, because to do so we need to untangle the
options processing so that we can not only error out, but emit
sensible errors, and e.g. emit errors about options before errors
about stray argc elements (as they might become valid if the option
were removed).
Some of the output changes ("error:" to "fatal:" with
usage_msg_opt[f]()), but none of the exit codes change, except in
those cases where we silently accepted bad option combinations before,
now we'll error out.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The usage of OPT_CMDMODE() in "cat-file"[1] was added in parallel with
the development of[3] the --batch-all-objects option[4], so we've
since grown[5] checks that it can't be combined with other command
modes, when it should just be made a top-level command-mode
instead. It doesn't combine with --filters, --textconv etc.
By giving parse_options() information about what options are mutually
exclusive with one another we can get the die() message being removed
here for free, we didn't even use that removed message in some cases,
e.g. for both of:
--batch-all-objects --textconv
--batch-all-objects --filters
We'd take the "goto usage" in the "if (opt)" branch, and never reach
the previous message. Now we'll emit e.g.:
$ git cat-file --batch-all-objects --filters
error: option `filters' is incompatible with --batch-all-objects
1. b48158ac94 (cat-file: make the options mutually exclusive, 2015-05-03)
2. https://lore.kernel.org/git/xmqqtwspgusf.fsf@gitster.dls.corp.google.com/
3. https://lore.kernel.org/git/20150622104559.GG14475@peff.net/
4. 6a951937ae (cat-file: add --batch-all-objects option, 2015-06-22)
5. 321459439e (cat-file: support --textconv/--filters in batch mode, 2016-09-09)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no benefit to defining this at a distance, and it makes the
code harder to read as you've got to scroll up to see the usage that
corresponds to the options.
In subsequent commits I'll make use of usage_msg_opt(), which will be
quite noisy if I have to use the long "cat_file_usage" variable,
there's no other command being defined in this file, so let's rename
it to just "usage".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were various inaccuracies in the previous SYNOPSIS output,
e.g. "--path" is not something that can optionally go with any options
except --textconv or --filters, as the output implied.
The opening line of the DESCRIPTION section is also "In its first
form[...]", which refers to "git cat-file <type> <object>", but the
SYNOPSIS section wasn't showing that as the first form!
That part of the documentation made sense in
d83a42f34a (Documentation: minor grammatical fixes in
git-cat-file.txt, 2009-03-22) when it was introduced, but since then
various options that were added have made that intro make no sense in
the context it was in. Now the two will match again.
The usage output here is not properly aligned on "master" currently,
but will be with my in-flight 4631cfc20b (parse-options: properly
align continued usage output, 2021-09-21), so let's indent things
correctly in the C code in anticipation of that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a usage_msg_optf() as a shorthand for the sort of
usage_msg_opt(xstrfmt(...)) used in builtin/stash.c. I'll make more
use of this function in builtin/cat-file.c shortly.
The disconnect between the "..." and "fmt" is a bit unusual, but it
works just fine and this keeps it consistent with usage_msg_opt(),
i.e. a caller of it can be moved to usage_msg_optf() and not have to
have its arguments re-arranged.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests for the output that's emitted when we disambiguate
<obj>:<path> in cat-file. This gives us a baseline for improving these
messages.
For e.g. "git blame" we'll emit:
$ git blame HEAD:foo
fatal: no such path 'HEAD:foo' in HEAD
But cat-file doesn't disambiguate these two cases, and just gives the
rather unhelpful:
$ git cat-file --textconv HEAD:foo
fatal: Not a valid object name HEAD:foo
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stress test the usage emitted when options are combined in ways that
isn't supported. Let's test various option combinations, some of these
we buggily allow right now.
E.g. this reveals a bug in 321459439e (cat-file: support
--textconv/--filters in batch mode, 2016-09-09) that we'll fix in a
subsequent commit. We're supposed to be emitting a relevant message
when --batch-all-objects is combined with --textconv or --filters, but
we don't.
The cases of needing to assign to opt=2 in the "opt" loop are because
on those we do the right thing already, in subsequent commits the
"test_expect_failure" cases will be fixed, and the for-loops unified.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since commit a492d5331c ("merge-ort: ensure we consult df_conflict
and path_conflicts", 2021-06-30), when renormalization is active AND a
file is involved in a rename/delete conflict BUT the file is unmodified
(either before or after renormalization), merge-ort was running into an
assertion failure. Prior to that commit (or if assertions were compiled
out), merge-ort would mis-merge instead, ignoring the rename/delete
conflict and just deleting the file.
Remove the assertions, fix the code appropriately, leave some good
comments in the code, and add a testcase for this situation.
Reported-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the --statistics flag that I added in 5e9637c629 (i18n: add
infrastructure for translating Git with gettext, 2011-11-18). Our
Makefile output is good about reducing verbosity by default, except in
this case:
$ rm -rf po/build/locale/e*; time make -j $(nproc) all
SUBDIR templates
MKDIR -p po/build/locale/el/LC_MESSAGES
MSGFMT po/build/locale/el/LC_MESSAGES/git.mo
MKDIR -p po/build/locale/es/LC_MESSAGES
MSGFMT po/build/locale/es/LC_MESSAGES/git.mo
1038 translated messages, 3325 untranslated messages.
5230 translated messages.
I didn't have any good reason for using --statistics at the time other
than ad-hoc eyeballing of the output. We don't need to spew out
exactly how many messages we've got translated every time. Now we'll
instead emit:
$ rm -rf po/build/locale/e*; time make -j $(nproc) all
SUBDIR templates
MKDIR -p po/build/locale/el/LC_MESSAGES
MSGFMT po/build/locale/el/LC_MESSAGES/git.mo
MKDIR -p po/build/locale/es/LC_MESSAGES
MSGFMT po/build/locale/es/LC_MESSAGES/git.mo
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove -DPAGER_ENV from the BASIC_CFLAGS and instead have it passed
via the EXTRA_CPPFLAGS passed when compiling pager.c.
This doesn't change anything except to make it clear that only pager.c
needs this, as it's the only user of this define. See
995bc22d7f (pager: move pager-specific setup into the build,
2016-08-04) for the commit that originally added this.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix an issue in my cfe853e66b (hook-list.h: add a generated list of
hooks, like config-list.h, 2021-09-26), the builtin/help.c was
inadvertently made to depend on hook-list.h, but it's used by
builtin/bugreport.c.
The hook.c also does not depend on hook-list.h. It did in an earlier
version of the greater series cfe853e66b was extracted from, but not
anymore. We might end up needing that line again, but let's remove it
for now.
Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The environment variable $SHELL is usually set to the user's
interactive shell. Our build and test scripts never use $SHELL because
there are no guarantees about its input language. Instead, we use
/bin/sh which should be a POSIX shell.
For systems with a broken /bin/sh, we allow to override that path via
SHELL_PATH. To run tests in yet another shell we allow to override
SHELL_PATH with TEST_SHELL_PATH.
Perf tests run in $SHELL via a wrapper defined in t/perf/perf-lib.sh,
so they break with e.g. SHELL=python. Use TEST_SHELL_PATH like
in other tests. TEST_SHELL_PATH is always defined because
t/perf/perf-lib.sh includes t/test-lib.sh, which includes
GIT-BUILD-OPTIONS.
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create files with mode 0666, so umask works as intended. Provides an override,
which is useful to support shared repos (test t1301-shared-repo.sh).
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
reflog entries have unbounded size. In theory, each log ('g') block in reftable
can have an arbitrary size, so the format allows for arbitrarily sized reflog
messages. However, in the implementation, we are not scaling the log blocks up
with the message, and writing a large message fails.
This triggers a failure for reftable in t7006-pager.sh.
Until this is fixed more structurally, report an error from within the reftable
library for easier debugging.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The chainlint test script linter in the test suite has been updated.
* es/chainlint:
chainlint.sed: stop splitting "(..." into separate lines "(" and "..."
chainlint.sed: swallow comments consistently
chainlint.sed: stop throwing away here-doc tags
chainlint.sed: don't mistake `<< word` in string as here-doc operator
chainlint.sed: make here-doc "<<-" operator recognition more POSIX-like
chainlint.sed: drop subshell-closing ">" annotation
chainlint.sed: drop unnecessary distinction between ?!AMP?! and ?!SEMI?!
chainlint.sed: tolerate harmless ";" at end of last line in block
chainlint.sed: improve ?!SEMI?! placement accuracy
chainlint.sed: improve ?!AMP?! placement accuracy
t/Makefile: optimize chainlint self-test
t/chainlint/one-liner: avoid overly intimate chainlint.sed knowledge
t/chainlint/*.test: generalize self-test commentary
t/chainlint/*.test: fix invalid test cases due to mixing quote types
t/chainlint/*.test: don't use invalid shell syntax
"git apply" has been taught to ignore a message without a patch
with the "--allow-empty" option. It also learned to honor the
"--quiet" option given from the command line.
* jz/apply-quiet-and-allow-empty:
git-apply: add --allow-empty flag
git-apply: add --quiet flag
"git fetch --set-upstream" did not check if there is a current
branch, leading to a segfault when it is run on a detached HEAD,
which has been corrected.
* ab/fetch-set-upstream-while-detached:
pull, fetch: fix segfault in --set-upstream option
Move the handling of the "verbose" flag entirely out of
"refs/files-backend.c" and into "builtin/reflog.c". This allows the
backend to stop knowing about the EXPIRE_REFLOGS_VERBOSE flag.
The expire_reflog_ent() function shouldn't need to deal with the
implementation detail of whether or not we're emitting verbose output,
by doing this the --verbose output becomes backend-agnostic, so
reftable will get the same output.
I think the output is rather bad currently, and should e.g. be
implemented with some better future mode of progress.[ch], but that's
a topic for another improvement.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's not possible for "cb->newlog" to be NULL if
!EXPIRE_REFLOGS_DRY_RUN, since files_reflog_expire() would have
error()'d and taken the "goto failure" branch if it couldn't open the
file. By not using the "newlog" field private to "file-backend.c"'s
"struct expire_reflog_cb", we can move this verbosity logging to
"builtin/reflog.c" in a subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "cmd.stalefix" handling added in 1389d9ddaa (reflog expire
--fix-stale, 2007-01-06) to use a locally scoped "struct
rev_info". This code relies on mark_reachable_objects() twiddling
flags in the walked objects.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the initial implementation of "git reflog" in 4264dc15e1 (git
reflog expire, 2006-12-19) we had this
lookup_commit_reference_gently().
I don't think we've ever found tags that we need to recursively
dereference in reflogs, so this should at least be changed to a
"lookup commit" as I'm doing here, although I can't think of a way
where it mattered in practice.
I also think we'd probably like to just die here if we have a NULL
object, but as this code needs to handle potentially broken
repositories let's just show an "error" but continue, the non-quiet
lookup_commit() will do for us. None of our tests cover the case where
"commit" is NULL after this lookup.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an intermediate variable for "tip_commit" in
reflog_expiry_prepare(), and only add it to the struct if we're
handling the UE_NORMAL case.
The code behaves the same way as before, but this makes the control
flow clearer, and the shorter name allows us to fold a 4-line i/else
into a one-line ternary instead.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code added in 03cb91b18c (reflog --expire-unreachable: special
case entries in "HEAD" reflog, 2010-04-09) to use a "switch" statement
with an exhaustive list of "case" statements instead of doing numeric
comparisons against the enum labels.
Now we won't assume that "x != UE_ALWAYS" means "(x == UE_HEAD || x ||
UE_NORMAL)". That assumption is true now, but we'd introduce subtle
bugs here if that were to change, now the compiler will notice and
error out on such errors.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the FLEX_ARRAY pattern added in bda3a31cc7 (reflog-expire:
Avoid creating new files in a directory inside readdir(3) loop,
2008-01-25) the string-list API instead.
This does not change any behavior, allows us to delete much of this
code as it's replaced by things we get from the string-list API for
free, as a result we need just one struct to keep track of this data,
instead of two.
The "DUP" -> "string_list_append_nodup(..., strbuf_detach(...))"
pattern here is the same as that used in a recent memory leak fix in
b202e51b15 (grep: fix a "path_list" memory leak, 2021-10-22).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the preceding change for "reflog delete", change the "cb_data"
we pass to callbacks to be &cb.cmd itself, instead of passing &cb and
having the callback lookup cb->cmd.
This makes it clear that the "cb" itself is the same memzero'd
structure on each iteration of the for-loops that use &cb, except for
the "cmd" member.
The "struct expire_reflog_policy_cb" we pass to reflog_expire() will
have the members that aren't "cmd" modified by the callbacks, but
before we invoke them everything except "cmd" is zero'd out.
This included the "tip_commit", "mark_list" and "tips". It might have
looked as though we were re-using those between iterations, but the
first thing we did in reflog_expiry_prepare() was to either NULL them,
or clobber them with another value.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "cb_data" we pass to the count_reflog_ent() to be the
&cb.cmd itself, instead of passing &cb and having the callback lookup
cb->cmd.
This makes it clear that the "cb" itself is the same memzero'd
structure on each iteration of the for-loop that uses &cb, except for
the "cmd" member.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is consistent with the calling convention for ref backend creation, and
avoids storing ".git/packed-refs" (the name of a regular file) in a variable called
ref_store::gitdir.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "rollback" verb implements a simple algorithm which takes the set of
remote perforce tracker branches, or optionally, the complete collection
of local branches in a git repository, and deletes commits from these
branches until there are no commits left with a perforce change number
greater than than a user-specified change number. If the base of a git
branch has a newer change number than the user-specified maximum, then
the branch is deleted.
In future, there might be an argument for the addition of some kind of
"reset this branch back to a given perforce change number" verb for
git-p4. However, in its current form it is unlikely to be useful to
users for the following reasons:
* The verb is completely undocumented. The only description provided
contains the following text: "A tool to debug the multi-branch
import. Don't use :)".
* The verb has a very narrow purpose in that it applies the rollback
operation to fixed sets of branches - either all remote p4 branches,
or all local branches. There is no way for users to specify branches
with more granularity, for example, allowing users to specify a
single branch or a set of branches. The utility of the current
implementation is therefore a niche within a niche.
Given these shortcomings, this patch removes the verb from git-p4.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-p4 "debug" verb is described as "A tool to debug the output of
p4 -G".
The verb is not documented in any detail, but implements a function
which executes an arbitrary p4 command with the -G flag, which causes
perforce to format all output as marshalled Python dictionary objects.
The verb was implemented early in the history of git-p4, and may once
have served a useful purpose to the authors in the early stages of
development. However, the "debug" verb is no longer being used by the
current developers (and users) of git-p4, and whatever purpose the verb
previously offered is easily replaced by invoking p4 directly.
This patch therefore removes the verb from git-p4.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reftable intentionally keeps reflog data for deleted refs.
This breaks tests that delete and recreate "refs/tags/tag_with_reflog" as traces
of the deletion are left in reflog. To resolve this, use a differently named ref
for each test case.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The dumb HTTP protocol exposes ref storage details as part of the protocol,
so it only works with the FILES refstorage backend
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The dumb HTTP protocol exposes ref storage details as part of the protocol,
so it only works with the FILES refstorage backend
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit effectively reverts 2782db3 (test-tool: don't force full
index, 2021-03-30) and e2df6c3 (test-read-cache: print cache entries
with --table, 2021-03-30) to remove the --table and --expand options
from 'test-tool read-cache'. The previous changes already removed these
options from the test suite in favor of 'git ls-files --sparse'.
The initial thought of creating these options was to allow for tests to
see additional information with every cache entry. In particular, the
object type is still not mirrored in 'git ls-files'. Since sparse
directory entries always end with a slash, the object type is not
critical to verify the sparse index is enabled. It was thought that it
would be helpful to have additional information, such as flags, but that
was not needed for the FS Monitor integration and hasn't been needed
since.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that 'git ls-files --sparse' exists, we can use it to verify the
state of a sparse index instead of 'test-tool read-cache table'. Replace
these usages within t1091-sparse-checkout-builtin.sh and
t3705-add-sparse-checkout.sh.
The important changes are due to the different output format. In t3705,
we need to use the '--stage' output to get a file mode and OID, but
it also includes a stage value and drops the object type. This leads
to some differences in how we handle looking for specific entries.
In t1091, the test focuses on enabling the sparse index, so we do not
need the --stage flag to demonstrate how the index changes, and instead
can use a diff.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that 'git ls-files --sparse' exists, we can use it to verify the
state of a sparse index instead of 'test-tool read-cache table'. Replace
these usages within t1092-sparse-checkout-compatibility.sh.
The important changes are due to the different output format. We need to
use the '--stage' output to get a file mode and OID, but it also
includes a stage value and drops the object type. This leads to some
differences in how we handle looking for specific entries.
Some places where we previously looked for no 'tree' entries, we can
instead directly compare the output across the repository with a sparse
index and the one without.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Existing callers to 'git ls-files' are expecting file names, not
directories. It is best to expand a sparse index to show all of the
contained files in this case.
However, expert users may want to inspect the contents of the index
itself including which directories are sparse. Add a --sparse option to
allow users to request this information.
During testing, I noticed that options such as --modified did not affect
the output when the files in question were outside the sparse-checkout
definition. Tests are added to document this preexisting behavior and
how it remains unchanged with the sparse index and the --sparse option.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git fetch' and 'git pull' commands parse the index in order to
determine if submodules exist. Without command_requires_full_index=0,
this will expand a sparse index, causing slow performance even when
there is no new data to fetch.
The .gitmodules file will never be inside a sparse directory entry, and
even if it was, the index_name_pos() method would expand the sparse
index if needed as we search for the path by name. These commands do not
iterate over the index, which is the typical thing we are careful about
when integrating with the sparse index.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This provides a better error message in case SHA256 was inadvertently switched
on through the environment.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add pieces from "scalar" to contrib/.
* js/scalar:
scalar: implement the `version` command
scalar: implement the `delete` command
scalar: teach 'reconfigure' to optionally handle all registered enlistments
scalar: allow reconfiguring an existing enlistment
scalar: implement the `run` command
scalar: teach 'clone' to support the --single-branch option
scalar: implement the `clone` subcommand
scalar: implement 'scalar list'
scalar: let 'unregister' handle a deleted enlistment directory gracefully
scalar: 'unregister' stops background maintenance
scalar: 'register' sets recommended config and starts maintenance
scalar: create test infrastructure
scalar: start documenting the command
scalar: create a rudimentary executable
scalar: add a README with a roadmap
Teach diff and blame to work well with sparse index.
* ld/sparse-diff-blame:
blame: enable and test the sparse index
diff: enable and test the sparse index
diff: replace --staged with --cached in t1092 tests
repo-settings: prepare_repo_settings only in git repos
test-read-cache: set up repo after git directory
commit-graph: return if there is no git directory
git: ensure correct git directory setup with -h
"git name-rev" has been tweaked to give output that is shorter and
easier to understand.
* en/name-rev-shorter-output:
name-rev: prefer shorter names over following merges
"git fetch" without the "--update-head-ok" option ought to protect
a checked out branch from getting updated, to prevent the working
tree that checks it out to go out of sync. The code was written
before the use of "git worktree" got widespread, and only checked
the branch that was checked out in the current worktree, which has
been updated.
(originally called ak/fetch-not-overwrite-any-current-branch)
* ak/protect-any-current-branch:
branch: protect branches checked out in all worktrees
receive-pack: protect current branch for bare repository worktree
receive-pack: clean dead code from update_worktree()
fetch: protect branches checked out in all worktrees
worktree: simplify find_shared_symref() memory ownership model
branch: lowercase error messages
receive-pack: lowercase error messages
fetch: lowercase error messages
The cryptographic signing using ssh keys can specify literal keys
for keytypes whose name do not begin with the "ssh-" prefix by
using the "key::" prefix mechanism (e.g. "key::ecdsa-sha2-nistp256").
* fs/ssh-signing-other-keytypes:
ssh signing: make sign/amend test more resilient
ssh signing: support non ssh-* keytypes
Extend the signing of objects with SSH keys and learn to pay
attention to the key validity time range when verifying.
* fs/ssh-signing-key-lifetime:
ssh signing: verify ssh-keygen in test prereq
ssh signing: make fmt-merge-msg consider key lifetime
ssh signing: make verify-tag consider key lifetime
ssh signing: make git log verify key lifetime
ssh signing: make verify-commit consider key lifetime
ssh signing: add key lifetime test prereqs
ssh signing: use sigc struct to pass payload
t/fmt-merge-msg: make gpgssh tests more specific
t/fmt-merge-msg: do not redirect stderr
When "git log" implicitly enabled the "decoration" processing
without being explicitly asked with "--decorate" option, it failed
to read and honor the settings given by the "--decorate-refs"
option.
* jk/log-decorate-opts-with-implicit-decorate:
log: load decorations with --simplify-by-decoration
log: handle --decorate-refs with userformat "%d"
"git rebase -x" by mistake started exporting the GIT_DIR and
GIT_WORK_TREE environment variables when the command was rewritten
in C, which has been corrected.
* en/rebase-x-wo-git-dir-env:
sequencer: do not export GIT_DIR and GIT_WORK_TREE for 'exec'
Weather balloon to find compilers that do not grok variable
declaration in the for() loop.
* jc/c99-var-decl-in-for-loop:
revision: use C99 declaration of variable in for() loop
In po/README.md, we point core developers to gettext's "Preparing
Strings" documentation for advice on marking strings for translation.
However, this doc doesn't really discuss the issues around plural form
translation, which can make it seem that nothing special needs to be
done in this case.
Add a specific callout here about marking plural-form strings so that
the advice later on in the README is not overlooked.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The revision traversal machinery typically processes and returns all
children before any parent. fast-export needs to operate in the
reverse fashion, handling parents before any of their children in
order to build up the history starting from the root commit(s). This
would be a clear case where we could just use the revision traversal
machinery's "reverse" option to achieve this desired affect.
However, this wasn't what the code did. It added its own array for
queuing. The obvious hand-rolled solution would be to just push all
the commits into the array and then traverse afterwards, but it didn't
quite do that either. It instead attempted to process anything it
could as soon as it could, and once it could, check whether it could
process anything that had been queued. As far as I can tell, this was
an effort to save a little memory in the case of multiple root commits
since it could process some commits before queueing all of them. This
involved some helper functions named has_unshown_parent() and
handle_tail(). For typical invocations of fast-export, this
alternative essentially amounted to a hand-rolled method of reversing
the commits -- it was a bunch of work to duplicate the revision
traversal machinery's "reverse" option.
This hand-rolled reversing mechanism is actually somewhat difficult to
reason about. It takes some time to figure out how it ensures in
normal cases that it will actually process all traversed commits
(rather than just dropping some and not printing anything for them).
And it turns out there are some cases where the code does drop commits
without handling them, and not even printing an error or warning for
the user. Due to the has_unshown_parent() checks, some commits could
be left in the array at the end of the "while...get_revision()" loop
which would be unprocessed. This could be triggered for example with
git fast-export main -- --first-parent
or non-sensical traversal rules such as
git fast-export main -- --grep=Merge --invert-grep
While most traversals that don't include all parents should likely
trigger errors in fast-export (or at least require being used in
combination with --reference-excluded-parents), the --first-parent
traversal is at least reasonable and it'd be nice if it didn't just drop
commits. It'd also be nice for future readers of the code to have a
simpler "reverse traversal" mechanism. Use the "reverse" option of the
revision traversal machinery to achieve both.
Even for the non-sensical traversal flags like the --grep one above,
this would be an improvement. For example, in that case, the code
previously would have silently truncated history to only those commits
that do not have an ancestor containing "Merge" in their commit message.
After this code change, that case would include all commits without
"Merge" in their commit message -- but any commit that previously had a
"Merge"-mentioning parent would lose that parent
(likely resulting in many new root commits). While the new behavior is
still odd, it is at least understandable given that
--reference-excluded-parents is not the default.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: William Sprent <williams@unity3d.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although we only documented that branch.*.autosetupmerge would accept
"always" as a value, the actual implementation would accept any
combination of upper- or lower-case. Fix this to be consistent with
documentation and with other values of this config variable.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can be helpful when creating a new branch to use the existing
tracking configuration from the branch point. However, there is
currently not a method to automatically do so.
Teach git-{branch,checkout,switch} an "inherit" argument to the
"--track" option. When this is set, creating a new branch will cause the
tracking configuration to default to the configuration of the branch
point, if set.
For example, if branch "main" tracks "origin/main", and we run
`git checkout --track=inherit -b feature main`, then branch "feature"
will track "origin/main". Thus, `git status` will show us how far
ahead/behind we are from origin, and `git pull` will pull from origin.
This is particularly useful when creating branches across many
submodules, such as with `git submodule foreach ...` (or if running with
a patch such as [1], which we use at $job), as it avoids having to
manually set tracking info for each submodule.
Since we've added an argument to "--track", also add "--track=direct" as
another way to explicitly get the original "--track" behavior ("--track"
without an argument still works as well).
Finally, teach branch.autoSetupMerge a new "inherit" option. When this
is set, "--track=inherit" becomes the default behavior.
[1]: https://lore.kernel.org/git/20180927221603.148025-1-sbeller@google.com/
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new static variant of install_branch_config() that accepts
multiple remote branch names for tracking. This will be used in an
upcoming commit that enables inheriting the tracking configuration from
a parent branch.
Currently, all callers of install_branch_config() pass only a single
remote. Make install_branch_config() a small wrapper around
install_branch_config_multiple_remotes() so that existing callers do not
need to be changed.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a series of patches for a topic-B depends on having topic-A,
the workflow to prepare the topic-B branch would look like this:
$ git checkout -b topic-B main
$ git merge --no-ff --no-edit topic-A
$ git am <mbox-for-topic-B
When topic-A gets updated, recreating the first merge and rebasing
the rest of the topic-B, all on detached HEAD, is a useful
technique. After updating topic-A with its new round of patches:
$ git checkout topic-B
$ prev=$(git rev-parse 'HEAD^{/^Merge branch .topic-A. into}')
$ git checkout --detach $prev^1
$ git merge --no-ff --no-edit topic-A
$ git rebase --onto HEAD $prev @{-1}^0
$ git checkout -B @{-1}
This will
(0) check out the current topic-B.
(1) find the previous merge of topic-A into topic-B.
(2) detach the HEAD to the parent of the previous merge.
(3) merge the updated topic-A to it.
(4) reapply the patches to rebuild the rest of topic-B.
(5) update topic-B with the result.
without contaminating the reflog of topic-B too much. topic-B@{1}
is the "logically previous" state before topic-A got updated, for
example. At (4), comparison (e.g. range-diff) between HEAD and
@{-1} is a meaningful way to sanity check the result, and the same
can be done at (5) by comparing topic-B and topic-B@{1}.
But there is one glitch. The merge into the detached HEAD done in
the step (3) above gives us "Merge branch 'topic-A' into HEAD", and
does not say "into topic-B".
Teach the "--into-name=<branch>" option to "git merge" and its
underlying "git fmt-merge-message", to pretend as if we were merging
into <branch>, no matter what branch we are actually merging into,
when they prepare the merge message. The pretend name honors the
usual "into <target>" suppression mechanism, which can be seen in
the tests added here.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When importing files from Perforce, git-p4 periodically logs the
progress of file transfers as a percentage. However, the value is
printed as a float with an excessive number of decimal places.
For example a typical update might contain the following message:
Importing revision 12345 (26.199617677553135%)
This patch simply rounds the value down to the nearest integer
percentage value, greatly improving readability.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Acked-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-p4 script reports file sizes in various log messages.
Previously, in each case the script would print them as the number of
bytes divided by 1048576 i.e. the size in mebibytes, rounded down to an
integer. This resulted in small files being described as having a size
of "0 MB".
This patch replaces the existing behaviour with a new helper function:
format_size_human_readable, which takes a number of bytes (or any other
quantity), and computes the appropriate prefix to use: none, Ki, Mi, Gi,
Ti, Pi, Ei, Zi, Yi.
For example, a size of 123456 will now be printed as "120.6 KiB" greatly
improving the readability of the log output.
Large valued prefixes such as pebi, exbi, zebi and yobi are included for
completeness, though they not expected to appear in any real-world
Perforce repository!
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Patterns that contain no wildcards and don't have to be case-folded are
literal. Give this condition a name to increase the readability of the
boolean expression for enabling the option PCRE2_UTF.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
compile_pcre2_pattern() currently uses the option PCRE2_UTF only for
patterns with non-ASCII characters. Patterns with ASCII wildcards can
match non-ASCII strings, though. Without that option PCRE2 mishandles
UTF-8 input, though -- it matches parts of multi-byte characters. Fix
that by using PCRE2_UTF even for ASCII-only patterns.
This is a remake of the reverted ae39ba431a (grep/pcre2: fix an edge
case concerning ascii patterns and UTF-8 data, 2021-10-15). The change
to the condition and the test are simplified and more targeted.
Original-patch-by: Hamza Mahfooz <someguy@effective-light.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Release the strbuf containing the interpolated path after copying it to
a stack buffer and before erroring out if it's too long.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Certain invocations of "git apply --3way" will attempt threeway and
fail due to missing objects, even though git is able to fall back on
apply_fragments and apply the patch successfully with a return value
of 0. To fix, return early from try_threeway() in the following
cases:
- When the patch is a rename and no lines have changed. In this
case, "git diff" doesn't record the blob info, so 3way is neither
possible nor necessary.
- When the patch is an addition and there is no add/add conflict,
i.e. direct_to_threeway is false. In this case, threeway will
fail since the preimage is not in cache, but isn't necessary
anyway since there is no conflict.
This fixes a few unecessary error messages when applying these kinds
of patches with --3way.
It also fixes a reported issue where applying a concatenation of
several git produced patches will fail when those patches involve a
deletion followed by creation of the same file. Add a test for this
case too. (test provided by <i@zenithal.me>)
Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While testing some ideas in 'git repack', I ran it with '--quiet' and
discovered that some progress output was still shown. Specifically, the
output for writing the multi-pack-index showed the progress.
The 'show_progress' variable in cmd_repack() is initialized with
isatty(2) and is not modified at all by the '--quiet' flag. The
'--quiet' flag modifies the po_args.quiet option which is translated
into a '--quiet' flag for the 'git pack-objects' child process. However,
'show_progress' is used to directly send progress information to the
multi-pack-index writing logic which does not use a child process.
The fix here is to modify 'show_progress' to be false if po_opts.quiet
is true, and isatty(2) otherwise. This new expectation simplifies a
later condition that checks both.
Update the documentation to make it clear that '-q' will disable all
progress in addition to ensuring the 'git pack-objects' child process
will receive the flag.
Use 'test_terminal' to check that this works to get around the isatty(2)
check.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Historically, we needed a single packfile in order to have reachability
bitmaps. This introduced logic that when 'git repack' had a '-b' option
that we should stop sending the '--honor-pack-keep' option to the 'git
pack-objects' child process, ensuring that we create a packfile
containing all reachable objects.
In the world of multi-pack-index bitmaps, we no longer need to repack
all objects into a single pack to have valid bitmaps. Thus, we should
continue sending the '--honor-pack-keep' flag to 'git pack-objects'.
The fix is very simple: only disable the flag when writing bitmaps but
also _not_ writing the multi-pack-index.
This opens the door to new repacking strategies that might want to keep
some historical set of objects in a stable pack-file while only
repacking more recent objects.
To test, create a new 'test_subcommand_inexact' helper that is more
flexible than 'test_subcommand'. This allows us to look for the
--honor-pack-keep flag without over-indexing on the exact set of
arguments.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Actually use extended regexes as indicated in the comment.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add missing colon to ensure correct rendering of definition list
item. Without the proper number of colons, it renders as just another
top-level paragraph rather than a list item.
Signed-off-by: Greg Hurrell <greg@hurrell.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option --invert-grep is documented to filter out commits whose
messages match the --grep filters. However, it also affects the
header matches (--author, --committer), which is not intended.
Move the handling of that option to grep.c, as only the code there can
distinguish between matches in the header from those in the message
body. If --invert-grep is given then enable extended expressions (not
the regex type, we just need git grep's --not to work), negate the body
patterns and check if any of them match by piggy-backing on the
collect_hits mechanism of grep_source_1().
Collecting the matches in struct grep_opt is a bit iffy, but with
"last_shown" we have a precedent for writing state information to that
struct.
Reported-by: Dotan Cohen <dotancohen@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comand uses a single instance of rev_info on stack, makes a
single revision traversal and exit. Mark the resources held by the
rev_info structure with UNLEAK().
We do not do this at lower level in revision.c or cmd_log_walk(), as
a new caller of the revision traversal API can make unbounded number
of rev_info during a single run, and UNLEAK() would not a be
suitable mechanism to deal with such a caller.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier we marked that this patch-id test is leak-sanitizer clean,
but if we read the test script carefully, it is merely because we
have too many invocations of commands in the "git log" family on the
upstream side of the pipe, hiding breakages from them.
Split the pipeline so that breakages from these commands can be
caught (not limited to aborts due to leak-sanitizer) and unmark
the script as not passing the test with leak-sanitizer in effect.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
RCS keywords are strings that are replaced with information from
Perforce. Examples include $Date$, $Author$, $File$, $Change$ etc.
Perforce resolves these by expanding them with their expanded values
when files are synced, but Git's data model requires these expanded
values to be converted back into their unexpanded form.
Previously, git-p4.py would implement this behaviour through the use of
regular expressions. However, the regular expression substitution was
applied using decoded strings i.e. the content of incoming commit diffs
was first decoded from bytes into UTF-8, processed with regular
expressions, then converted back to bytes.
Not only is this behaviour inefficient, but it is also a cause of a
common issue caused by text files containing invalid UTF-8 data. For
files created in Windows, CP1252 Smart Quote Characters (0x93 and 0x94)
are seen fairly frequently. These codes are invalid in UTF-8, so if the
script encountered any file containing them, on Python 2 the symbols
will be corrupted, and on Python 3 the script will fail with an
exception.
This patch replaces this decoding/encoding with bytes object regular
expressions, so that the substitution is performed directly upon the
source data with no conversions.
A test for smart quote handling has been added to the
t9810-git-p4-rcs.sh test suite.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The patchRCSKeywords method creates a temporary file in which to store
the patched output data. Previously this file was opened in "w+" mode
(write and read), but the code never reads the contents of the file
while open, so it only needs to be opened in "w" mode (write-only).
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously the read_lines function always decoded the result lines. In
order to improve support for non-decoded binary processing of data in
git-p4.py, this patch adds a raw option to the function that allows
decoding to be disabled.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously git-p4.py would compile one of two regular expressions for
ever RCS keyword-enabled file. This patch improves simplifies the code
by pre-compiling the two regular expressions when the script first
loads.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python with statements are used to wrap the execution of a block of code
so that an object can be safely released when execution leaves the
scope.
They are desirable for improving code tidyness, and to ensure that
objects are properly destroyed even when exceptions are thrown.
Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the usage message emitted by "git stash --invalid-option" to
emit usage information for "git stash" in general, and not just for
the "push" command. I.e. before:
$ git stash --invalid-option
error: unknown option `invalid-option'
usage: git stash [push [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
[-u|--include-untracked] [-a|--all] [-m|--message <message>]
[--] [<pathspec>...]]
[...]
After:
$ git stash --invalid-option
error: unknown option `invalid-option'
usage: git stash list [<options>]
or: git stash show [<options>] [<stash>]
or: git stash drop [-q|--quiet] [<stash>]
or: git stash ( pop | apply ) [--index] [-q|--quiet] [<stash>]
or: git stash branch <branchname> [<stash>]
or: git stash clear
or: git stash [push [-p|--patch] [-S|--staged] [-k|--[no-]keep-index] [-q|--quiet]
[-u|--include-untracked] [-a|--all] [-m|--message <message>]
[--pathspec-from-file=<file> [--pathspec-file-nul]]
[--] [<pathspec>...]]
or: git stash save [-p|--patch] [-S|--staged] [-k|--[no-]keep-index] [-q|--quiet]
[-u|--include-untracked] [-a|--all] [<message>]
[...]
That we emitted the usage for just "push" in the case of the
subcommand not being explicitly specified was an unintentional
side-effect of how it was implemented. When it was converted to C in
d553f538b8 (stash: convert push to builtin, 2019-02-25) the pattern
of having per-subcommand usage information was rightly continued. The
"git-stash.sh" shellscript did not have that, and always printed the
equivalent of "git_stash_usage".
But in doing so the case of push being implicit and explicit was
conflated. A variable was added to track this in 8c3713cede (stash:
eliminate crude option parsing, 2020-02-17), but it did not update the
usage output accordingly.
This still leaves e.g. "git stash push -h" emitting the
"git_stash_usage" output, instead of "git_stash_push_usage". That
should be fixed, but is a much deeper misbehavior in parse_options()
not being aware of subcommands at all. I.e. in how
PARSE_OPT_KEEP_UNKNOWN and PARSE_OPT_NO_INTERNAL_HELP combine in
commands such as "git stash".
Perhaps PARSE_OPT_KEEP_UNKNOWN should imply
PARSE_OPT_NO_INTERNAL_HELP, or better yet parse_options() should be
extended to fully handle these subcommand cases that we handle
manually in "git stash", "git commit-graph", "git multi-pack-index"
etc. All of those musings would be a much bigger change than this
isolated fix though, so let's leave that for some other time.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are three callsites of git_prompt() helper function that ask a
"yes/no" question to the end user, but one of them in help.c that
asks if the suggested auto-correction is OK, which is given when the
user makes a possible typo in a Git subcommand name, is formatted
differently from the others.
Update the format string to make the prompt string look more
consistent.
Signed-off-by: Kashav Madan <kshvmdn@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option helps to record specific empty patches in the middle
of an am session, which does create empty commits only when:
1. the index has not changed
2. lacking a branch
When the index has changed, "--allow-empty" will create a non-empty
commit like passing "--continue" or "--resolved".
Signed-off-by: 徐沛文 (Aleen) <aleen42@vip.qq.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since that the command 'git-format-patch' can include patches of
commits that emit no changes, the 'git-am' command should also
support an option, named as '--empty', to specify how to handle
those empty patches. In this commit, we have implemented three
valid options ('stop', 'drop' and 'keep').
Signed-off-by: 徐沛文 (Aleen) <aleen42@vip.qq.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit has described how to use '--always' option in the command
'git-format-patch' to include patches for commits that emit no changes.
Signed-off-by: 徐沛文 (Aleen) <aleen42@vip.qq.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The discussion for gpg.ssh.allowedSignersFile shows an example string
that contains "user1@example.com,user2@example.com". Asciidoc thinks
these are real email addresses and generates "mailto" footnotes for
them. This makes the rendered content more confusing, as it has extra
"[1]" markers:
The file consists of one or more lines of principals followed by an
ssh public key. e.g.: user1@example.com[1],user2@example.com[2]
ssh-rsa AAAAX1... See ssh-keygen(1) "ALLOWED SIGNERS" for details.
and also generates pointless notes at the end of the page:
NOTES
1. user1@example.com
mailto:user1@example.com
2. user2@example.com
mailto:user2@example.com
We can fix this by putting the example into a backtick literal block.
That inhibits the mailto generation, and as a bonus typesets the example
text in a way that sets it off from the regular prose (a tt font for
html, or bold in the roff manpage).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When serving a fetch, git upload-pack copies data from a git
pack-objects stdout pipe to its stdout. This commit increases the size
of the buffer used for that copying from 8192 to 65515, the maximum
sideband-64k packet size.
Previously, this buffer was allocated on the stack. Because the new
buffer size is nearly 64KB, we switch this to a heap allocation.
On GitLab.com we use GitLab's pack-objects cache which does writes of
65515 bytes. Because of the default 8KB buffer size, propagating these
cache writes requires 8 pipe reads and 8 pipe writes from
git-upload-pack, and 8 pipe reads from Gitaly (our Git RPC service).
If we increase the size of the buffer to the maximum Git packet size,
we need only 1 pipe read and 1 pipe write in git-upload-pack, and 1
pipe read in Gitaly to transfer the same amount of data. In benchmarks
with a pure fetch and 100% cache hit rate workload we are seeing CPU
utilization reductions of over 30%.
Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commits marked `sparse-checkout init` as deprecated; we
can just use `set` instead here and pass it no paths.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Victoria Dye <vdye@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As noted in the previous commit, using separate `init` and `set` steps
with sparse-checkout result in a number of issues. The previous commits
made `set` able to handle the work of both commands, and enabled reapply
to tweak the {cone,sparse-index} settings. Update the documentation to
reflect this, and mark `init` as deprecated.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Victoria Dye <vdye@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Folks may want to switch to or from cone mode, or to or from a
sparse-index without changing their sparsity paths. Allow them to do so
using the reapply command.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Victoria Dye <vdye@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previously suggested workflow:
git sparse-checkout init ...
git sparse-checkout set ...
Suffered from three problems:
1) It would delete nearly all files in the first step, then
restore them in the second. That was poor performance and
forced unnecessary rebuilds.
2) The two-step process resulted in two progress bars, which
was suboptimal from a UI point of view for wrappers that
invoked both of these commands but only exposed a single
command to their end users.
3) With cone mode, the first step would delete nearly all
ignored files everywhere, because everything was considered
to be outside of the specified sparsity paths. (The user was
not allowed to specify any sparsity paths in the `init` step.)
Avoid these problems by teaching `set` to understand the extra
parameters that `init` takes and performing any necessary initialization
if not already in a sparse checkout.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Victoria Dye <vdye@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`init` has some code for handling updates to either cone mode or
the sparse-index setting. We would like to be able to reuse this
elsewhere, namely in `set` and `reapply`. Split this function out,
and make it slightly more general so it can handle being called from
the new callers.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Victoria Dye <vdye@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We intentionally added --stdin as an option to `sparse-checkout set`,
but didn't intend for --no-stdin to be permitted as well.
Reported-by: Victoria Dye <vdye@github.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Victoria Dye <vdye@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most sparse-checkout subcommands (list, add, reapply) only make sense
when already in a sparse state. Add a quick check that will error out
early if this is not the case.
Also document with a comment why we do not exit early in `disable` even
when core.sparseCheckout starts as false.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Victoria Dye <vdye@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sparse_checkout_set() was reused by sparse_checkout_add() with the only
difference being a single parameter being passed to that function.
However, we would like sparse_checkout_set() to do the same work that
sparse_checkout_init() does if sparse checkouts are not already enabled.
To facilitate this transition, give each mode their own copy of the
function. This does not introduce any behavioral changes; that will
come in a subsequent patch.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Victoria Dye <vdye@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
add_patterns_from_input() has relied on a global variable,
set_opts.use_stdin, which has been used by both the `set` and `add`
subcommands of sparse-checkout. Once we introduce an
add_opts.use_stdin, the hardcoding of set_opts.use_stdin will be
incorrect. Pass the value as function parameter instead to allow us to
make subsequent changes.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Victoria Dye <vdye@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* ab/die-with-bug:
object.c: use BUG(...) no die("BUG: ...") in lookup_object_by_type()
pathspec: use BUG(...) not die("BUG:%s:%d....", <file>, <line>)
strbuf.h: use BUG(...) not die("BUG: ...")
pack-objects: use BUG(...) not die("BUG: ...")
The test helper for refs subsystem learned to write bogus and/or
nonexistent object name to refs to simulate error situations we
want to test Git in.
* hn/allow-bogus-oid-in-ref-tests:
t1430: create valid symrefs using test-helper
t1430: remove refs using test-tool
refs: introduce REF_SKIP_REFNAME_VERIFICATION flag
refs: introduce REF_SKIP_OID_VERIFICATION flag
refs: update comment.
test-ref-store: plug memory leak in cmd_delete_refs
test-ref-store: parse symbolic flag constants
test-ref-store: remove force-create argument for create-reflog
Change the type of an internal function to return an enum (instead
of int) and replace -2 that was used to signal an error with -1.
* ab/parse-options-cleanup:
parse-options.c: use "enum parse_opt_result" for parse_nodash_opt()
"default" and "reset" colors have been added to our palette.
* re/color-default-reset:
color: allow colors to be prefixed with "reset"
color: support "default" to restore fg/bg color
color: add missing GIT_COLOR_* white/black constants
Document the parameters given to the reflog entry iterator callback
functions.
* jc/reflog-iterator-callback-doc:
refs: document callback for reflog-ent iterators
The sparse-index/sparse-checkout feature had a bug in its use of
the matching code to determine which path is in or outside the
sparse checkout patterns.
* ds/sparse-deep-pattern-checkout-fix:
unpack-trees: use traverse_path instead of name
t1092: add deeper changes during a checkout
Coding guideline document has been updated to clarify what goes to
standard error in our system.
* es/doc-stdout-vs-stderr:
CodingGuidelines: document which output goes to stdout vs. stderr
Many tests that used to need GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
mechanism to force "git" to use 'master' as the default name for
the initial branch no longer need it; the use of the mechanism from
them have been removed.
* js/test-initial-branch-override-cleanup:
tests: set GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME only when needed
"git worktree add" showed "Preparing worktree" message to the
standard output stream, but when it failed, the message from die()
went to the standard error stream. Depending on the order the
stdio streams are flushed at the program end, this resulted in
confusing output. It has been corrected by sending all the chatty
messages to the standard error stream.
* es/worktree-chatty-to-stderr:
git-worktree.txt: add missing `-v` to synopsis for `worktree list`
worktree: send "chatty" messages to stderr
Drop support for TravisCI and update test workflows at GitHub.
* ab/ci-updates:
CI: don't run "make test" twice in one job
CI: use "$runs_on_pool", not "$jobname" to select packages & config
CI: rename the "Linux32" job to lower-case "linux32"
CI: use shorter names that fit in UX tooltips
CI: remove Travis CI support
Prepare tests on ref API to help testing reftable backends.
* hn/reflog-tests:
refs/debug: trim trailing LF from reflog message
test-ref-store: tweaks to for-each-reflog-ent format
t1405: check for_each_reflog_ent_reverse() more thoroughly
test-ref-store: don't add newline to reflog message
show-branch: show reflog message
When the "git push" command is killed while the receiving end is
trying to report what happened to the ref update proposals, the
latter used to die, due to SIGPIPE. The code now ignores SIGPIPE
to increase our chances to run the post-receive hook after it
happens.
* rj/receive-pack-avoid-sigpipe-during-status-reporting:
receive-pack: ignore SIGPIPE while reporting status to client
API clean-up.
* ab/run-command:
run-command API: remove "env" member, always use "env_array"
difftool: use "env_array" to simplify memory management
run-command API: remove "argv" member, always use "args"
run-command API users: use strvec_push(), not argv construction
run-command API users: use strvec_pushl(), not argv construction
run-command tests: use strvec_pushv(), not argv assignment
run-command API users: use strvec_pushv(), not argv assignment
upload-archive: use regular "struct child_process" pattern
worktree: stop being overly intimate with run_command() internals
"Zealous diff3" style of merge conflict presentation has been added.
* en/zdiff3:
update documentation for new zdiff3 conflictStyle
xdiff: implement a zealous diff3, or "zdiff3"
The default setting for trace2 event nesting was too low to cause
test failures, which is worked around by bumping it up in the test
framework.
* ds/trace2-regions-in-tests:
t/t*: remove custom GIT_TRACE2_EVENT_NESTING
test-lib.sh: set GIT_TRACE2_EVENT_NESTING
The test framework learns to list unsatisfied test prerequisites,
and optionally error out when prerequisites that are expected to be
satisfied are not.
* fs/test-prereq:
test-lib: make BAIL_OUT() work in tests and prereq
test-lib: introduce required prereq for test runs
test-lib: show missing prereq summary
More tests are marked as leak-free.
* ab/mark-leak-free-tests-even-more:
leak tests: mark some fast-import tests as passing with SANITIZE=leak
leak tests: mark some config tests as passing with SANITIZE=leak
leak tests: mark some status tests as passing with SANITIZE=leak
leak tests: mark some clone tests as passing with SANITIZE=leak
leak tests: mark some add tests as passing with SANITIZE=leak
leak tests: mark some diff tests as passing with SANITIZE=leak
leak tests: mark some apply tests as passing with SANITIZE=leak
leak tests: mark some notes tests as passing with SANITIZE=leak
leak tests: mark some update-index tests as passing with SANITIZE=leak
leak tests: mark some rev-parse tests as passing with SANITIZE=leak
leak tests: mark some rev-list tests as passing with SANITIZE=leak
leak tests: mark some misc tests as passing with SANITIZE=leak
leak tests: mark most gettext tests as passing with SANITIZE=leak
leak tests: mark "sort" test as passing SANITIZE=leak
leak tests: mark a read-tree test as passing SANITIZE=leak
The "reftable" backend for the refs API, without integrating into
the refs subsystem, has been added.
* hn/reftable:
Add "test-tool dump-reftable" command.
reftable: add dump utility
reftable: implement stack, a mutable database of reftable files.
reftable: implement refname validation
reftable: add merged table view
reftable: add a heap-based priority queue for reftable records
reftable: reftable file level tests
reftable: read reftable files
reftable: generic interface to tables
reftable: write reftable files
reftable: a generic binary tree implementation
reftable: reading/writing blocks
Provide zlib's uncompress2 from compat/zlib-compat.c
reftable: (de)serialization for the polymorphic record type.
reftable: add blocksource, an abstraction for random access reads
reftable: utility functions
reftable: add error related functionality
reftable: add LICENSE
hash.h: provide constants for the hash IDs
Some users or scripts will pipe "git diff"
output to "git apply" when replaying diffs
or commits. In these cases, they will rely
on the return value of "git apply" to know
whether the diff was applied successfully.
However, for empty commits, "git apply" will
fail. This complicates scripts since they
have to either buffer the diff and check
its length, or run diff again with "exit-code",
essentially doing the diff twice.
Add the "--allow-empty" flag to "git apply"
which allows it to handle both empty diffs
and empty commits created by "git format-patch
--always" by doing nothing and returning 0.
Add tests for both with and without --allow-empty.
Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace OPT_VERBOSE with OPT_VERBOSITY.
This adds a --quiet flag to "git apply" so
the user can turn down the verbosity.
Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because `sed` is line-oriented, for ease of implementation, when
chainlint.sed encounters an opening subshell in which the first command
is cuddled with the "(", it splits the line into two lines: one
containing only "(", and the other containing whatever follows "(".
This allows chainlint.sed to get by with a single set of regular
expressions for matching shell statements rather than having to
duplicate each expression (one set for matching non-cuddled statements,
and one set for matching cuddled statements).
However, although syntactically and semantically immaterial, this
transformation has no value to test authors and might even confuse them
into thinking that the linter is misbehaving by inserting (whitespace)
line-noise into the shell code it is validating. Moreover, it also
allows an implementation detail of chainlint.sed to seep into the
chainlint self-test "expect" files, which potentially makes it difficult
to reuse the self-tests should a more capable chainlint ever be
developed.
To address these concerns, stop splitting cuddled "(..." into two lines.
Note that, as an implementation artifact, due to sed's line-oriented
nature, this change inserts a blank line at output time just before the
"(..." line is emitted. It would be possible to suppress this blank line
but doing so would add a fair bit of complexity to chainlint.sed.
Therefore, rather than suppressing the extra blank line, the Makefile's
`check-chainlint` target which runs the chainlint self-tests is instead
modified to ignore blank lines when comparing chainlint output against
the self-test "expect" output. This is a reasonable compromise for two
reasons. First, the purpose of the chainlint self-tests is to verify
that the ?!AMP?! annotations are being correctly added; precise
whitespace is immaterial. Second, by necessity, chainlint.sed itself
already throws away all blank lines within subshells since, when
checking for a broken &&-chain, it needs to check the final _statement_
in a subshell, not the final _line_ (which might be blank), thus it has
never made any attempt to precisely reproduce blank lines in its output.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When checking for broken a &&-chain, chainlint.sed knows that the final
statement in a subshell should not end with `&&`, so it takes care to
make a distinction between the final line which is an actual statement
and any lines which may be mere comments preceding the closing ')'. As
such, it swallows comment lines so that they do not interfere with the
&&-chain check.
However, since `sed` does not provide any sort of real recursion,
chainlint.sed only checks &&-chains in subshells one level deep; it
doesn't do any checking in deeper subshells or in `{...}` blocks within
subshells. Furthermore, on account of potential implementation
complexity, it doesn't check &&-chains within `case` arms.
Due to an oversight, it also doesn't swallow comments inside deep
subshells, `{...}` blocks, or `case` statements, which makes its output
inconsistent (swallowing comments in some cases but not others).
Unfortunately, this inconsistency seeps into the chainlint self-test
"expect" files, which potentially makes it difficult to reuse the
self-tests should a more capable chainlint ever be developed. Therefore,
teach chainlint.sed to consistently swallow comments in all cases.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The purpose of chainlint is to highlight problems it finds in test code
by inserting annotations at the location of each problem. Arbitrarily
eliding bits of the code it is checking is not helpful, yet this is
exactly what chainlint.sed does by cavalierly and unnecessarily dropping
the here-doc operator and tag; i.e. `cat <<TAG` becomes simply `cat` in
the output. This behavior can make it more difficult for the test writer
to align the annotated output of chainlint.sed with the original test
code. Address this by retaining here-doc tags.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tighten here-doc recognition to prevent it from being fooled by text
which looks like a here-doc operator but happens merely to be the
content of a string, such as this real-world case from t7201:
echo "<<<<<<< ours" &&
echo ourside &&
echo "=======" &&
echo theirside &&
echo ">>>>>>> theirs"
This problem went unnoticed because chainlint.sed is not a real parser,
but rather applies heuristics to pretend to understand shell code. In
this case, it saw what it thought was a here-doc operator (`<< ours`),
and fell off the end of the test looking for the closing tag "ours"
which it never found, thus swallowed the remainder of the test without
checking it for &&-chain breakage.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to POSIX, "<<" and "<<-" are distinct shell operators. For the
latter to be recognized, no whitespace is allowed before the "-", though
whitespace is allowed after the operator. However, the chainlint
patterns which identify here-docs are both too loose and too tight,
incorrectly allowing whitespace between "<<" and "-" but disallowing it
between "-" and the here-doc tag. Fix the patterns to better match
POSIX.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
chainlint.sed inserts a ">" annotation at the beginning of a line to
signal that its heuristics have identified an end-of-subshell. This was
useful as a debugging aid during development of the script, but it has
no value to test writers and might even confuse them into thinking that
the linter is misbehaving by inserting line-noise into the shell code it
is validating. Moreover, its presence also potentially makes it
difficult to reuse the chainlint self-test "expect" output should a more
capable linter ever be developed. Therefore, drop the ">" annotation.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
>From inception, when chainlint.sed encountered a line using semicolon to
separate commands rather than `&&`, it would insert a ?!SEMI?!
annotation at the beginning of the line rather ?!AMP?! even though the
&&-chain is also broken by the semicolon. Given a line such as:
?!SEMI?! cmd1; cmd2 &&
the ?!SEMI?! annotation makes it easier to see what the problem is than
if the output had been:
?!AMP?! cmd1; cmd2 &&
which might confuse the test author into thinking that the linter is
broken (since the line clearly ends with `&&`).
However, now that the ?!AMP?! an ?!SEMI?! annotations are inserted at
the point of breakage rather than at the beginning of the line, and
taking into account that both represent a broken &&-chain, there is
little reason to distinguish between the two. Using ?!AMP?! alone is
sufficient to point the test author at the problem. For instance, in:
cmd1; ?!AMP?! cmd2 &&
cmd3
it is clear that the &&-chain is broken between `cmd1` and `cmd2`.
Likewise, in:
cmd1 && cmd2 ?!AMP?!
cmd3
it is clear that the &&-chain is broken between `cmd2` and `cmd3`.
Finally, in:
cmd1; ?!AMP?! cmd2 ?!AMP?!
cmd3
it is clear that the &&-chain is broken between each command.
Hence, there is no longer a good reason to make a distinction between a
broken &&-chain due to a semicolon and a broken chain due to a missing
`&&` at end-of-line. Therefore, drop the ?!SEMI?! annotation and use
?!AMP?! exclusively.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
chainlint.sed flags ";" when used as a command terminator since it
breaks the &&-chain, thus can allow failures to go undetected. However,
when a command terminated by ";" is the last command in the body of a
compound statement, such as `command-2` in:
if test $# -gt 1
then
command-1 &&
command-2;
fi
then the ";" is harmless and the exit code from `command-2` is passed
through untouched and becomes the exit code of the compound statement,
as if the ";" was not present. Therefore, tolerate a trailing ";" in
this position rather than complaining about broken &&-chain.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When chainlint.sed detects commands separated by a semicolon rather than
by `&&`, it places a ?!SEMI?! annotation at the beginning of the line.
However, this is an unusual location for programmers accustomed to error
messages (from compilers, for instance) indicating the exact point of
the problem. Therefore, relocate the ?!SEMI?! annotation to the location
of the semicolon in order to better direct the programmer's attention to
the source of the problem.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When chainlint.sed detects a broken &&-chain, it places an ?!AMP?!
annotation at the beginning of the line. However, this is an unusual
location for programmers accustomed to error messages (from compilers,
for instance) indicating the exact point of the problem. Therefore,
relocate the ?!AMP?! annotation to the end of the line in order to
better direct the programmer's attention to the source of the problem.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than running `chainlint` and `diff` once per self-test -- which
may become expensive as more tests are added -- instead run `chainlint`
a single time over all tests bodies collectively and compare the result
to the collective "expected" output.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The purpose of chainlint.sed is to detect &&-chain breakage only within
subshells (one level deep); it doesn't bother checking for top-level
&&-chain breakage since the &&-chain checker built into t/test-lib.sh
should detect broken &&-chains outside of subshells by making them
magically exit with code 117.
Unfortunately, one of the chainlint.sed self-tests has overly intimate
knowledge of this particular division of responsibilities and only cares
about what chainlint.sed itself will produce, while ignoring the fact
that a more all-encompassing linter would complain about a broken
&&-chain outside the subshell. This makes it difficult to re-use the
test with a more capable chainlint implementation should one ever be
developed. Therefore, adjust the test and its "expected" output to
avoid being specific to the tunnel-vision of this one implementation.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The purpose of chainlint.sed is to detect &&-chain breakage only within
subshells (one level deep); it doesn't bother checking for top-level
&&-chain breakage since the &&-chain checker built into t/test-lib.sh
should detect broken &&-chains outside of subshells by making them
magically exit with code 117. However, this division of labor may not
always be the case if a more capable chainlint implementation is ever
developed. Beyond that, due to being sed-based and due to its use of
heuristics, chainlint.sed has several limitations (such as being unable
to detect &&-chain breakage in subshells more than one level deep since
it only manually emulates recursion into a subshell).
Some of the comments in the chainlint self-tests unnecessarily reflect
the limitations of chainlint.sed even though those limitations are not
what is being tested. Therefore, simplify and generalize the comments to
explain only what is being tested, thus ensuring that they won't become
outdated if a more capable chainlint is ever developed.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The chainlint self-test code snippets are supposed to represent the body
of a test_expect_success() or test_expect_failure(), yet the contents of
a few tests would have caused the shell to report syntax errors had they
been real test bodies due to the mix of single- and double-quotes.
Although chainlint.sed, with its simplistic heuristics, is blind to this
problem, a future more robust chainlint implementation might not have
such a limitation. Therefore, stop mixing quote types haphazardly in
those tests and unify quoting throughout. While at it, drop chunks of
tests which merely repeat what is already tested elsewhere but with
alternative quotes.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The chainlint self-test code snippets are supposed to represent the body
of a test_expect_success() or test_expect_failure(), yet the contents of
these tests would have caused the shell to report syntax errors had they
been real test bodies. Although chainlint.sed, with its simplistic
heuristics, is blind to these syntactic problems, a future more robust
chainlint implementation might not have such a limitation, so make these
snippets syntactically valid.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Failures within `for` and `while` loops can go unnoticed if not detected
and signaled manually since the loop itself does not abort when a
contained command fails, nor will a failure necessarily be detected when
the loop finishes since the loop returns the exit code of the last
command it ran on the final iteration, which may not be the command
which failed. Therefore, detect and signal failures manually within
loops using the idiom `|| return 1` (or `|| exit 1` within subshells).
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Failures within `for` and `while` loops can go unnoticed if not detected
and signaled manually since the loop itself does not abort when a
contained command fails, nor will a failure necessarily be detected when
the loop finishes since the loop returns the exit code of the last
command it ran on the final iteration, which may not be the command
which failed. Therefore, detect and signal failures manually within
loops using the idiom `|| return 1` (or `|| exit 1` within subshells).
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Failures within `for` and `while` loops can go unnoticed if not detected
and signaled manually since the loop itself does not abort when a
contained command fails, nor will a failure necessarily be detected when
the loop finishes since the loop returns the exit code of the last
command it ran on the final iteration, which may not be the command
which failed. Therefore, detect and signal failures manually within
loops using the idiom `|| return 1` (or `|| exit 1` within subshells).
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Failures within `for` and `while` loops can go unnoticed if not detected
and signaled manually since the loop itself does not abort when a
contained command fails, nor will a failure necessarily be detected when
the loop finishes since the loop returns the exit code of the last
command it ran on the final iteration, which may not be the command
which failed. Therefore, detect and signal failures manually within
loops using the idiom `|| return 1` (or `|| exit 1` within subshells).
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than manually looping over a set of items and plugging those
items into a template string which is printed repeatedly, achieve the
same effect by taking advantage of `printf` which loops over its
arguments automatically.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than maintaining a flag indicating a failure within a loop and
aborting the test when the loop ends if the flag is set, modern practice
is to signal the failure immediately by exiting the loop early via
`return 1` (or `exit 1` if inside a subshell). Simplify these loops by
following the modern idiom.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify the way these tests signal failure by employing the modern
idiom of making the `if` or `case` statement resolve to false when an
error is detected.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The top-level &&-chain checker built into t/test-lib.sh causes tests to
magically exit with code 117 if the &&-chain is broken. However, it has
the shortcoming that the magic does not work within `{...}` groups,
`(...)` subshells, `$(...)` substitutions, or within bodies of compound
statements, such as `if`, `for`, `while`, `case`, etc. `chainlint.sed`
partly fills in the gap by catching broken &&-chains in `(...)`
subshells, but bugs can still lurk behind broken &&-chains in the other
cases.
Fix broken &&-chains in `{...}` groups in order to reduce the number of
possible lurking bugs.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The top-level &&-chain checker built into t/test-lib.sh causes tests to
magically exit with code 117 if the &&-chain is broken. However, it has
the shortcoming that the magic does not work within `{...}` groups,
`(...)` subshells, `$(...)` substitutions, or within bodies of compound
statements, such as `if`, `for`, `while`, `case`, etc. `chainlint.sed`
partly fills in the gap by catching broken &&-chains in `(...)`
subshells, but bugs can still lurk behind broken &&-chains in the other
cases.
Fix broken &&-chains in `$(...)` command substitutions in order to
reduce the number of possible lurking bugs.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The top-level &&-chain checker built into t/test-lib.sh causes tests to
magically exit with code 117 if the &&-chain is broken. However, it has
the shortcoming that the magic does not work within `{...}` groups,
`(...)` subshells, `$(...)` substitutions, or within bodies of compound
statements, such as `if`, `for`, `while`, `case`, etc. `chainlint.sed`
partly fills in the gap by catching broken &&-chains in `(...)`
subshells, but bugs can still lurk behind broken &&-chains in the other
cases.
Fix broken &&-chains in compound statements in order to reduce the
number of possible lurking bugs.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Take advantage of test_write_lines() to generate line-oriented output
rather than using for-loops or a series of `echo` commands. Not only is
test_write_lines() a natural fit for such a task, but there is less
opportunity for a broken &&-chain.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Take advantage of here-docs to create large blocks of text rather than
using a series of `echo` statements. Not only are here-docs a natural
fit for such a task, but there is less opportunity for a broken
&&-chain.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test intentionally breaks the &&-chain when using `expr` to parse
"[<path>]:<ref>" since the pattern matching operation will return 1
(failure) when <path> is empty even though an empty <path> is legitimate
in this test and should not cause the test to fail. However, it is
possible to parse the input without breaking the &&-chain by using shell
parameter expansion (i.e. `${i%%...}`). Other ways to avoid the problem
would be `{ expr $i : ... ||:; }` or test_might_fail(), however,
parameter expansion seems simplest.
IMPLEMENTATION NOTE
The rewritten `if` expression:
if test "$ref" = "${ref#refs/remotes/}"`; then continue; fi
is perhaps a bit subtle. At first glance, it looks like it will
`continue` the loop if $ref starts with "refs/remotes/", but in fact
it's the opposite: the loop will `continue` if $ref does not start with
"refs/remotes/".
In the original, `expr` would only match if the ref started with
"refs/remotes/", and $ref would end up empty if it didn't, so `test -z`
would `continue` the loop if the ref did not start with "refs/remotes/".
With parameter expansion, ${ref#refs/remotes/} attempts to strip
"refs/remotes/" from $ref. If it fails, meaning that $ref does not start
with "refs/remotes/", then the expansion will just be $ref unchanged,
and it will `continue` the loop. On the other hand, if stripping
succeeds, meaning that $ref begins with "refs/remotes/", then the
expansion will be the value of $ref with "refs/remotes/" removed, hence
`continue` will not be taken.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test populates its `expect` file solely by appending content but
fails to ensure that the file starts out empty. The test succeeds only
because no earlier test populated a file of the exact same name, however
this is an accident waiting to happen. Make the test more robust by
ensuring that it contains exactly the intended content.
While at it, simplify the implementation via a straightforward `sed`
application and by avoiding dropping out of the single-quote context
within the test body (thus eliminating a hard-to-digest combination of
apostrophes and backslashes).
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To create its "expect" file, this test pipes into `sort` the output of
`git for-each-ref` and a copy of that same output but with a minor
textual transformation applied. To do so, it employs a subshell and
commands `cat` and `sed` even though the same result can be accomplished
by `sed` alone (without a subshell).
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several tests assign the output of `$(...)` command substitution to an
"expect" variable, taking advantage of the fact that `$(...)` folds out
the final line terminator while leaving internal line terminators
intact. They do this because the "actual" string with which "expect"
will be compared is shaped the same way. However, this intent (having
internal line terminators, but no final line terminator) is not
necessarily obvious at first glance and may confuse casual readers. The
intent can be made more obvious by using `printf` instead, with which
line termination is stated clearly:
printf "sixth\nthird"
In fact, many other tests in this script already use `printf` for
precisely this purpose, thus it is an established pattern. Therefore,
convert these tests to employ `printf`, as well.
While at it, modernize the tests to use test_cmp() to compare the
expected and actual output rather than using the semi-deprecated
`verbose test "$x" = "$y"`.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although `exit 1` is the proper way to signal a test failure from within
a subshell, its use outside any subshell should be avoided since it
aborts the entire script rather than aborting only the failed test.
Instead, a simple `return 1` is the proper idiom for signaling failure
outside a subshell since it aborts only the test in question, not the
entire script.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Microsoft Windows, a directory name should never end with a period.
Quoting from Microsoft documentation[1]:
Do not end a file or directory name with a space or a period.
Although the underlying file system may support such names, the
Windows shell and user interface does not.
Naming a directory with a trailing period is indeed perilous:
% git init foo
% cd foo
% mkdir a.
% git status
warning: could not open directory 'a./': No such file or directory
The t1010 "setup" test:
for d in a a. a0
do
mkdir "$d" && echo "$d/one" >"$d/one" &&
git add "$d"
done &&
runs afoul of this Windows limitation, as can be observed when running
the test verbosely:
error: open("a./one"): No such file or directory
error: unable to index file 'a./one'
fatal: adding files failed
The reason this problem has gone unnoticed for so long is twofold.
First, the failed `git add` is swallowed silently because the loop is
not terminated explicitly by `|| return 1` to signal the failure.
Second, none of the tests in this script care about the literal
directory names ("a", "a.", "a0") or the specific number of tree
entries. They care instead about the order of entries in the tree, and
that the tree synthesized in the index and created by `git write-tree`
matches the tree created by the output of `git ls-tree` fed into `git
mktree`, thus the absence of "a./one" has no impact on the tests.
Skipping these tests on Windows by, for instance, checking the
FUNNYNAMES predicate would avoid the problem, however, the funny-looking
name is not what is being tested here. Rather, the tests are about
checking that `git mktree` produces stable results for various input
conditions, such as when the input order is not consistent or when an
object is missing.
Therefore, resolve the problem simply by using a directory name which is
legal on Windows and sorts the same as "a.". While at it, add the
missing `|| return 1` to the loop body in order to catch this sort of
problem in the future.
[1]: https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase -x" added an unnecessary 'exec' instructions before
'noop', which has been corrected.
* en/rebase-x-fix:
sequencer: avoid adding exec commands for non-commit creating commands
The single-key-input mode in "git add -p" had some code to handle
keys that generate a sequence of input via ReadKey(), which did not
handle end-of-file correctly, which has been fixed.
* cb/add-p-single-key-fix:
add -p: avoid use of undefined $key when ReadKey -> EOF
The completion script (in contrib/) learns that the "--date"
option of commands from the "git log" family takes "human" and
"auto" as valid values.
* yn/complete-date-format-options:
completion: add human and auto: date format
When a non-existent program is given as the pager, we tried to
reuse an uninitialized child_process structure and crashed, which
has been fixed.
* em/missing-pager:
pager: fix crash when pager program doesn't exist
"git submodule deinit" for a submodule whose .git metadata
directory is embedded in its working tree refused to work, until
the submodule gets converted to use the "absorbed" form where the
metadata directory is stored in superproject, and a gitfile at the
top-level of the working tree of the submodule points at it. The
command is taught to convert such submodules to the absorbed form
as needed.
* mp/absorb-submodule-git-dir-upon-deinit:
submodule: absorb git dir instead of dying on deinit
The function to cull a child process and determine the exit status
had two separate code paths for normal callers and callers in a
signal handler, and the latter did not yield correct value when the
child has caught a signal. The handling of the exit status has
been unified for these two code paths. An existing test with
flakiness has also been corrected.
* jk/t7006-sigpipe-tests-fix:
t7006: simplify exit-code checks for sigpipe tests
t7006: clean up SIGPIPE handling in trace2 tests
run-command: unify signal and regular logic for wait_or_whine()
"git fetch", when received a bad packfile, can fail with SIGPIPE.
This wasn't wrong per-se, but we now detect the situation and fail
in a more predictable way.
* jk/fetch-pack-avoid-sigpipe-to-index-pack:
fetch-pack: ignore SIGPIPE when writing to index-pack
Various operating modes of "git reset" have been made to work
better with the sparse index.
* vd/sparse-reset:
unpack-trees: improve performance of next_cache_entry
reset: make --mixed sparse-aware
reset: make sparse-aware (except --mixed)
reset: integrate with sparse index
reset: expand test coverage for sparse checkouts
sparse-index: update command for expand/collapse test
reset: preserve skip-worktree bit in mixed reset
reset: rename is_missing to !is_in_reset_tree
On platforms where ulong is shorter than size_t, code paths that
shifted 1 or 1U to the left lacked the necessary cast to size_t,
which have been corrected.
* po/size-t-for-vs:
object-file.c: LLP64 compatibility, upcast unity for left shift
diffcore-delta.c: LLP64 compatibility, upcast unity for left shift
repack.c: LLP64 compatibility, upcast unity for left shift
The advice message given by "git pull" when the user hasn't made a
choice between merge and rebase still said that the merge is the
default, which no longer is the case. This has been corrected.
* ah/advice-pull-has-no-preference-between-rebase-and-merge:
pull: don't say that merge is "the default strategy"
The code to decode the length of packed object size has been
corrected.
* jt/pack-header-lshift-overflow:
packfile: avoid overflowing shift during decode
The "merge" subcommand of "git jump" (in contrib/) silently ignored
pathspec and other parameters.
* jk/jump-merge-with-pathspec:
git-jump: pass "merge" arguments to ls-files
Build optimization.
* ab/generate-command-list:
generate-cmdlist.sh: don't parse command-list.txt thrice
generate-cmdlist.sh: replace "grep' invocation with a shell version
generate-cmdlist.sh: do not shell out to "sed"
generate-cmdlist.sh: stop sorting category lines
generate-cmdlist.sh: replace for loop by printf's auto-repeat feature
generate-cmdlist.sh: run "grep | sort", not "sort | grep"
generate-cmdlist.sh: don't call get_categories() from category_list()
generate-cmdlist.sh: spawn fewer processes
generate-cmdlist.sh: trivial whitespace change
command-list.txt: sort with "LC_ALL=C sort"
"git var GIT_DEFAULT_BRANCH" is a way to see what name is used for
the newly created branch if "git init" is run.
* tw/var-default-branch:
var: add GIT_DEFAULT_BRANCH variable
The "--date=format:<strftime>" gained a workaround for the lack of
system support for a non-local timezone to handle "%s" placeholder.
* jk/strbuf-addftime-seconds-since-epoch:
strbuf_addftime(): handle "%s" manually
CI has been taught to catch some Unicode directional formatting
sequence that can be used in certain mischief.
* js/ci-no-directional-formatting:
ci: disallow directional formatting
Redact the path part of packfile URI that appears in the trace output.
* if/redact-packfile-uri:
http-fetch: redact url on die() message
fetch-pack: redact packfile urls in traces
Doc update.
* ja/doc-cleanup:
init doc: --shared=0xxx does not give umask but perm bits
doc: git-init: clarify file modes in octal.
doc: git-http-push: describe the refs as pattern pairs
doc: uniformize <URL> placeholders' case
doc: use three dots for indicating repetition instead of star
doc: git-ls-files: express options as optional alternatives
doc: use only hyphens as word separators in placeholders
doc: express grammar placeholders between angle brackets
doc: split placeholders as individual tokens
doc: fix git credential synopsis
Code clean-up to eventually allow information on remotes defined
for an arbitrary repository to be read.
* gc/remote-with-fewer-static-global-variables:
remote: die if branch is not found in repository
remote: remove the_repository->remote_state from static methods
remote: use remote_state parameter internally
remote: move static variables into per-repository struct
t5516: add test case for pushing remote refspecs
Ensure that the sparseness of the in-core index matches the
index.sparse configuration specified by the repository immediately
after the on-disk index file is read.
* vd/sparse-sparsity-fix-on-read:
sparse-index: update do_read_index to ensure correct sparsity
sparse-index: add ensure_correct_sparsity function
sparse-index: avoid unnecessary cache tree clearing
test-read-cache.c: prepare_repo_settings after config init
Do a full ssh signing, find-principals and verify operation in the test
prereq's to make sure ssh-keygen works as expected. Only generating the
keys and verifying its presence is not sufficient in some situations.
One example was ssh-keygen creating unusable ssh keys in cygwin because
of unsafe default permissions for the key files. The other a broken
openssh 8.7 that segfaulted on any find-principals operation. This
extended prereq check avoids future test breakages in case ssh-keygen or
any environment behaviour changes.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Set the payload_type for check_signature() when generating merge messages to
verify merged tags signatures key lifetimes.
Implements the same tests as for verify-commit.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Set the payload_type for check_signature() when calling verify-tag.
Implements the same tests as for verify-commit.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Set the payload_type for check_signature() when calling git log.
Implements the same tests as for verify-commit.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If valid-before/after dates are configured for this signatures key in the
allowedSigners file then the verification should check if the key was valid at
the time the commit was made. This allows for graceful key rollover and
revoking keys without invalidating all previous commits.
This feature needs openssh > 8.8. Older ssh-keygen versions will simply
ignore this flag and use the current time.
Strictly speaking this feature is available in 8.7, but since 8.7 has a
bug that makes it unusable in another needed call we require 8.8.
Timestamp information is present on most invocations of check_signature.
However signer ident is not. We will need the signer email / name to be able
to implement "Trust on first use" functionality later.
Since the payload contains all necessary information we can parse it
from there. The caller only needs to provide us some info about the
payload by setting payload_type in the signature_check struct.
- Add payload_type field & enum and payload_timestamp to struct
signature_check
- Populate the timestamp when not already set if we know about the
payload type
- Pass -Overify-time={payload_timestamp} in the users timezone to all
ssh-keygen verification calls
- Set the payload type when verifying commits
- Add tests for expired, not yet valid and keys having a commit date
outside of key validity as well as within
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
if ssh-keygen supports -Overify-time, add test keys marked as expired,
not yet valid and valid both within the test_tick timeframe and outside of it.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To be able to extend the payload metadata with things like its creation
timestamp or the creators ident we remove the payload parameters to
check_signature() and use the already existing sigc->payload field
instead, only adding the length field to the struct. This also allows
us to get rid of the xmemdupz() calls in the verify functions. Since
sigc is now used to input data as well as output the result move it to
the front of the function list.
- Add payload_length to struct signature_check
- Populate sigc.payload/payload_len on all call sites
- Remove payload parameters to check_signature()
- Remove payload parameters to internal verify_* functions and use sigc
instead
- Remove xmemdupz() used for verbose output since payload is now already
populated.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some GPGSSH fmt-merge-msg tests were only grepping for failed/successful
signature validation and not checking for the tag in the resulting merge
message. Add the missing grep for it.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All the GPG and GPGSSH tests are redirecing stdout as well as stderr
to `actual` and grep for success/failure over the resulting file.
However, no output is printed on stderr and we do not need to
include it in the grep.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test intentionally breaks the &&-chain following `unset` since it
doesn't know if `unset` will succeed or fail and doesn't want a local
`unset` failure to abort the test overall. We can do better by using
sane_unset() which can be linked into the &&-chain as usual.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We no longer are dealing with a mixture of previous and desired
behavior, so simplify the tests a bit.
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
remove_dir_recurse(), and its non-static wrapper called
remove_dir_recursively(), both take flags for modifying its behavior.
As with the previous commits, we would generally like to protect
the original_cwd, but we want to forced user commands (e.g. 'git rm -rf
...') or other special cases to remove it. Add a flag for this purpose.
After reading through every caller of remove_dir_recursively() in the
current codebase, there was only one that should be adjusted and that
one only in a very unusual circumstance. Add a pair of new testcases to
highlight that very specific case involving submodules && --git-dir &&
--work-tree.
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Modern git often tries to avoid leaving empty directories around when
removing files. Originally, it did not bother. This behavior started
with commit 80e21a9ed8 (merge-recursive::removeFile: remove empty
directories, 2005-11-19), stating the reason simply as:
When the last file in a directory is removed as the result of a
merge, try to rmdir the now-empty directory.
This was reimplemented in C and renamed to remove_path() in commit
e1b3a2cad7 ("Build-in merge-recursive", 2008-02-07), but was still
internal to merge-recursive.
This trend towards removing leading empty directories continued with
commit d9b814cc97 (Add builtin "git rm" command, 2006-05-19), which
stated the reasoning as:
The other question is what to do with leading directories. The old
"git rm" script didn't do anything, which is somewhat inconsistent.
This one will actually clean up directories that have become empty
as a result of removing the last file, but maybe we want to have a
flag to decide the behaviour?
remove_path() in dir.c was added in 4a92d1bfb7 (Add remove_path: a
function to remove as much as possible of a path, 2008-09-27), because
it was noted that we had two separate implementations of the same idea
AND both were buggy. It described the purpose of the function as
a function to remove as much as possible of a path
Why remove as much as possible? Well, at the time we probably would
have said something like:
* removing leading directories makes things feel tidy
* removing leading directories doesn't hurt anything so long as they
had no files in them.
But I don't believe those reasons hold when the empty directory happens
to be the current working directory we inherited from our parent
process. Leaving the parent process in a deleted directory can cause
user confusion when subsequent processes fail: any git command, for
example, will immediately fail with
fatal: Unable to read current working directory: No such file or directory
Other commands may similarly get confused. Modify remove_path() so that
the empty leading directories it also deletes does not include the
current working directory we inherited from our parent process. I have
looked through every caller of remove_path() in the current codebase to
make sure that all should take this change.
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since stash spawns a `clean` subprocess, make sure we run that from the
startup_info->original_cwd directory, so that the `clean` processs knows
to protect that directory. Also, since the `clean` command might no
longer run from the toplevel, pass the ':/' magic pathspec to ensure we
still clean from the toplevel.
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since rebase spawns a `checkout` subprocess, make sure we run that from
the startup_info->original_cwd directory, so that the checkout process
knows to protect that directory.
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
symlinks has a pair of schedule_dir_for_removal() and
remove_scheduled_dirs() functions that ensure that directories made
empty by removing other files also themselves get removed. However, we
want to exclude startup_info->original_cwd and leave it around. This
avoids the user getting confused by subsequent git commands (and non-git
commands) that would otherwise report confusing messages about being
unable to read the current working directory.
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running commands such as `git reset --hard` from a subdirectory, if
that subdirectory is in the way of adding needed files, bail with an
error message.
Note that this change looks kind of like it duplicates the new lines of
code from the previous commit in verify_clean_subdirectory(). However,
when we are preserving untracked files, we would rather any error
messages about untracked files being in the way take precedence over
error messages about a subdirectory that happens to be the_original_cwd
being in the way. But in the UNPACK_RESET_OVERWRITE_UNTRACKED case,
there is no untracked checking to be done, so we simply add a special
case near the top of verify_absent_1.
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the past, when a directory needs to be removed to make room for a
file, we have always errored out when that directory contains any
untracked (but not ignored) files. Add an extra condition on that: also
error out if the directory is the current working directory we inherited
from our parent process.
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Removing the current working directory causes all subsequent git
commands run from that directory to get confused and fail with a message
about being unable to read the current working directory:
$ git status
fatal: Unable to read current working directory: No such file or directory
Non-git commands likely have similar warnings or even errors, e.g.
$ bash -c 'echo hello'
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
hello
This confuses end users, particularly since the command they get the
error from is not the one that caused the problem; the problem came from
the side-effect of some previous command.
We would like to avoid removing the current working directory of our
parent process; towards this end, introduce a new variable,
startup_info->original_cwd, that tracks the current working directory
that we inherited from our parent process. For convenience of later
comparisons, we prefer that this new variable store a path relative to
the toplevel working directory (thus much like 'prefix'), except without
the trailing slash.
Subsequent commits will make use of this new variable.
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Numerous commands will remove directories left empty as a "convenience"
after removing files within them. That is normally fine, but removing
the current working directory can be rather inconvenient since it can
cause confusion for the user when they run subsequent commands. For
example, after one git process has removed the current working
directory, git status/log/diff will all abort with the message:
fatal: Unable to read current working directory: No such file or directory
We also have code paths that, when a file needs to be placed where a
directory is (due to e.g. checkout, merge, reset, whatever), will check
if this is okay and error out if not. These rules include:
* all tracked files under that directory are intended to be removed by
the operation
* none of the tracked files under that directory have uncommitted
modification
* there are no untracked files under that directory
However, if we end up remove the current working directory, we can cause
user confusion when they run subsequent commands, so we would prefer if
there was a fourth rule added to this list: avoid removing the current
working directory.
Since there are several code paths that can result in the current
working directory being removed, add several tests of various different
codepaths. To make it clearer what the difference between the current
behavior and the behavior at the end of the series, code both of them
into the tests and have the appropriate behavior be selected by a flag.
Subsequent commits will toggle the flag from current to desired
behavior.
Also add a few tests suggested during the review of earlier rounds of
this patch series.
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Taking inspiration from xdl_classify_record() assign an id to each
addition and deletion such that lines that match for the current
--color-moved-ws mode share the same unique id. This reduces the
number of hash lookups a little (calculating the ids still involves
one hash lookup per line) but the main benefit is that when growing
blocks of potentially moved lines we can replace string comparisons
which involve chasing a pointer with a simple integer comparison. On a
large diff this commit reduces the time to run 'diff --color-moved' by
37% compared to the previous commit and 31% compared to master, for
'diff --color-moved-ws=allow-indentation-change' the reduction is 28%
compared to the previous commit and 96% compared to master. There is
little change in the performance of 'git log --patch' as the diffs are
smaller.
Test HEAD^ HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38(0.33+0.05) 0.38(0.33+0.05) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change 0.88(0.81+0.06) 0.55(0.50+0.04) -37.5%
4002.3: diff --color-moved-ws=allow-indentation-change large change 0.85(0.79+0.06) 0.61(0.54+0.06) -28.2%
4002.4: log --no-color-moved --no-color-moved-ws 1.16(1.07+0.08) 1.15(1.09+0.05) -0.9%
4002.5: log --color-moved --no-color-moved-ws 1.31(1.22+0.08) 1.29(1.19+0.09) -1.5%
4002.6: log --color-moved-ws=allow-indentation-change 1.32(1.24+0.08) 1.31(1.18+0.13) -0.8%
Test master HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38 (0.33+0.05) 0.38(0.33+0.05) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change 0.80 (0.75+0.04) 0.55(0.50+0.04) -31.2%
4002.3: diff --color-moved-ws=allow-indentation-change large change 14.20(14.15+0.05) 0.61(0.54+0.06) -95.7%
4002.4: log --no-color-moved --no-color-moved-ws 1.15 (1.05+0.09) 1.15(1.09+0.05) +0.0%
4002.5: log --color-moved --no-color-moved-ws 1.30 (1.19+0.11) 1.29(1.19+0.09) -0.8%
4002.6: log --color-moved-ws=allow-indentation-change 1.70 (1.63+0.06) 1.31(1.18+0.13) -22.9%
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes it clearer which fields are being explicitly initialized
and will simplify the next commit where we add a new field to the
struct.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As libxdiff does not have a whitespace flag to ignore the indentation
the code for --color-moved-ws=allow-indentation-change uses
XDF_IGNORE_WHITESPACE and then filters out any hash lookups where
there are non-indentation changes. This filtering is inefficient as
we have to perform another string comparison.
By using the offset data that we have already computed to skip the
indentation we can avoid using XDF_IGNORE_WHITESPACE and safely remove
the extra checks which improves the performance by 11% and paves the
way for the elimination of string comparisons in the next commit.
This change slightly increases the run time of other --color-moved
modes. This could be avoided by using different comparison functions
for the different modes but after the next two commits there is no
measurable benefit in doing so.
There is a change in behavior for lines that begin with a form-feed or
vertical-tab character. Since b46054b374 ("xdiff: use
git-compat-util", 2019-04-11) xdiff does not treat '\f' or '\v' as
whitespace characters. This means that lines starting with those
characters are never considered to be blank and never match a line
that does not start with the same character. After this patch a line
matching "^[\f\v\r]*[ \t]*$" is considered to be blank by
--color-moved-ws=allow-indentation-change and lines beginning
"^[\f\v\r]*[ \t]*" can match another line if the suffixes match. This
changes the output of git show for d18f76dccf ("compat/regex: use the
regex engine from gawk for compat", 2010-08-17) as some lines in the
pre-image before a moved block that contain '\f' are now considered
moved as well as they match a blank line before the moved lines in the
post-image. This commit updates one of the tests to reflect this
change.
Test HEAD^ HEAD
--------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38(0.33+0.05) 0.38(0.33+0.05) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change 0.86(0.82+0.04) 0.88(0.84+0.04) +2.3%
4002.3: diff --color-moved-ws=allow-indentation-change large change 0.97(0.94+0.03) 0.86(0.81+0.05) -11.3%
4002.4: log --no-color-moved --no-color-moved-ws 1.16(1.07+0.09) 1.16(1.06+0.09) +0.0%
4002.5: log --color-moved --no-color-moved-ws 1.32(1.26+0.06) 1.33(1.27+0.05) +0.8%
4002.6: log --color-moved-ws=allow-indentation-change 1.35(1.29+0.06) 1.33(1.24+0.08) -1.5%
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
moved_block_clear() was introduced in 74d156f4a1 ("diff
--color-moved-ws: fix double free crash", 2018-10-04) to free the
memory that was allocated when initializing a potential moved
block. However since 21536d077f ("diff --color-moved-ws: modify
allow-indentation-change", 2018-11-23) initializing a potential moved
block no longer allocates any memory. Up until the last commit we were
relying on moved_block_clear() to set the `match` pointer to NULL when
a block stopped matching, but since that commit we do not clear a
moved block that does not match so it does not make sense to clear
them elsewhere.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than setting `match` to NULL and then looping over the list of
potential matched blocks for a second time to remove blocks with no
matches just filter out the blocks with no matches as we go.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the last two commits pmb_advance_or_null() and
pmb_advance_or_null_multi_match() differ only in the comparison they
perform. Lets simplify the code by combining them into a single
function.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change will allow us to easily combine pmb_advance_or_null() and
pmb_advance_or_null_multi_match() in the next commit. Calling
xdiff_compare_lines() directly rather than using a function pointer
from the hash map has little effect on the run time.
Test HEAD^ HEAD
-------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38(0.35+0.03) 0.38(0.32+0.06) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change 0.87(0.83+0.04) 0.87(0.80+0.06) +0.0%
4002.3: diff --color-moved-ws=allow-indentation-change large change 0.97(0.92+0.04) 0.97(0.93+0.04) +0.0%
4002.4: log --no-color-moved --no-color-moved-ws 1.17(1.06+0.10) 1.16(1.10+0.05) -0.9%
4002.5: log --color-moved --no-color-moved-ws 1.32(1.24+0.08) 1.31(1.22+0.09) -0.8%
4002.6: log --color-moved-ws=allow-indentation-change 1.36(1.25+0.10) 1.35(1.25+0.10) -0.7%
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we already have a block of potentially moved lines then as we move
down the diff we need to check if the next line of each potentially
moved line matches the current line of the diff. The implementation of
--color-moved-ws=allow-indentation-change was needlessly performing
this check on all the lines in the diff that matched the current line
rather than just the current line. To exacerbate the problem finding
all the other lines in the diff that match the current line involves a
fuzzy lookup so we were wasting even more time performing a second
comparison to filter out the non-matching lines. Fixing this reduces
time to run
git diff --color-moved-ws=allow-indentation-change v2.28.0 v2.29.0
by 93% compared to master and simplifies the code.
Test HEAD^ HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38 (0.35+0.03) 0.38(0.35+0.03) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change 0.86 (0.80+0.06) 0.87(0.83+0.04) +1.2%
4002.3: diff --color-moved-ws=allow-indentation-change large change 19.01(18.93+0.06) 0.97(0.92+0.04) -94.9%
4002.4: log --no-color-moved --no-color-moved-ws 1.16 (1.06+0.09) 1.17(1.06+0.10) +0.9%
4002.5: log --color-moved --no-color-moved-ws 1.32 (1.25+0.07) 1.32(1.24+0.08) +0.0%
4002.6: log --color-moved-ws=allow-indentation-change 1.71 (1.64+0.06) 1.36(1.25+0.10) -20.5%
Test master HEAD
---------------------------------------------------------------------------------------------------------------
4002.1: diff --no-color-moved --no-color-moved-ws large change 0.38 (0.33+0.05) 0.38(0.35+0.03) +0.0%
4002.2: diff --color-moved --no-color-moved-ws large change 0.80 (0.75+0.04) 0.87(0.83+0.04) +8.7%
4002.3: diff --color-moved-ws=allow-indentation-change large change 14.20(14.15+0.05) 0.97(0.92+0.04) -93.2%
4002.4: log --no-color-moved --no-color-moved-ws 1.15 (1.05+0.09) 1.17(1.06+0.10) +1.7%
4002.5: log --color-moved --no-color-moved-ws 1.30 (1.19+0.11) 1.32(1.24+0.08) +1.5%
4002.6: log --color-moved-ws=allow-indentation-change 1.70 (1.63+0.06) 1.36(1.25+0.10) -20.0%
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we reliably end a block when the sign changes we don't need
the whitespace delta calculation to rely on the sign.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When marking moved lines it is possible for a block of potential
matched lines to extend past a change in sign when there is a sequence
of added lines whose text matches the text of a sequence of deleted
and added lines. Most of the time either `match` will be NULL or
`pmb_advance_or_null()` will fail when the loop encounters a change of
sign but there are corner cases where `match` is non-NULL and
`pmb_advance_or_null()` successfully advances the moved block despite
the change in sign.
One consequence of this is highlighting a short line as moved when it
should not be. For example
-moved line # Correctly highlighted as moved
+short line # Wrongly highlighted as moved
context
+moved line # Correctly highlighted as moved
+short line
context
-short line
The other consequence is coloring a moved addition following a moved
deletion in the wrong color. In the example below the first "+moved
line 3" should be highlighted as newMoved not newMovedAlternate.
-moved line 1 # Correctly highlighted as oldMoved
-moved line 2 # Correctly highlighted as oldMovedAlternate
+moved line 3 # Wrongly highlighted as newMovedAlternate
context # Everything else is highlighted correctly
+moved line 2
+moved line 3
context
+moved line 1
-moved line 3
These false matches are more likely when using --color-moved-ws with
the exception of --color-moved-ws=allow-indentation-change which ties
the sign of the current whitespace delta to the sign of the line to
avoid this problem. The fix is to check that the sign of the new line
being matched is the same as the sign of the line that started the
block of potential matches.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
b0a2ba4776 ("diff --color-moved=zebra: be stricter with color
alternation", 2018-11-23) sought to avoid using the alternate colors
unless there are two adjacent moved blocks of the same
sign. Unfortunately it contains two bugs that prevented it from fixing
the problem properly. Firstly `last_symbol` is reset at the start of
each iteration of the loop losing the symbol of the last line and
secondly when deciding whether to use the alternate color it should be
checking if the current line is the same sign of the last line, not a
different sign. The combination of the two errors means that we still
use the alternate color when we should do but we also use it when we
shouldn't. This is most noticable when using
--color-moved-ws=allow-indentation-change with hunks like
-this line gets indented
+ this line gets indented
where the post image is colored with newMovedAlternate rather than
newMoved. While this does not matter much, the next commit will change
the coloring to be correct in this case, so lets fix the bug here to
make it clear why the output is changing and add a regression test.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code is quite heavily indented and having it in its own function
simplifies an upcoming change.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a block of potentially moved lines is not long enough then the
DIFF_SYMBOL_MOVED_LINE flag is cleared on the matching lines so they
are not marked as moved. To avoid problems when we start rewinding
after an unsuccessful match in a couple of commits time make sure all
the move related flags are cleared, not just DIFF_SYMBOL_MOVED_LINE.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add some tests so we can monitor changes to the performance of the
move detection code. The tests record the performance --color-moved
and --color-moved-ws=allow-indentation-change for a large diff and a
sequence of smaller diffs. The range of commits used for the large
diff can be customized by exporting TEST_REV_A and TEST_REV_B when
running the test.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use "type array[];" syntax for the flex-array member at the end
of a struct under C99 or later, except when we are building with
older SUNPRO_C compilers. As we find more vendor compilers that
claim to grok C99 but not understand the flex-array syntax, the
existing "If we are using C99, but not with these compilers..."
conditional will keep growing.
Make it more manageable by listing vendor-specific exceptions
earlier, with the expectation that new exceptions will not be
combined into existing ones to make the condition longer, and
instead will be implemented as a new "#elif" in the cascade of
similar to old SUNPRO_C, we can just add a single line
#elif defined(_MSC_VER)
immediately before "#elif defined(__GNUC__)" to cause us to fallback
to the safer but a bit wasteful version.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating a subprocess with a temporary ODB, we set the
GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not
to update refs, since the tmp-objdir may go away.
Introduce a similar mechanism for in-process temporary ODBs when
we call tmp_objdir_replace_primary_odb. Now both mechanisms set
the disable_ref_updates flag on the odb, which is queried by
the ref_transaction_prepare function.
Peff's test case [1] was invoking ref updates via the cachetextconv
setting. That particular code silently does nothing when a ref
update is forbidden. See the call to notes_cache_put in
fill_textconv where errors are ignored.
[1] https://lore.kernel.org/git/YVOn3hDsb5pnxR53@coredump.intra.peff.net/
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tmp_objdir API provides the ability to create temporary object
directories, but was designed with the goal of having subprocesses
access these object stores, followed by the main process migrating
objects from it to the main object store or just deleting it. The
subprocesses would view it as their primary datastore and write to it.
Here we add the tmp_objdir_replace_primary_odb function that replaces
the current process's writable "main" object directory with the
specified one. The previous main object directory is restored in either
tmp_objdir_migrate or tmp_objdir_destroy.
For the --remerge-diff usecase, add a new `will_destroy` flag in `struct
object_database` to mark ephemeral object databases that do not require
fsync durability.
Add 'git prune' support for removing temporary object databases, and
make sure that they have a name starting with tmp_ and containing an
operation-specific name.
Based-on-patch-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The point of DEVELOPER=1 is to turn up the warnings so we can catch
portability or correctness mistakes at the compiler level. But since
modern compilers tend to default to modern standards like gnu17, we
might miss warnings about older standards, even though we expect Git to
build with compilers that use them.
So it's helpful for developer builds to set the -std argument to our
lowest-common denominator. Traditionally this was c89, but since we're
moving to assuming c99 in 7bc341e21b (git-compat-util: add a test
balloon for C99 support, 2021-12-01) that seems like a good spot to
land. And as explained in that commit, we want "gnu99" because we still
want to take advantage of some extensions when they're available.
The new argument kicks in only for clang and gcc (which we know to
support "-std=" and "gnu" standards). And only for compiler versions
which default to a newer standard. That will avoid accidentally
silencing any build problems that non-developers would run into on older
compilers that default to c89.
My digging found that the default switched to gnu11 in gcc 5.1.0.
Clang's documentation is less clear, but has done so since at least
clang-7. So that's what I put in the conditional here. It's OK to err on
the side of not-enabling this for older compilers. Most developers (as
well as CI) are using much more recent versions, so any warnings will
eventually surface.
A concrete example is anonymous unions, which became legal in c11.
Without this patch, "gcc -pedantic" will not complain about them, but
will if we add in "-std=gnu99".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a segfault in the --set-upstream option added in
24bc1a1292 (pull, fetch: add --set-upstream option, 2019-08-19) added
in v2.24.0.
The code added there did not do the same checking we do for "git
branch" itself since 8efb8899cf (branch: segfault fixes and
validation, 2013-02-23), which in turn fixed the same sort of segfault
I'm fixing now in "git branch --set-upstream-to", see
6183d826ba (branch: introduce --set-upstream-to, 2012-08-20).
The warning message I'm adding here is an amalgamation of the error
added for "git branch" in 8efb8899cf, and the error output
install_branch_config() itself emits, i.e. it trims "refs/heads/" from
the name and says "branch X on remote", not "branch refs/heads/X on
remote".
I think it would make more sense to simply die() here, but in the
other checks for --set-upstream added in 24bc1a1292 we issue a
warning() instead. Let's do the same here for consistency for now.
There was an earlier submitted alternate way of fixing this in [1],
due to that patch breaking threading with the original report at [2] I
didn't notice it before authoring this version. I think the more
detailed warning message here is better, and we should also have tests
for this behavior.
The --no-rebase option to "git pull" is needed as of the recently
merged 7d0daf3f12 (Merge branch 'en/pull-conflicting-options',
2021-08-30).
1. https://lore.kernel.org/git/20210706162238.575988-1-clemens@endorphin.org/
2. https://lore.kernel.org/git/CAG6gW_uHhfNiHGQDgGmb1byMqBA7xa8kuH1mP-wAPEe5Tmi2Ew@mail.gmail.com/
Reported-by: Clemens Fruhwirth <clemens@endorphin.org>
Reported-by: Jan Pokorný <poki@fnusa.cz>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cb_unlink is broken once a node is no longer self-referential
due to subsequent insertions. This is a consequence of an
intrusive implementation and I'm not sure if it's easily fixable
while retaining our cache-friendly intrusive property (I've
tried for several hours in another project).
In any case, we're not using cb_unlink anywhere in our codebase,
just get rid of it to avoid misleading future readers.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the git_die_config() function added in 5a80e97c82 (config: add
`git_die_config()` to the config-set API, 2014-08-07) to use the
public callbacks in the usage.[ch] API instead of the the underlying
vreportf() function.
In preceding commits the rest of the vreportf() users outside of
usage.c was migrated to die_message(), so we can now make it "static".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "error: " output when we exit with 128 due to gc.log errors
to use a "fatal: " prefix instead. To do this add a
die_message_errno() a sibling function to the die_errno() added in a
preceding commit.
Before this we'd expect report_last_gc_error() to return -1 from
error_errno() in this case. It already treated a status of 0 and 1
specially. Let's just document that anything that's not 0 or 1 should
be returned.
We could also retain the "ret < 0" behavior here without hardcoding
128 by returning -128, and having the caller do a "return -ret", but I
think this makes more sense, and preserves the path from
die_message*()'s return value to the "return" without hardcoding
"128".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A minor code cleanup. Let's "return" from cmd_gc() instead of calling
exit(). See 338abb0f04 (builtins + test helpers: use return instead
of exit() in cmd_*, 2021-06-08) for other such cases.
While we're at it add a \n to separate the variable declaration from
the rest of the code in this block. Both of these changes make a
subsequent change smaller and easier to read.
This change isn't really needed for that subsequent change, but now
someone viewing that future behavior change won't need to wonder why
we're either still calling exit() here, or fixing it while we're at
it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue the migration of code that printed a message and exited with
128. In this case the caller used "error()", so we'll be changing the
output from "error: " to "fatal: ". This change is intentional and
desired.
This code is dying, so it should emit "fatal", the only reason it
didn't do so was because before the existence of "die_message()" it
would have needed to craft its own "fatal: " message.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code that printed its own "fatal: " message and exited with a
status code of 128 to use the die_message() function added in a
preceding commit.
This change also demonstrates why the return value of
die_message_routine() needed to be that of "report_fn". We have
callers such as the run-command.c::child_err_spew() which would like
to replace its error routine with the return value of
"get_die_message_routine()".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have code in various places that would like to call die(), but
wants to defer the exit(128) it would invoke, e.g. to print an
additional message, or adjust the exit code. Add a die_message()
helper routine to bridge this gap in the API.
Functionally this behaves just like the error() routine, except it'll
print a "fatal: " prefix, and it will return with 128 instead of -1,
this is so that caller can pass the return value to "exit()", instead
of having to hardcode "exit(128)".
Note that as with the other routines the "die_message_builtin" needs
to return "void" and otherwise conform to the "report_fn"
signature.
As we'll see in a subsequent commit callers will want to replace
e.g. their default "die_routine" with a "die_message_routine".
For now we're just adding the routine and making die_builtin() in
usage.c itself use it. In order to do that we need to add a
get_die_message_routine() function, which works like the other
get_*_routine() functions in usage.c. There is no
set_die_message_rotine(), as it hasn't been needed yet. We can add it
if we ever need it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This still leaves some other direct filesystem access. Currently, the files
backend does not allow invalidly named symrefs. Fixes for this are currently in
the 'seen' branch
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use this flag with the test-helper in t1430, to avoid direct writes to the ref
database.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This lets the ref-store test helper write non-existent or unparsable objects
into the ref storage.
Use this to make t1006 and t3800 independent of the files storage backend.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
REF_IS_PRUNING is right below this comment, so it clearly does not belong in
this comment. This was apparently introduced in commit 5ac95fee (Nov 5, 2017
"refs: tidy up and adjust visibility of the `ref_update` flags").
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This lets tests use REF_XXXX constants instead of hardcoded integers. The flag
names should be separated by a ','.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Nobody uses force_create=0, so this flag is unnecessary.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust code added in 7463064b28 (object.h: add
lookup_object_by_type() function, 2021-06-22) to use the BUG()
function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code that was added in 8f4f8f4579 (guard against new pathspec
magic in pathspec matching code, 2013-07-14) to use the BUG() macro
instead of emitting a "fatal" message with the "__FILE__"-name and
"__LINE__"-numbers.
The original code predated the existence of the BUG() function, which
was added in d8193743e0 (usage.c: add BUG() function, 2017-05-12).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 7141efab24 (strbuf: clarify assertion in strbuf_setlen(),
2011-04-27) this 'die("BUG: "' invocation was added with the rationale
that strbuf.c had existing users doing the same, but those users were
later changed to use BUG() in 033abf97fc (Replace all die("BUG: ...")
calls by BUG() ones, 2018-05-02). Let's do the same here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change this code added in da93d12b00 (pack-objects: be incredibly
anal about stdio semantics, 2006-04-02) to use BUG() instead.
See 1a07e59c3e (Update messages in preparation for i18n, 2018-07-21)
for when the "BUG: " prefix was added, and [1] for background on the
Solaris behavior that prompted the exhaustive error checking in this
fgets() loop.
1. https://lore.kernel.org/git/824.1144007555@lotus.CS.Berkeley.EDU/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the main() function to call "exit()" instead of ending with a
"return" statement. The "exit()" function is our own wrapper that
calls trace2_cmd_exit_fl() for us, from git-compat-util.h:
#define exit(code) exit(trace2_cmd_exit_fl(__FILE__, __LINE__, (code)))
That "exit()" wrapper has been in use ever since ee4512ed48 (trace2:
create new combined trace facility, 2019-02-22).
This changes nothing about how we "exit()", as we'd invoke
"trace2_cmd_exit_fl()" in both cases due to the wrapper, this change
makes it easier to reason about this code, as we're now always
obviously relying on our "exit()" wrapper.
There is already code immediately downstream of our "main()" which has
a hard reliance on that, e.g. the various "exit()" calls downstream of
"cmd_main()" in "git.c".
We even had a comment in "t/helper/test-trace2.c" that seemed to be
confused about how the "exit()" wrapper interacted with uses of
"return", even though it was introduced in the same trace2 series in
a15860dca3 (trace2: t/helper/test-trace2, t0210.sh, t0211.sh,
t0212.sh, 2019-02-22), after the aforementioned ee4512ed48. Perhaps
it pre-dated the "exit()" wrapper?
This change makes the "trace2_cmd_exit()" macro orphaned, we now
always use "trace2_cmd_exit_fl()" directly, but let's keep that
simpler example in place. Even if we're unlikely to get another
"main()" other than the one in our "common-main.c", there's some value
in having the API documentation and example discuss a simpler version
that doesn't require an "exit()" wrapper macro.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable the sparse index for the 'git blame' command. The index was already
not expanded with this command, so the most interesting thing to do is to
add tests that verify that 'git blame' behaves correctly when the sparse
index is enabled and that its performance improves. More specifically, these
cases are:
1. The index is not expanded for 'blame' when given paths in the sparse
checkout cone at multiple levels.
2. Performance measurably improves for 'blame' with sparse index when given
paths in the sparse checkout cone at multiple levels.
The `p2000` tests demonstrate a ~60% execution time reduction when running
'blame' for a file two levels deep and and a ~30% execution time reduction
for a file three levels deep.
Test before after
----------------------------------------------------------------
2000.62: git blame f2/f4/a (full-v3) 0.31 0.32 +3.2%
2000.63: git blame f2/f4/a (full-v4) 0.29 0.31 +6.9%
2000.64: git blame f2/f4/a (sparse-v3) 0.55 0.23 -58.2%
2000.65: git blame f2/f4/a (sparse-v4) 0.57 0.23 -59.6%
2000.66: git blame f2/f4/f3/a (full-v3) 0.77 0.85 +10.4%
2000.67: git blame f2/f4/f3/a (full-v4) 0.78 0.81 +3.8%
2000.68: git blame f2/f4/f3/a (sparse-v3) 1.07 0.72 -32.7%
2000.99: git blame f2/f4/f3/a (sparse-v4) 1.05 0.73 -30.5%
We do not include paths outside the sparse checkout cone because blame
does not support blaming files that are not present in the working
directory. This is true in both sparse and full checkouts.
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable the sparse index within the 'git diff' command. Its implementation
already safely integrates with the sparse index because it shares code
with the 'git status' and 'git checkout' commands that were already
integrated. For more details see:
d76723ee53 (status: use sparse-index throughout, 2021-07-14)
1ba5f45132 (checkout: stop expanding sparse indexes, 2021-06-29)
The most interesting thing to do is to add tests that verify that 'git
diff' behaves correctly when the sparse index is enabled. These cases are:
1. The index is not expanded for 'diff' and 'diff --staged'
2. 'diff' and 'diff --staged' behave the same in full checkout, sparse
checkout, and sparse index repositories in the following partially-staged
scenarios (i.e. the index, HEAD, and working directory differ at a given
path):
1. Path is within sparse-checkout cone
2. Path is outside sparse-checkout cone
3. A merge conflict exists for paths outside sparse-checkout cone
The `p2000` tests demonstrate a ~44% execution time reduction for 'git
diff' and a ~86% execution time reduction for 'git diff --staged' using a
sparse index:
Test before after
-------------------------------------------------------------
2000.30: git diff (full-v3) 0.33 0.34 +3.0%
2000.31: git diff (full-v4) 0.33 0.35 +6.1%
2000.32: git diff (sparse-v3) 0.53 0.31 -41.5%
2000.33: git diff (sparse-v4) 0.54 0.29 -46.3%
2000.34: git diff --cached (full-v3) 0.07 0.07 +0.0%
2000.35: git diff --cached (full-v4) 0.07 0.08 +14.3%
2000.36: git diff --cached (sparse-v3) 0.28 0.04 -85.7%
2000.37: git diff --cached (sparse-v4) 0.23 0.03 -87.0%
Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace uses of the synonym --staged in t1092 tests with --cached (which
is the real and preferred option). This will allow consistency in the new
tests to be added with the upcoming change to enable the sparse index for
diff.
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check whether git directory exists before adding any repo settings. If it
does not exist, BUG with the message that one cannot add settings for an
uninitialized repository. If it does exist, proceed with adding repo
settings.
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move repo setup to occur after git directory is set up. This will protect
against test failures in the upcoming change to BUG in
prepare_repo_settings if no git directory exists.
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Return early if git directory does not exist. This will protect against
test failures in the upcoming change to BUG in prepare_repo_settings if no
git directory exists.
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ensure correct git directory setup when -h is passed with commands. This
specifically applies to repos with special help text configuration
variables and to commands run with -h outside a repository. This
will also protect against test failures in the upcoming change to BUG in
prepare_repo_settings if no git directory exists.
Note: this diff is better seen when ignoring whitespace changes.
Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse_dir_matches_path() method compares a cache entry that is a
sparse directory entry against a 'struct traverse_info *info' and a
'struct name_entry *p' to see if the cache entry has exactly the right
name for those other inputs.
This method was introduced in 523506d (unpack-trees: unpack sparse
directory entries, 2021-07-14), but included a significant mistake. The
path comparisons used 'info->name' instead of 'info->traverse_path'.
Since 'info->name' only stores a single tree entry name while
'info->traverse_path' stores the full path from root, this method does
not work when 'info' is in a subdirectory of a directory. Replacing the
right strings and their corresponding lengths make the method work
properly.
The previous change included a failing test that exposes this issue.
That test now passes. The critical detail is that as we go deep into
unpack_trees(), the logic for merging a sparse directory entry with a
tree entry during 'git checkout' relies on this
sparse_dir_matches_path() in order to avoid calling
traverse_trees_recursive() during unpack_callback() in this hunk:
if (!is_sparse_directory_entry(src[0], names, info) &&
traverse_trees_recursive(n, dirmask, mask & ~dirmask,
names, info) < 0) {
return -1;
}
For deep paths, the short-circuit never occurred and
traverse_trees_recursive() was being called incorrectly and that was
causing other strange issues. Specifically, the error message from the
now-passing test previously included this:
error: Your local changes to the following files would be overwritten by checkout:
deep/deeper1/deepest2/a
deep/deeper1/deepest3/a
Please commit your changes or stash them before you switch branches.
Aborting
These messages occurred because the 'current' cache entry in
twoway_merge() was showing as NULL because the index did not contain
entries for the paths contained within the sparse directory entries. We
instead had 'oldtree' given as the entry at HEAD and 'newtree' as the
entry in the target tree. This led to reject_merge() listing these
paths.
Now that sparse_dir_matches_path() works the same for deep paths as it
does for shallow depths, the rest of the logic kicks in to properly
handle modifying the sparse directory entries as designed.
Reported-by: Gustave Granroth <gus.gran@gmail.com>
Reported-by: Mike Marcelais <michmarc@exchange.microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the repository data in the setup of t1092 to include more
directories within two parent directories. This reproduces a bug found
by users of the sparse index feature with suitably-complicated
sparse-checkout definitions.
Add a failing test that fails in its first 'git checkout deepest' run in
the sparse index case with this error:
error: Your local changes to the following files would be overwritten by checkout:
deep/deeper1/deepest2/a
deep/deeper1/deepest3/a
Please commit your changes or stash them before you switch branches.
Aborting
The next change will fix this error, and that fix will make it clear why
the extra depth is necessary for revealing this bug. The assignment of
the sparse-checkout definition to include deep/deeper1/deepest as a
sibling directory is important to ensure that deep/deeper1 is not a
sparse directory entry, but deep/deeper1/deepest2 is.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We documented that with grep.patternType set to default, the "git
grep" command returns to "the default matching behavior" in 84befcd0
(grep: add a grep.patternType configuration setting, 2012-08-03).
The grep.extendedRegexp configuration variable was the only way to
configure the behavior before that, after b22520a3 (grep: allow -E
and -n to be turned on by default via configuration, 2011-03-30)
introduced it.
It is understandable that we referred to the behavior that honors
the older configuration variable as "the default matching"
behavior. It is fairly clear in its log message:
When grep.patternType is set to a value other than "default", the
grep.extendedRegexp setting is ignored. The value of "default" restores
the current default behavior, including the grep.extendedRegexp
behavior.
But when the paragraph is read in isolation by a new person who is
not aware of that backstory (which is the synonym for "most users"),
the "default matching behavior" can be read as "how 'git grep'
behaves without any configuration variables or options", which is
"match the pattern as BRE".
Clarify what the passage means by elaborating what the phrase
"default matching behavior" wanted to mean.
Helped-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A couple of test scripts have actually been adapted to accommodate for a
configurable default branch name, but they still overrode it via the
`GIT_TEST_*` variable. Let's drop that override where possible.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commands executed from `git rebase --exec` can give different behavior
from within that environment than they would outside of it, due to the
fact that sequencer.c exports both GIT_DIR and GIT_WORK_TREE. For
example, if the relevant script calls something like
git -C ../otherdir log --format=%H --no-walk
the user may be surprised to find that the command above does not show a
commit hash from ../otherdir, because $GIT_DIR prevents automatic gitdir
detection and makes the -C option useless.
This is a regression in behavior from the original legacy
implemented-in-shell rebase. It is perhaps rare for it to cause
problems in practice, especially since most small problems that were
caused by this area of bugs has been fixed-up in the past in a way that
masked the particular bug observed without fixing the real underlying
problem.
An explanation of how we arrived at the current situation is perhaps
merited. The setting of GIT_DIR and GIT_WORK_TREE done by sequencer.c
arose from a sequence of historical accidents:
* When rebase was implemented as a shell command, it would call
git-sh-setup, which among other things would set GIT_DIR -- but not
export it. This meant that when rebase --exec commands were run via
/bin/sh -c "$COMMAND"
they would not inherit the GIT_DIR setting. The fact that GIT_DIR
was not set in the run $COMMAND is the behavior we'd like to restore.
* When the rebase--helper builtin was introduced to allow incrementally
replacing shell with C code, we had an implementation that was half
shell, half C. In particular, commit 18633e1a22 ("rebase -i: use the
rebase--helper builtin", 2017-02-09) added calls to
exec git rebase--helper ...
which caused rebase--helper to inherit the GIT_DIR environment
variable from the shell. git's setup would change the environment
variable from an absolute path to a relative one (".git"), but would
leave it set. This meant that when rebase --exec commands were run
via
run_command_v_opt(...)
they would inherit the GIT_DIR setting.
* In commit 09d7b6c6fa ("sequencer: pass absolute GIT_DIR to exec
commands", 2017-10-31), it was noted that the GIT_DIR caused problems
with some commands; e.g.
git rebase --exec 'cd subdir && git describe' ...
would have GIT_DIR=.git which was invalid due to the change to the
subdirectory. Instead of questioning why GIT_DIR was set, that commit
instead made sequencer change GIT_DIR to be an absolute path and
explicitly export it via
argv_array_pushf(&child_env, "GIT_DIR=%s", absolute_path(get_git_dir()));
run_command_v_opt_cd_env(..., child_env.argv)
* In commit ab5e67d751 ("sequencer: pass absolute GIT_WORK_TREE to exec
commands", 2018-07-14), it was noted that when GIT_DIR is set but
GIT_WORK_TREE is not, that we do not discover GIT_WORK_TREE but just
assume it is '.'. That is incorrect if trying to run commands from a
subdirectory. However, rather than question why GIT_DIR was set, that
commit instead also added GIT_WORK_TREE to the list of things to
export.
Each of the above problems would have been fixed automatically when
git-rebase became a full builtin, had it not been for the fact that
sequencer.c started exporting GIT_DIR and GIT_WORK_TREE in the interim.
Stop exporting them now.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Johannes Altmanninger <aclopte@gmail.com>
Acked-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
name-rev has a MERGE_TRAVERSAL_WEIGHT to say that traversing a second or
later parent of a merge should be 65535 times more expensive than a
first-parent traversal, as per ac076c29ae (name-rev: Fix non-shortest
description, 2007-08-27). The point of this weight is to prefer names
like
v2.32.0~1471^2
over names like
v2.32.0~43^2~15^2~11^2~20^2~31^2
which are two equally valid names in git.git for the same commit. Note
that the first follows 1472 parent traversals compared to a mere 125 for
the second. Weighting all traversals equally would clearly prefer the
second name since it has fewer parent traversals, but humans aren't
going to be traversing commits and they tend to have an easier time
digesting names with fewer segments. The fact that the former only has
two segments (~1471, ^2) makes it much simpler than the latter which has
six segments (~43, ^2, ~15, etc.). Since name-rev is meant to "find
symbolic names suitable for human digestion", we prefer fewer segments.
However, the particular rule implemented in name-rev would actually
prefer
v2.33.0-rc0~11^2~1
over
v2.33.0-rc0~20^2
because both have precisely one second parent traversal, and it gives
the tie breaker to shortest number of total parent traversals. Fewer
segments is more important for human consumption than number of hops, so
we'd rather see the latter which has one fewer segment.
Include the generation in is_better_name() and use a new
effective_distance() calculation so that we prefer fewer segments in
the printed name over fewer total parent traversals performed to get the
answer.
== Side-note on tie-breakers ==
When there are the same number of segments for two different names, we
actually use the name of an ancestor commit as a tie-breaker as well.
For example, for the commit cbdca289fb in the git.git repository, we
prefer the name v2.33.0-rc0~112^2~1 over v2.33.0-rc0~57^2~5. This is
because:
* cbdca289fb is the parent of 25e65b6dd5, which implies the name for
cbdca289fb should be the first parent of the preferred name for
25e65b6dd5
* 25e65b6dd5 could be named either v2.33.0-rc0~112^2 or
v2.33.0-rc0~57^2~4, but the former is preferred over the latter due
to fewer segments
* combine the two previous facts, and the name we get for cbdca289fb
is "v2.33.0-rc0~112^2~1" rather than "v2.33.0-rc0~57^2~5".
Technically, we get this for free out of the implementation since we
only keep track of one name for each commit as we walk history (and
re-add parents to the queue if we find a better name for those parents),
but the first bullet point above ensures users get results that feel
more consistent.
== Alternative Ideas and Meanings Discussed ==
One suggestion that came up during review was that shortest
string-length might be easiest for users to consume. However, such a
scheme would be rather computationally expensive (we'd have to track all
names for each commit as we traversed the graph) and would additionally
come with the possibly perplexing result that on a linear segment of
history we could rapidly swap back and forth on names:
MYTAG~3^2 would be preferred over MYTAG~9998
MYTAG~3^2~1 would NOT be preferred over MYTAG~9999
MYTAG~3^2~2 might be preferred over MYTAG~10000
Another item that came up was possible auxiliary semantic meanings for
name-rev results either before or after this patch. The basic answer
was that the previous implementation had no known useful auxiliary
semantics, but that for many repositories (most in my experience), the
new scheme does. In particular, the new name-rev output can often be
used to answer the question, "How or when did this commit get merged?"
Since that usefulness depends on how merges happen within the repository
and thus isn't universally applicable, details are omitted here but you
can see them at [1].
[1] https://lore.kernel.org/git/CABPp-BEeUM+3NLKDVdak90_UUeNghYCx=Dgir6=8ixvYmvyq3Q@mail.gmail.com/
Finally, it was noted that the algorithm could be improved by just
explicitly tracking the number of segments and using both it and
distance in the comparison, instead of giving a magic number that tries
to blend the two (and which therefore might give suboptimal results in
repositories with really huge numbers of commits that periodically merge
older code). However, "[this patch] seems to give us a much better
results than the current code, so let's take it and leave further
futzing outside the scope."
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 6b13bc3232 (xdiff: simplify comparison, 2021-11-17), we do not
call xdl_recmatch() from xdiffi.c's recs_match(), so we no longer need
the "flags" parameter. That in turn lets us drop the flags parameters
from the group-slide functions which call it.
There's no functional change here; it's just making these functions a
little simpler.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 663c5ad035 (diff histogram: intern strings, 2021-11-17), our
cmp_recs() does not call xdl_recmatch(), and thus no longer needs an
xpparam_t struct from which to get the flags. We can drop the unused
parameter from the function, as well as the macro which wraps it.
There's no functional change here; it's just simplifying things (and
making -Wunused-parameter happier).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This macro has been unused since 43ca7530df (xdiff/xhistogram: rely on
xdl_trim_ends(), 2011-08-01). The function that called it went away
there, and its use in the CMP() macro was inlined. It probably should
have been deleted then, but nobody noticed.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When verbose mode was added to `git worktree list` by 076b444a62
(worktree: teach `list` verbose mode, 2021-01-27), although the
documentation was updated to reflect the new functionality, the
synopsis was overlooked. Correct this minor oversight.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The order in which the stdout and stderr streams are flushed is not
guaranteed to be the same across platforms or `libc` implementations.
This lack of determinism can lead to anomalous and potentially confusing
output if normal (stdout) output is flushed after error (stderr) output.
For instance, the following output which clearly indicates a failure due
to a fatal error:
% git worktree add ../foo bar
Preparing worktree (checking out 'bar')
fatal: 'bar' is already checked out at '.../wherever'
has been reported[1] on Microsoft Windows to appear as:
% git worktree add ../foo bar
fatal: 'bar' is already checked out at '.../wherever'
Preparing worktree (checking out 'bar')
which may confuse the reader into thinking that the command somehow
recovered and ran to completion despite the error.
This problem crops up because the "chatty" status message "Preparing
worktree" is sent to stdout, whereas the "fatal" error message is sent
to stderr. One way to fix this would be to flush stdout manually before
git-worktree reports any errors to stderr.
However, common practice in Git is for "chatty" messages to be sent to
stderr. Therefore, a more appropriate fix is to adjust git-worktree to
conform to that practice by sending its "chatty" messages to stderr
rather than stdout as is currently the case.
There may be concern that relocating messages from stdout to stderr
could break existing tooling, however, these messages are already
internationalized, thus are unstable. And, indeed, the "Preparing
worktree" message has already been the subject of somewhat significant
changes in 2c27002a0a (worktree: improve message when creating a new
worktree, 2018-04-24). Moreover, there is existing precedent, such as
68b939b2f0 (clone: send diagnostic messages to stderr, 2013-09-18) which
likewise relocated "chatty" messages from stdout to stderr for
git-clone.
[1]: https://lore.kernel.org/git/CA+34VNLj6VB1kCkA=MfM7TZR+6HgqNi5-UaziAoCXacSVkch4A@mail.gmail.com/T/
Reported-by: Baruch Burstein <bmburstein@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since its definition in 2f4ba2a867 (packfile: prepare for the existence
of '*.rev' files, 2021-01-25), the only caller of
`close_pack_revindex()` was within packfile.c.
Thus there is no need for this to be exposed via packfile.h, and instead
can remain static within packfile.c's compilation unit. While we're
here, move the function's opening brace onto its own line.
Noticed-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The .NET version of Scalar has a `version` command. This was necessary
because it was versioned independently of Git.
Since Scalar is now tightly coupled with Git, it does not make sense for
them to show different versions. Therefore, it shows the same output as
`git version`. For backwards-compatibility with the .NET version,
`scalar version` prints to `stderr`, though (`git version` prints to
`stdout` instead).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Delete an enlistment by first unregistering the repository and then
deleting the enlistment directory (usually the directory containing the
worktree `src/` directory).
On Windows, if the current directory is inside the enlistment's
directory, change to the parent of the enlistment directory, to allow us
to delete the enlistment (directories used by processes e.g. as current
working directories cannot be deleted on Windows).
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After a Scalar upgrade, it can come in really handy if there is an easy
way to reconfigure all Scalar enlistments. This new option offers this
functionality.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This comes in handy during Scalar upgrades, or when config settings were
messed up by mistake.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Note: this subcommand is provided primarily for backwards-compatibility,
for existing Scalar uses. It is mostly just a shim for `git
maintenance`, mapping task names from the way Scalar called them to the
way Git calls them.
The reason why those names differ? The background maintenance was first
implemented in Scalar, and when it was contributed as a patch series
implementing the `git maintenance` command, reviewers suggested better
names, those suggestions were accepted before the patches were
integrated into core Git.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like `git clone`, the `scalar clone` command now also offers to
restrict the clone to a single branch.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This implements Scalar's opinionated `clone` command: it tries to use a
partial clone and sets up a sparse checkout by default. In contrast to
`git clone`, `scalar clone` sets up the worktree in the `src/`
subdirectory, to encourage a separation between the source files and the
build output (which helps Git tremendously because it avoids untracked
files that have to be specifically ignored when refreshing the index).
Also, it registers the repository for regular, scheduled maintenance,
and configures a flurry of configuration settings based on the
experience and experiments of the Microsoft Windows and the Microsoft
Office development teams.
Note: since the `scalar clone` command is by far the most commonly
called `scalar` subcommand, we document it at the top of the manual
page.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The produced list simply consists of those repositories registered under
the multi-valued `scalar.repo` config setting in the user's Git config.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a user deleted an enlistment manually, let's be generous and
_still_ unregister it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like `scalar register` starts the scheduled background maintenance,
`scalar unregister` stops it. Note that we use `git maintenance start`
in `scalar register`, but we do not use `git maintenance stop` in
`scalar unregister`: this would stop maintenance for _all_ repositories,
not just for the one we want to unregister.
The `unregister` command also removes the corresponding entry from the
`[scalar]` section in the global Git config.
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's start implementing the `register` command. With this commit,
recommended settings are configured upon `scalar register`, and Git's
background maintenance is started.
The recommended config settings may very well change in the future. For
example, once the built-in FSMonitor is available, we will want to
enable it upon `scalar register`. For that reason, we explicitly support
running `scalar register` in an already-registered enlistment.
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To test the Scalar command, create a test script in contrib/scalar/t
that is executed as `make -C contrib/scalar test`. Since Scalar has no
meaningful capabilities yet, the only test is rather simple. We will add
more tests in subsequent commits that introduce corresponding, new
functionality.
Note: This test script is intended to test `scalar` only lightly, even
after all of the functionality is implemented.
A more comprehensive functional (or: integration) test suite can be
found at https://github.com/microsoft/scalar; It is used in the workflow
https://github.com/microsoft/git/blob/HEAD/.github/workflows/scalar-functional-tests.yml
in Microsoft's Git fork. This test suite performs end-to-end tests with
a real remote repository, and is run as part of the regular CI and PR
builds in that fork.
Since those tests require some functionality supported only by
Microsoft's Git fork ("GVFS protocol"), there is no intention to port
that fuller test suite to `contrib/scalar/`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's build up the documentation for the Scalar command along with the
patches that implement its functionality.
Note: To discourage the feature-incomplete documentation from being
mistaken for the complete thing, we do not yet provide any way to build
HTML or manual pages from the text file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The idea of Scalar (https://github.com/microsoft/scalar), and before
that, of VFS for Git, has always been to prove that Git _can_ scale, and
to upstream whatever strategies have been demonstrated to help.
With this patch, we start the journey from that C# project to move what
is left to Git's own `contrib/` directory, reimplementing it in pure C,
with the intention to facilitate integrating the functionality into core
Git all while maintaining backwards-compatibility for existing Scalar
users (which will be much easier when both live in the same worktree).
It has always been the plan to contribute all of the proven strategies
back to core Git.
For example, while the virtual filesystem provided by VFS for Git helped
the team developing the Windows operating system to move onto Git, while
trying to upstream it we realized that it cannot be done: getting the
virtual filesystem to work (which we only managed to implement fully on
Windows, but not on, say, macOS or Linux), and the required server-side
support for the GVFS protocol, made this not quite feasible.
The Scalar project learned from that and tackled the problem with
different tactics: instead of pretending to Git that the working
directory is fully populated, it _specifically_ teaches Git about
partial clone (which is based on VFS for Git's cache server), about
sparse checkout (which VFS for Git tried to do transparently, in the
file system layer), and regularly runs maintenance tasks to keep the
repository in a healthy state.
With partial clone, sparse checkout and `git maintenance` having been
upstreamed, there is little left that `scalar.exe` does which `git.exe`
cannot do. One such thing is that `scalar clone <url>` will
automatically set up a partial, sparse clone, and configure
known-helpful settings from the start.
So let's bring this convenience into Git's tree.
The idea here is that you can (optionally) build Scalar via
make -C contrib/scalar/
This will build the `scalar` executable and put it into the
contrib/scalar/ subdirectory.
The slightly awkward addition of the `contrib/scalar/*` bits to the
top-level `Makefile` are actually really required: we want to link to
`libgit.a`, which means that we will need to use the very same `CFLAGS`
and `LDFLAGS` as the rest of Git.
An early development version of this patch tried to replicate all the
conditional code in `contrib/scalar/Makefile` (e.g. `NO_POLL`) just like
`contrib/svn-fe/Makefile` used to do before it was retired. It turned
out to be quite the whack-a-mole game: the SHA-1-related flags, the
flags enabling/disabling `compat/poll/`, `compat/regex/`,
`compat/win32mmap.c` & friends depending on the current platform... To
put it mildly: it was a major mess.
Instead, this patch makes minimal changes to the top-level `Makefile` so
that the bits in `contrib/scalar/` can be compiled and linked, and
adds a `contrib/scalar/Makefile` that uses the top-level `Makefile` in a
most minimal way to do the actual compiling.
Note: With this commit, we only establish the infrastructure, no
Scalar functionality is implemented yet; We will do that incrementally
over the next few commits.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Scalar command will be contributed incrementally, over a bunch of
patch series. Let's document what Scalar is about, and then describe the
patch series that are planned.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It has long been practice on this project for a command to emit its
primary output to stdout so that it can be captured to a file or sent
down a pipe, and to emit "chatty" messages (such as those reporting
progress) to stderr so that they don't interfere with the primary
output. However, this practice is not necessarily universal; another
common practice is to send only error messages to stderr, and all other
messages to stdout. Therefore, help newcomers out by documenting how
stdout and stderr are used on this project.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are certain C99 features that might be nice to use in our code
base, but we've hesitated to do so in order to avoid breaking
compatibility with older compilers. But we don't actually know if
people are even using pre-C99 compilers these days.
One way to figure that out is to introduce a very small use of a
feature, and see if anybody complains, and we've done so to probe
the portability for a few features like "trailing comma in enum
declaration", "designated initializer for struct", and "designated
initializer for array". A few years ago, we tried to use a handy
for (int i = 0; i < n; i++)
use(i);
to introduce a new variable valid only in the loop, but found that
some compilers we cared about didn't like it back then. Two years
is a long-enough time, so let's try it again.
If this patch can survive a few releases without complaint, then we
can feel more confident that variable declaration in for() loop is
supported by the compilers our user base use. And if we do get
complaints, then we'll have gained some data and we can easily
revert this patch.
Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On iteration, the reflog message is always terminated by a newline. Trim it to
avoid clobbering the console with is this extra newline.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have some tests that read from files in .git/logs/ hierarchy
when checking if correct reflog entries are created, but that is
too specific to the files backend. Other backends like reftable
may not store its reflog entries in such a "one line per entry"
format.
Update for-each-reflog-ent test helper to produce output that
is identical to lines in a reflog file files backend uses.
That way, (1) the current tests can be updated to use the test
helper to read the reflog entries instead of (parts of) reflog
files, and perform the same inspection for correctness, and (2)
when the ref backend is swapped to another backend, the updated
test can be used as-is to check the correctness.
Adapt t1400 to use the for-each-reflog-ent test helper.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are checking for a certain ordering, we should check that there are two
entries. Do this by mirroring the preceding test.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By convention, reflog messages always end in '\n', so
before we would print blank lines between entries.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, --reflog option would look for '\t' in the reflog message. As refs.c
already parses the reflog line, the '\t' was never found, and show-branch
--reflog would always say "(none)" as reflog message
Add test.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's possible to specify --simplify-by-decoration but not --decorate. In
this case we do respect the simplification, but we don't actually show
any decorations. However, it works by lazy-loading the decorations when
needed; this is discussed in more detail in 0cc7380d88 (log-tree: call
load_ref_decorations() in get_name_decoration(), 2019-09-08).
This works for basic cases, but will fail to respect any --decorate-refs
option (or its variants). Those are handled only when cmd_log_init()
loads the ref decorations up front, which is only when --decorate is
specified explicitly (or as of the previous commit, when the userformat
asks for %d or similar).
We can solve this by making sure to load the decorations if we're going
to simplify using them but they're not otherwise going to be displayed.
The new test shows a simple case that fails without this patch. Note
that we expect two commits in the output: the one we asked for by
--decorate-refs, and the initial commit. The latter is just a quirk of
how --simplify-by-decoration works. Arguably it may be a bug, but it's
unrelated to this patch (which is just about the loading of the
decorations; you get the same behavior before this patch with an
explicit --decorate).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to show ref decorations, we first have to load them. If you
run:
git log --decorate
then git-log will recognize the option and load them up front via
cmd_log_init(). Likewise if log.decorate is set.
If you don't say --decorate explicitly, but do mention "%d" or "%D" in
the output format, like so:
git log --format=%d
then this also works, because we lazy-load the ref decorations. This has
been true since 3b3d443feb (add '%d' pretty format specifier to show
decoration, 2008-09-04), though the lazy-load was later moved into
log-tree.c.
But there's one problem: that lazy-load just uses the defaults; it
doesn't take into account any --decorate-refs options (or its exclude
variant, or their config). So this does not work:
git log --decorate-refs=whatever --format=%d
It will decorate using all refs, not just the specified ones. This has
been true since --decorate-refs was added in 65516f586b (log: add option
to choose which refs to decorate, 2017-11-21). Adding further confusion
is that it _may_ work because of the auto-decoration feature. If that's
in use (and it often is, as it's the default), then if the output is
going to stdout, we do enable decorations early (and so load them up
front, respecting the extra options). But otherwise we do not. So:
git log --decorate-refs=whatever --format=%d >some-file
would typically behave differently than it does when the output goes to
the pager or terminal!
The solution is simple: we should recognize in cmd_log_init() that we're
going to show decorations, and make sure we load them there. We already
check userformat_find_requirements(), so we can couple this with our
existing code there.
There are two new tests. The first shows off the actual fix. The second
makes sure that our fix doesn't cause us to stomp on an existing
--decorate option (see the new comment in the code, as well).
Reported-by: Josh Rampersad <josh.rampersad@voiceflow.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refuse to force-move a branch over the currently checked out branch of
any working tree, not just the current one.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A bare repository won’t have a working tree at "..", but it may still
have separate working trees created with git worktree. We should protect
the current branch of such working trees from being updated or deleted,
according to receive.denyCurrentBranch.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
update_worktree() can only be called with a non-NULL worktree parameter,
because that’s the only case where we set do_update_worktree = 1.
worktree->path is always initialized to non-NULL.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refuse to fetch into the currently checked out branch of any working
tree, not just the current one.
Fixes this previously reported bug:
https://lore.kernel.org/git/cb957174-5e9a-5603-ea9e-ac9b58a2eaad@mathema.de/
As a side effect of using find_shared_symref, we’ll also refuse the
fetch when we’re on a detached HEAD because we’re rebasing or bisecting
on the branch in question. This seems like a sensible change.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Storing the worktrees list in a static variable meant that
find_shared_symref() had to rebuild the list on each call (which is
inefficient when the call site is in a loop), and also that each call
invalidated the pointer returned by the previous call (which is
confusing).
Instead, make it the caller’s responsibility to pass in the worktrees
list and manage its lifetime.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/CodingGuidelines says “do not end error messages with a
full stop” and “do not capitalize the first word”. Clean up existing
messages, some of which we will be touching in later steps in the
series, that deviate from these rules in this file, as a preparation for
the main part of the topic.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/CodingGuidelines says “do not end error messages with a
full stop” and “do not capitalize the first word”. Clean up existing
messages, some of which we will be touching in later steps in the
series, that deviate from these rules in this file, as a preparation for
the main part of the topic.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation/CodingGuidelines says “do not end error messages with a
full stop” and “do not capitalize the first word”. Clean up existing
messages, some of which we will be touching in later steps in the
series, that deviate from these rules in this file, as a preparation for
the main part of the topic.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
BAIL_OUT() is meant to abort the whole test run and print a message with
a standard prefix that can be parsed to stdout. Since for every test the
normal fd`s are redirected in test_eval_ this output would not be seen
when used within the context of a test or prereq like we do in
test_have_prereq(). To make this function work in these contexts we move
the setup of the fd aliases a few lines up before the first use of
BAIL_OUT() and then have this function always print to the alias.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The C99 standard was released in January 1999, now 22 years ago. It
provides a variety of useful features, including variadic arguments for
macros, declarations after statements, designated initializers, and a
wide variety of other useful features, many of which we already use.
We'd like to take advantage of these features, but we want to be
cautious. As far as we know, all major compilers now support C99 or a
later C standard, such as C11 or C17. POSIX has required C99 support as
a requirement for the 2001 revision, so we can safely assume any POSIX
system which we are interested in supporting has C99.
Even MSVC, long a holdout against modern C, now supports both C11 and
C17 with an appropriate update. Moreover, even if people are using an
older version of MSVC on these systems, they will generally need some
implementation of the standard Unix utilities for the testsuite, and GNU
coreutils, the most common option, has required C99 since 2009.
Therefore, we can safely assume that a suitable version of GCC or clang
is available to users even if their version of MSVC is not sufficiently
capable.
Let's add a test balloon to git-compat-util.h to see if anyone is using
an older compiler. We'll add a comment telling people how to enable
this functionality on GCC and Clang, even though modern versions of both
will automatically do the right thing, and ask people still experiencing
a problem to report that to us on the list.
Note that C89 compilers don't provide the __STDC_VERSION__ macro, so we
use a well-known hack of using "- 0". On compilers with this macro, it
doesn't change the value, and on C89 compilers, the macro will be
replaced with nothing, and our value will be 0.
For sparse, we explicitly request the gnu99 style because we've
traditionally taken advantage of some GCC- and clang-specific extensions
when available and we'd like to retain the ability to do that. sparse
also defaults to C89 without it, so things will fail for us if we don't.
Update the cmake configuration to require C11 for MSVC. We do this
because this will make MSVC to use C11, since it does not explicitly
support C99. We do this with a compiler options because setting the
C_STANDARD option does not work in our CI on MSVC and at the moment, we
don't want to require C11 for Unix compilers.
In the Makefile, don't set any compiler flags for the compiler itself,
since on some systems, such as FreeBSD, we actually need C11, and asking
for C99 causes things to fail to compile. The error message should make
it obvious what's going wrong and allow a user to set the appropriate
option when building in the event they're using a Unix compiler that
doesn't support it by default.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Visual Studio reports C4334 "was 64-bit shift intended" warning because
of size miss-match.
Promote unity to the matching type to fit with the assignment.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Visual Studio reports C4334 "was 64-bit shift intended" warning
because of size miss-match.
Promote unity to the matching type to fit with its subsequent operation.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Visual Studio reports C4334 "was 64-bit shift intended" warning
because of size mismatch.
Promote unity to the matching type to fit with the `&` operator.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"zdiff3" is identical to ordinary diff3 except that it allows compaction
of common lines on the two sides of history at the beginning or end of
the conflict hunk. For example, the following diff3 conflict:
1
2
3
4
<<<<<<
A
B
C
D
E
||||||
5
6
======
A
X
C
Y
E
>>>>>>
7
8
9
has common lines 'A', 'C', and 'E' on the two sides. With zdiff3, one
would instead get the following conflict:
1
2
3
4
A
<<<<<<
B
C
D
||||||
5
6
======
X
C
Y
>>>>>>
E
7
8
9
Note that the common lines, 'A', and 'E' were moved outside the
conflict. Unlike with the two-way conflicts from the 'merge'
conflictStyle, the zdiff3 conflict is NOT split into multiple conflict
regions to allow the common 'C' lines to be shown outside a conflict,
because zdiff3 shows the base version too and the base version cannot be
reasonably split.
Note also that the removing of lines common to the two sides might make
the remaining text inside the conflict region match the base text inside
the conflict region (for example, if the diff3 conflict had '5 6 E' on
the right side of the conflict, then the common line 'E' would be moved
outside and both the base and right side's remaining conflict text would
be the lines '5' and '6'). This has the potential to surprise users and
make them think there should not have been a conflict, but there
definitely was a conflict and it should remain.
Based-on-patch-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Co-authored-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 9a5315edfd (Merge branch 'js/patch-mode-in-others-in-c',
2020-02-05), Git acquired a built-in implementation of `git add`'s
interactive mode that could be turned on via the config option
`add.interactive.useBuiltin`.
The first official Git version to support this knob was v2.26.0.
In 2df2d81ddd (add -i: use the built-in version when
feature.experimental is set, 2020-09-08), this built-in implementation
was also enabled via `feature.experimental`. The first version with this
change was v2.29.0.
More than a year (and very few bug reports) later, it is time to declare
the built-in implementation mature and to turn it on by default.
We specifically leave the `add.interactive.useBuiltin` configuration in
place, to give users an "escape hatch" in the unexpected case should
they encounter a previously undetected bug in that implementation.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The scripted version of the interactive mode of `git add` still requires
Perl, but the built-in version does not. Let's only require the PERL
prereq if testing the scripted version.
This addresses a long-standing NEEDSWORK added in 35166b1fb5 (t2016:
add a NEEDSWORK about the PERL prerequisite, 2020-10-07).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--exec <cmd>` is documented as
Append "exec <cmd>" after each line creating a commit in the final
history.
...
If --autosquash is used, "exec" lines will not be appended for the
intermediate commits, and will only appear at the end of each
squash/fixup series.
Unfortunately, it would also add exec commands after non-pick
operations, such as 'no-op', which could be seen for example with
git rebase -i --exec true HEAD
todo_list_add_exec_commands() intent was to insert exec commands after
each logical pick, while trying to consider a chains of fixup and squash
commits to be part of the pick before it. So it would keep an 'insert'
boolean tracking if it had seen a pick or merge, but not write the exec
command until it saw the next non-fixup/squash command. Since that
would make it miss the final exec command, it had some code that would
check whether it still needed to insert one at the end, but instead of a
simple
if (insert)
it had a
if (insert || <condition that is always true>)
That's buggy; as per the docs, we should only add exec commands for
lines that create commits, i.e. only if insert is true. Fix the
conditional.
There was one testcase in the testsuite that we tweak for this change;
it was introduced in 54fd3243da ("rebase -i: reread the todo list if
`exec` touched it", 2017-04-26), and was merely testing that after an
exec had fired that the todo list would be re-read. The test at the
time would have worked given any revision at all, though it would only
work with 'HEAD' as a side-effect of this bug. Since we're fixing this
bug, choose something other than 'HEAD' for that test.
Finally, add a testcase that verifies when we have no commits to pick,
that we get no exec lines in the generated todo list.
Reported-by: Nikita Bobko <nikitabobko@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The clean/smudge conversion code path has been prepared to better
work on platforms where ulong is narrower than size_t.
* mc/clean-smudge-with-llp64:
clean/smudge: allow clean filters to process extremely large files
odb: guard against data loss checking out a huge file
git-compat-util: introduce more size_t helpers
odb: teach read_blob_entry to use size_t
t1051: introduce a smudge filter test for extremely large files
test-lib: add prerequisite for 64-bit platforms
test-tool genzeros: generate large amounts of data more efficiently
test-genzeros: allow more than 2G zeros in Windows
Make a few helper functions unused and then lose them.
* ab/sh-retire-helper-functions:
git-sh-setup: remove "sane_grep", it's not needed anymore
git-sh-setup: remove unused sane_egrep() function
git-instaweb: unconditionally assume that gitweb is mod_perl capable
Makefile: remove $(NO_CURL) from $(SCRIPT_DEFINES)
Makefile: remove $(GIT_VERSION) from $(SCRIPT_DEFINES)
Makefile: move git-SCRIPT-DEFINES adjacent to $(SCRIPT_DEFINES)
The command line complation for "git send-email" options have been
tweaked to make it easier to keep it in sync with the command itself.
* tp/send-email-completion:
send-email docs: add format-patch options
send-email: programmatically generate bash completions
The compatibility implementation for unsetenv(3) were written to
mimic ancient, non-POSIX, variant seen in an old glibc; it has been
changed to return an integer to match the more modern era.
* jc/unsetenv-returns-an-int:
unsetenv(3) returns int, not void
Things like "git -c branch.sort=bogus branch new HEAD", i.e. the
operation modes of the "git branch" command that do not need the
sort key information, no longer errors out by seeing a bogus sort
key.
* jc/fix-ref-sorting-parse:
for-each-ref: delay parsing of --sort=<atom> options
"git stash" learned the "--staged" option to stash away what has
been added to the index (and nothing else).
* so/stash-staged:
stash: get rid of unused argument in stash_staged()
stash: implement '--staged' option for 'push' and 'save'
Teach and encourage first-time contributors to this project to
state the base commit when they submit their topic.
* jc/tutorial-format-patch-base:
MyFirstContribution: teach to use "format-patch --base=auto"
The "remainder" of hn/refs-errno-cleanup topic.
* ab/refs-errno-cleanup: (21 commits)
refs API: post-migration API renaming [2/2]
refs API: post-migration API renaming [1/2]
refs API: don't expose "errno" in run_transaction_hook()
refs API: make expand_ref() & repo_dwim_log() not set errno
refs API: make resolve_ref_unsafe() not set errno
refs API: make refs_ref_exists() not set errno
refs API: make refs_resolve_refdup() not set errno
refs tests: ignore ignore errno in test-ref-store helper
refs API: ignore errno in worktree.c's find_shared_symref()
refs API: ignore errno in worktree.c's add_head_info()
refs API: make files_copy_or_rename_ref() et al not set errno
refs API: make loose_fill_ref_dir() not set errno
refs API: make resolve_gitlink_ref() not set errno
refs API: remove refs_read_ref_full() wrapper
refs/files: remove "name exist?" check in lock_ref_oid_basic()
reflog tests: add --updateref tests
refs API: make refs_rename_ref_available() static
refs API: make parse_loose_ref_contents() not set errno
refs API: make refs_read_raw_ref() not set errno
refs API: add a version of refs_resolve_ref_unsafe() with "errno"
...
Allow "git status --porcelain=v2" to show the number of stash
entries with --show-stash like the normal output does.
* ow/stash-count-in-status-porcelain-output:
status: print stash info with --porcelain=v2 --show-stash
status: count stash entries in separate function
Treat "_" as any other URL-valid characters in an URL when matching
the per-URL configuration variable names.
* jk/loosen-urlmatch:
urlmatch: add underscore to URL_HOST_CHARS
The files backend uses file system locking on individual refs, which means a
directory/file conflict can prevent locks being taken. For example, in a repo
with just the ref "foo", an update
(DELETE "foo") + (ADD "foo/bar")
cannot be executed in the files backend, as one cannot take a lock on foo/bar.
The current reftable proof-of-concept integration supports these tranactions, as
the result is a repo with just "foo/bar", which has no directory/file conflict.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* vd/sparse-reset:
unpack-trees: improve performance of next_cache_entry
reset: make --mixed sparse-aware
reset: make sparse-aware (except --mixed)
reset: integrate with sparse index
reset: expand test coverage for sparse checkouts
sparse-index: update command for expand/collapse test
reset: preserve skip-worktree bit in mixed reset
reset: rename is_missing to !is_in_reset_tree
To find the first non-unpacked cache entry, `next_cache_entry` iterates
through index, starting at `cache_bottom`. The performance of this in full
indexes is helped by `cache_bottom` advancing with each invocation of
`mark_ce_used` (called by `unpack_index_entry`). However, the presence of
sparse directories can prevent the `cache_bottom` from advancing in a sparse
index case, effectively forcing `next_cache_entry` to search from the
beginning of the index each time it is called.
The `cache_bottom` must be preserved for the sparse index (see 17a1bb570b
(unpack-trees: preserve cache_bottom, 2021-07-14)). Therefore, to retain the
benefit `cache_bottom` provides in non-sparse index cases, a separate `hint`
position indicates the first position `next_cache_entry` should search,
updated each execution with a new position.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the `ensure_full_index` guard on `read_from_tree` and update `git
reset --mixed` to ensure it can use sparse directory index entries wherever
possible. Sparse directory entries are reset using `diff_tree_oid`, which
requires `change` and `add_remove` functions to process the internal
contents of the sparse directory. The `recursive` diff option handles cases
in which `reset --mixed` must diff/merge files that are nested multiple
levels deep in a sparse directory.
The use of pathspecs with `git reset --mixed` introduces scenarios in which
internal contents of sparse directories may be matched by the pathspec. In
order to reset *all* files in the repo that may match the pathspec, the
following conditions on the pathspec require index expansion before
performing the reset:
* "magic" pathspecs
* wildcard pathspecs that do not match only in-cone files or entire sparse
directories
* literal pathspecs matching something outside the sparse checkout
definition
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove `ensure_full_index` guard on `prime_cache_tree` and update
`prime_cache_tree_rec` to correctly reconstruct sparse directory entries in
the cache tree. While processing a tree's entries, `prime_cache_tree_rec`
must determine whether a directory entry is sparse or not by searching for
it in the index (*without* expanding the index). If a matching sparse
directory index entry is found, no subtrees are added to the cache tree
entry and the entry count is set to 1 (representing the sparse directory
itself). Otherwise, the tree is assumed to not be sparse and its subtrees
are recursively added to the cache tree.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Disable `command_requires_full_index` repo setting and add
`ensure_full_index` guards around code paths that cannot yet use sparse
directory index entries. `reset --soft` does not modify the index, so no
compatibility changes are needed for it to function without expanding the
index. For all other reset modes (`--mixed`, `--hard`, `--keep`, `--merge`),
the full index is expanded to prevent cache tree corruption and invalid
variable accesses.
Additionally, the `read_cache()` check verifying an uncorrupted index is
moved after argument parsing and preparing the repo settings. The index is
not used by the preceding argument handling, but `read_cache()` must be run
*after* enabling sparse index for the command (so that the index is not
expanded unnecessarily) and *before* using the index for reset (so that it
is verified as uncorrupted).
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new tests for `--merge` and `--keep` modes, as well as mixed reset with
pathspecs. New performance test cases exercise various execution paths for
`reset`.
Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous change modified GIT_TRACE2_EVENT_NESTING by default within
test-lib.sh. These custom assignments throughout the test suite are no
longer necessary.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_TRACE2_EVENT feed has a limited nesting depth to avoid
overloading the feed when recursing into deep paths while adding more
nested regions.
Some tests use the GIT_TRACE2_EVENT feed to look for internal events,
ensuring that intended behavior is happening.
One such example is in t4216-log-bloom.sh which looks for a statistic
given as a trace2_data_intmax() call. This test started failing under
'-x' with 2ca245f8be (csum-file.h: increase hashfile buffer size,
2021-05-18) because the change in stderr triggered the progress API to
create an extra trace2 region, ejecting the statistic.
This change increases the value of GIT_TRACE2_EVENT_NESTING across the
entire test suite to avoid errors like this. Future changes will remove
custom assignments of GIT_TRACE2_EVENT_NESTING from some test scripts
that were aware of this limitation.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
b5cc003253 (add -i: ignore terminal escape sequences, 2011-05-17)
add an additional check to the original code to better handle keys
for escape sequences, but failed to account for the possibility
the first ReadKey call returned undef (ex: stdin closes) and that
was being handled fine by the original code in ca6ac7f135 (add -p:
prompt for single characters, 2009-02-05)
Add a test for undefined and encapsulate the loop and the original
print that relied on it within it.
After this, the following command (in a suitable repository state)
wouldn't print any error:
$ git -c interactive.singleKey add -p </dev/null
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs_for_each_reflog_ent() and refs_for_each_reflog_ent_reverse()
functions take a callback function that gets called with the details
of each reflog entry. Its parameters were not documented beyond
their names. Elaborate a bit on each of them.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
mingw-w64's pthread_unistd.h had a bug that mistakenly (because there is
no support for the *lockfile() functions required[1]) defined
_POSIX_THREAD_SAFE_FUNCTIONS and that was being worked around since
3ecd153a3b (compat/mingw: support MSys2-based MinGW build, 2016-01-14).
The bug was fixed in winphtreads, but as a side effect, leaves the
reentrant functions from time.h no longer visible and therefore breaks
the build.
Since the intention all along was to avoid using the fallback functions,
formalize the use of POSIX by setting the corresponding feature flag and
compile out the implementation for the fallback functions.
[1] https://unix.org/whitepapers/reentrant.html
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the "env" member from "struct child_process" in favor of always
using the "env_array". As with the preceding removal of "argv" in
favor of "args" this gets rid of current and future oddities around
memory management at the API boundary (see the amended API docs).
For some of the conversions we can replace patterns like:
child.env = env->v;
With:
strvec_pushv(&child.env_array, env->v);
But for others we need to guard the strvec_pushv() with a NULL check,
since we're not passing in the "v" member of a "struct strvec",
e.g. in the case of tmp_objdir_env()'s return value.
Ideally we'd rename the "env_array" member to simply "env" as a
follow-up, since it and "args" are now inconsistent in not having an
"_array" suffix, and seemingly without any good reason, unless we look
at the history of how they came to be.
But as we've currently got 122 in-tree hits for a "git grep env_array"
let's leave that for now (and possibly forever). Doing that rename
would be too disruptive.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend code added in 03831ef7b5 (difftool: implement the functionality
in the builtin, 2017-01-19) to use the "env_array" in the
run_command.[ch] API. Now we no longer need to manage our own
"index_env" buffer.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the "argv" member from the run-command API, ever since "args"
was added in c460c0ecdc (run-command: store an optional argv_array,
2014-05-15) being able to provide either "argv" or "args" has led to
some confusion and bugs.
If we hadn't gone in that direction and only had an "argv" our
problems wouldn't have been solved either, as noted in [1] (and in the
documentation amended here) it comes with inherent memory management
issues: The caller would have to hang on to the "argv" until the
run-command API was finished. If the "argv" was an argument to main()
this wasn't an issue, but if it it was manually constructed using the
API might be painful.
We also have a recent report[2] of a user of the API segfaulting,
which is a direct result of it being complex to use. This commit
addresses the root cause of that bug.
This change is larger than I'd like, but there's no easy way to avoid
it that wouldn't involve even more verbose intermediate steps. We use
the "argv" as the source of truth over the "args", so we need to
change all parts of run-command.[ch] itself, as well as the trace2
logging at the same time.
The resulting Windows-specific code in start_command() is a bit nasty,
as we're now assigning to a strvec's "v" member, instead of to our own
"argv". There was a suggestion of some alternate approaches in reply
to an earlier version of this commit[3], but let's leave larger a
larger and needless refactoring of this code for now.
1. http://lore.kernel.org/git/YT6BnnXeAWn8BycF@coredump.intra.peff.net
2. https://lore.kernel.org/git/20211120194048.12125-1-ematsumiya@suse.de/
3. https://lore.kernel.org/git/patch-5.5-ea1011f7473-20211122T153605Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a pattern of hardcoding an "argv" array size, populating it and
assigning to the "argv" member of "struct child_process" to instead
use "strvec_push()" to add data to the "args" member.
As noted in the preceding commit this moves us further towards being
able to remove the "argv" member in a subsequent commit
These callers could have used strvec_pushl(), but moving to
strvec_push() makes the diff easier to read, and keeps the arguments
aligned as before.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a pattern of hardcoding an "argv" array size, populating it and
assigning to the "argv" member of "struct child_process" to instead
use "strvec_pushl()" to add data to the "args" member.
This implements the same behavior as before in fewer lines of code,
and moves us further towards being able to remove the "argv" member in
a subsequent commit.
Since we've entirely removed the "argv" variable(s) we can be sure
that no potential logic errors of the type discussed in a preceding
commit are being introduced here, i.e. ones where the local "argv" was
being modified after the assignment to "struct child_process"'s
"argv".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As in the preceding commit change this API user to use strvec_pushv()
instead of assigning to the "argv" member directly. This leaves us
without test coverage of how the "argv" assignment in this API works,
but we'll be removing it in a subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Migrate those run-command API users that assign directly to the "argv"
member to use a strvec_pushv() of "args" instead.
In these cases it did not make sense to further refactor these
callers, e.g. daemon.c could be made to construct the arguments closer
to handle(), but that would require moving the construction from its
cmd_main() and pass "argv" through two intermediate functions.
It would be possible for a change like this to introduce a regression
if we were doing:
cp.argv = argv;
argv[1] = "foo";
And changed the code, as is being done here, to:
strvec_pushv(&cp.args, argv);
argv[1] = "foo";
But as viewing this change with the "-W" flag reveals none of these
functions modify variable that's being pushed afterwards in a way that
would introduce such a logic error.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This pattern added [1] in seems to have been intentional, but since
[2] and [3] we've wanted do initialization of what's now the "struct
strvec" "args" and "env_array" members. Let's not trample on that
initialization here.
1. 1bc01efed1 (upload-archive: use start_command instead of fork,
2011-11-19)
2. c460c0ecdc (run-command: store an optional argv_array, 2014-05-15)
3. 9a583dc39e (run-command: add env_array, an optional argv_array for
env, 2014-10-19)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
add_worktree() reuses a `child_process` for three run_command()
invocations, but to do so, it has overly-intimate knowledge of
run-command.c internals. In particular, it knows that it must reset
child_process::argv to NULL for each subsequent invocation[*] in order
for start_command() to latch the newly-populated child_process::args for
each invocation, even though this behavior is not a part of the
documented API. Beyond having overly-intimate knowledge of run-command.c
internals, the reuse of one `child_process` for three run_command()
invocations smells like an unnecessary micro-optimization. Therefore,
stop sharing one `child_process` and instead use a new one for each
run_command() call.
[*] If child_process::argv is not reset to NULL, then subsequent
run_command() invocations will instead incorrectly access a dangling
pointer to freed memory which had been allocated by child_process::args
on the previous run. This is due to the following code in
start_command():
if (!cmd->argv)
cmd->argv = cmd->args.v;
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unless `command_requires_full_index` forces index expansion, ensure in-core
index sparsity matches config settings on read by calling
`ensure_correct_sparsity`. This makes the behavior of the in-core index more
consistent between different methods of updating sparsity: manually changing
the `index.sparse` config setting vs. executing
`git sparse-checkout --[no-]sparse-index init`
Although index sparsity is normally updated with `git sparse-checkout init`,
ensuring correct sparsity after a manual `index.sparse` change has some
practical benefits:
1. It allows for command-by-command sparsity toggling with
`-c index.sparse=<true|false>`, e.g. when troubleshooting issues with the
sparse index.
2. It prevents users from experiencing abnormal slowness after setting
`index.sparse` to `true` due to use of a full index in all commands until
the on-disk index is updated.
Helped-by: Junio C Hamano <gitster@pobox.com>
Co-authored-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `ensure_correct_sparsity` function is intended to provide a means of
aligning the in-core index with the sparsity required by the repository
settings and other properties of the index. The function first checks
whether a sparse index is allowed (per repository & sparse checkout pattern
settings). If the sparse index may be used, the index is converted to
sparse; otherwise, it is explicitly expanded with `ensure_full_index`.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When converting a full index to sparse, clear and recreate the cache tree
only if the cache tree is not fully valid. The convert_to_sparse operation
should exit silently if a cache tree update cannot be successfully completed
(e.g., due to a conflicted entry state). However, because this failure
scenario only occurs when at least a portion of the cache tree is invalid,
we can save ourselves the cost of clearing and recreating the cache tree by
skipping the check when the cache tree is fully valid.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move `prepare_repo_settings` after the git directory has been set up in
`test-read-cache.c`. The git directory settings must be initialized to
properly assign repo settings using the worktree-level git config.
Signed-off-by: Victoria Dye <vdye@github.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When prepare_cmd() fails for, e.g., pager process setup,
child_process_clear() frees the memory in pager_process.args, but .argv
was pointed to pager_process.args.v earlier in start_command(), so it's
now a dangling pointer.
setup_pager() is then called a second time, from cmd_log_init_finish()
in this case, and any further operations using its .argv, e.g. strvec_*,
will use the dangling pointer and eventually crash. According to trivial
tests, setup_pager() is not called twice if the first call is
successful.
This patch makes sure that pager_process is properly initialized on
setup_pager(). Drop CHILD_PROCESS_INIT from its declaration since it's
no longer really necessary.
Add a test to catch possible regressions.
Reproducer:
$ git config pager.show INVALID_PAGER
$ git show $VALID_COMMIT
error: cannot run INVALID_PAGER: No such file or directory
[1] 3619 segmentation fault (core dumped) git show $VALID_COMMIT
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "linux-clang" and "linux-gcc" jobs both run "make test" twice, but
with different environment variables. Running these in sequence seems
to have been done to work around some constraint on Travis, see
ae59a4e44f (travis: run tests with GIT_TEST_SPLIT_INDEX, 2018-01-07).
By having these run in parallel we'll get jobs that finish much sooner
than they otherwise would have.
We can also simplify the control flow in "ci/run-build-and-tests.sh"
as a result, since we won't run "make test" twice we don't need to run
"make" twice at all, let's default to "make all test" after setting
the variables, and then override it to just "all" for the compile-only
tests.
Add a comment to clarify that new "test" targets should adjust
$MAKE_TARGETS rather than being added after the "case/esac". This
should avoid future confusion where e.g. the compilation-only
"pedantic" target will unexpectedly start running tests. See [1] and
[2].
1. https://lore.kernel.org/git/211122.86ee78yxts.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/211123.86ilwjujmd.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the setup hooks for the CI to use "$runs_on_pool" for the
"$regular" job. Now we won't need as much boilerplate when adding new
jobs to the "regular" matrix, see 956d2e4639 (tests: add a test mode
for SANITIZE=leak, run it in CI, 2021-09-23) for the last such commit.
I.e. now instead of needing to enumerate each jobname when we select
packages we can install things depending on the pool we're running
in.
That we didn't do this dates back to the now gone dependency on Travis
CI, but even if we add a new CI target in the future this'll be easier
to port over, since we can probably treat "ubuntu-latest" as a
stand-in for some recent Linux that can run "apt" commands.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a follow-up to the preceding commit's shortening of CI job names,
rename the only job that starts with an upper-case letter to be
consistent with the rest. It was added in 88dedd5e72 (Travis: also
test on 32-bit Linux, 2017-03-05).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the names used for the GitHub CI workflows to be short enough
to (mostly) fit in the pop-up tool-tips that GitHub shows in the
commit view. I.e. when mouse-clicking on the passing or failing
check-mark next to the commit subject.
These names are seemingly truncated to 17-20 characters followed by
three dots ("..."). Since a "CI/PR / " prefix is added to them the job
names looked like this before (windows-test and vs-test jobs omitted):
CI/PR / ci-config (p...
CI/PR / windows-buil...
CI/PR / vs-build (pu...
CI/PR / regular (lin...
CI/PR / regular (lin...
CI/PR / regular (os...
CI/PR / regular (os...
CI/PR / regular (lin...
CI/PR / regular (lin...
CI/PR / dockerized (...
CI/PR / dockerized (...
CI/PR / dockerized (...
CI/PR / static-anal...
CI/PR / sparse (pu...
CI/PR / documenta...
By omitting the "/PR" from the top-level name, and pushing the
$jobname to the front we'll now instead get:
CI / config (push)
CI / win build (push...
CI / win+VS build (...
CI / linux-clang (ub...
CI / linux-gcc (ubun...
CI / osx-clang (osx)...
CI / osx-gcc (osx) (...
CI / linux-gcc-defau...
CI / linux-leaks (ub...
CI / linux-musl (alp...
CI / Linux32 (daald/...
CI / pedantic (fedor...
CI / static-analysis...
CI / sparse (push)...
CI / documentation
We then have no truncation in the expanded view. See [1] for how it
looked before, [2] for a currently visible CI run using this commit,
and [3] for the GitHub workflow syntax involved being changed here.
Let's also use the existing "pool" field as before. It's occasionally
useful to know we're running on say ubuntu v.s. fedora. The "-latest"
suffix is useful to some[4], and since it's now at the end it doesn't
hurt readability in the short view compared to saying "ubuntu" or
"macos".
1. https://github.com/git/git/tree/master/
2. https://github.com/avar/git/tree/avar/ci-rm-travis-cleanup-ci-names-3
3. https://docs.github.com/en/actions/learn-github-actions/workflow-syntax-for-github-actions
3. https://lore.kernel.org/git/d9b07ca5-b58d-a535-d25b-85d7f12e6295@github.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove support for running the CI in travis. The last builds in it are
from 5 months ago[1] (as of 2021-11-19), and our documentation has
referred to GitHub CI instead since f003a91f5c (SubmittingPatches:
replace discussion of Travis with GitHub Actions, 2021-07-22).
We'll now run the "t9810 t9816" and tests on OSX. We didn't before, as
we'd carried the Travis exclusion of them forward from
522354d70f (Add Travis CI support, 2015-11-27). Let's hope whatever
issue there was with them was either Travis specific, or fixed since
then (I'm not sure).
The "apt-add-repository" invocation (which we were doing in GitHub CI)
isn't needed, it was another Travis-only case that was carried forward
into more general code. See 0f0c51181d (travis-ci: install packages
in 'ci/install-dependencies.sh', 2018-11-01).
Remove the "linux-gcc-4.8" job added in fb9d7431cf (travis-ci: build
with GCC 4.8 as well, 2019-07-18), it only ran in Travis CI.
1. https://travis-ci.org/github/git/git/builds
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests in t7006 check for a SIGPIPE result by recording $? and
comparing it with test_match_signal. Before the previous commit, the
command was on the left-hand side of a pipe, and so we had to do some
subshell trickery to extract it.
But now that this is no longer the case, we can do things much more
simply: just run the command directly, using braces to avoid wrecking
the &&-chain, and then record $?. We could almost use test_expect_code
here, but it doesn't know about test_match_signal.
Likewise, for tests which expect success (i.e., not SIGPIPE), we can
just put them in the &&-chain as usual. That even lets us get rid of the
!MINGW check, since the expectation is the same on both sides.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Comit c24b7f6736 (pager: test for exit code with and without SIGPIPE,
2021-02-02) introduced some tests that don't reliably generate SIGPIPE
where we expect it (i.e., when our pager doesn't read all of the output
from git-log).
There are two problems that somewhat cancel each other out.
First is that the output of git-log isn't very large (only around 800
bytes). So even if the pager doesn't read all of our output, it's racy
whether or not we'll actually get a SIGPIPE (we won't if we write all of
the output into the pipe buffer before the pager exits).
But we wrap git-log with test_terminal, which is supposed to propagate
the exit status of git-log. However, it doesn't always do so;
test_terminal will copy to stdout any lines that it got from our fake
pager, and it pipes to an empty command. So most of the time we are
seeing a SIGPIPE from test_terminal itself (though this is likewise
racy).
Let's try to make this more robust in two ways:
1. We'll put a commit with a huge message at the tip of history. Since
this is over a megabyte, it should fill the OS pipe buffer
completely, causing git-log to keep trying to write even after the
pager has exited.
2. We'll redirect the output of test_terminal to /dev/null. That means
it can never get SIGPIPE itself, and will always be giving us the
exit code from git-log.
These two changes reveal that one of the tests was looking for the wrong
behavior. If we try to start a pager that does not exist (according to
execve()), then the error propagates from start_command() back to the
pager code as an error, and we avoid redirecting git-log's stdout to the
broken pager entirely. Instead, it goes straight to the original stdout
(test_terminal's pty in this case), and we do not see a SIGPIPE at all.
So the test "git attempts to page to nonexisting pager command, gets
SIGPIPE" is checking the wrong outcome; it should be looking for a
successful exit (and was only confused by test_terminal's SIGPIPE).
There's a related test, "git discards nonexisting pager without
SIGPIPE", which sets the pager to a shell command which will read all
input and _then_ run a non-existing command. But that doesn't trigger
the same execve() behavior. We really do run the shell there and
redirect git-log's stdout to it. And the fact that the shell then exits
127 is not interesting. It is not different at that point than the
earlier test to check for "exit 1". So we can drop that test entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 507d7804c0 (pager: don't use unsafe functions in signal handlers,
2015-09-04), we have a separate code path in wait_or_whine() for the
case that we're in a signal handler. But that code path misses some of
the cases handled by the main logic.
This was improved in be8fc53e36 (pager: properly log pager exit code
when signalled, 2021-02-02), but that covered only case: actually
returning the correct error code. But there are some other cases:
- if waitpid() returns failure, we wouldn't notice and would look at
uninitialized garbage in the status variable; it's not clear if it's
possible to trigger this or not
- if the process exited by signal, then we would still report "-1"
rather than the correct signal code
This latter case even had a test added in be8fc53e36, but it doesn't
work reliably. It sets the pager command to:
>pager-used; test-tool sigchain
The latter command will die by signal, but because there are multiple
commands, there will be a shell in between. And it's the shell whose
waitpid() call will see the signal death, and it will then exit with
code 143, which is what Git will see.
To make matters even more confusing, some shells (such as bash) will
realize that there's nothing for the shell to do after test-tool
finishes, and will turn it into an exec. So the test was only checking
what it thought when /bin/sh points to a shell like bash (we're relying
on the shell used internally by Git to spawn sub-commands here, so even
running the test under bash would not be enough).
This patch adjusts the tests to explicitly call "exec" in the pager
command, which produces a consistent outcome regardless of shell. Note
that without the code change in this patch it _should_ fail reliably,
but doesn't. That test, like its siblings, tries to trigger SIGPIPE in
the git-log process writing to the pager, but only do so racily. That
will be fixed in a follow-on patch.
For the code change here, we have two options:
- we can teach the in_signal code to handle WIFSIGNALED()
- we can stop returning early when in_signal is set, and instead
annotate individual calls that we need to skip in this case
The former is a simpler patch, but means we're essentially duplicating
all of the logic. So instead I went with the latter. The result is a
bigger patch, and we do run the risk of new code being added but
forgetting to handle in_signal. But in the long run it seems more
maintainable.
I've skipped any non-trivial calls for the in_signal case, like calling
error(). We'll also skip the call to clear_child_for_cleanup(), as we
were before. This is arguably the wrong thing to do, since we wouldn't
want to try to clean it up again. But:
- we can't call it as-is, because it calls free(), which we must avoid
in a signal handler (we'd have to pass in_signal so it can skip the
free() call)
- we'll only go through the list of children to clean once, since our
cleanup_children_on_signal() handler pops itself after running (and
then re-raises, so eventually we'd just exit). So this cleanup only
matters if a process is on the cleanup list _and_ it has a separate
handler to clean itself up. Which is questionable in the first place
(and AFAIK we do not do).
- double-cleanup isn't actually that bad anyway. waitpid() will just
return an error, which we won't even report because of in_signal.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack index treats promisor packfiles (that is, packfiles that
have an accompanying .promisor file) the same as other packfiles. Remove
a section in the documentation that seems to indicate otherwise.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is only one caller, builtin/checkout.c, and it hardcodes
force_create=1.
This argument was introduced in abd0cd3a30 (refs: new public ref function:
safe_create_reflog, 2015-07-21), which promised to immediately use it in a
follow-on commit, but that never happened.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git pull" with any strategy when the other side is behind us
should succeed as it is a no-op, but doesn't.
* ev/pull-already-up-to-date-is-noop:
pull: should be noop when already-up-to-date
"git grep" looking in a blob that has non-UTF8 payload was
completely broken when linked with certain versions of PCREv2
library in the latest release.
* hm/paint-hits-in-log-grep:
Revert "grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data"
In certain environments or for specific test scenarios we might expect a
specific prerequisite check to succeed. Therefore we would like to abort
running our tests if this is not the case.
To remedy this we add the environment variable GIT_TEST_REQUIRE_PREREQ
which can be set to a space separated list of prereqs. If one of these
prereq tests fail then the whole test run will abort.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running the full test suite many tests can be skipped because of
missing prerequisites. It not easy right now to get an overview of which
ones are missing.
When switching to a new machine or environment some libraries and tools
might be missing or maybe a dependency broke completely. In this case
the tests would indicate nothing since all dependant tests are simply
skipped. This could hide broken behaviour or missing features in the
build. Therefore this patch summarizes the missing prereqs at the end of
the test run making it easier to spot such cases.
- Add failed prereqs to the test results.
- Aggregate and then show them with the totals.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier a066a90d (ci(check-whitespace): restrict to the intended
commits, 2021-07-14) changed the check-whitespace task to stop using a
shallow clone, and cc003621 (ci(check-whitespace): stop requiring a
read/write token, 2021-07-14) changed the way how the errors the task
discovered is signaled back to the user.
They however forgot to update the comment that outlines what is done in
the task. Correct them.
Signed-off-by: Hans Krentel (hakre) <hanskrentel@yahoo.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching, we send the incoming pack to index-pack (or
unpack-objects) via the sideband demuxer. If index-pack hits an error
(e.g., because an object fails fsck), then it will die immediately. This
may cause us to get SIGPIPE on the fetch, as we're still trying to write
pack contents from the sideband demuxer (which is typically a thread,
and thus takes down the whole fetch process).
You can see this in action with:
./t5702-protocol-v2.sh --stress --run=59
which ends with (wrapped for readability):
test_must_fail: died by signal 13: git -c protocol.version=2 \
-c transfer.fsckobjects=1 -c fetch.uriprotocols=http,https \
clone http://127.0.0.1:5708/smart/http_parent http_child
not ok 59 - packfile-uri with transfer.fsckobjects fails on bad object
This is mostly cosmetic. The actual error of interest (in this case, the
object that failed the fsck check) comes from index-pack straight to
stderr, so the user still sees it. They _might_ even see fetch-pack
complaining about index-pack failing, because the main thread is racing
with the sideband-demuxer. But they'll definitely see the signal death
in the exit code, which is what the test is complaining about.
We can make this more predictable by just ignoring SIGPIPE. The sideband
demuxer uses write_or_die(), so it will notice and stop (gracefully,
because we hook die_routine() to exit just the thread). And during this
section we're not writing anywhere else where we'd be concerned about
SIGPIPE preventing us from wasting effort writing to nowhere.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using gcc-11 (or 12) to compile refs.o with -O3 results in:
In file included from hashmap.h:4,
from cache.h:6,
from refs.c:5:
In function ‘oidcpy’,
inlined from ‘ref_transaction_add_update’ at refs.c:1065:3,
inlined from ‘ref_transaction_update’ at refs.c:1094:2,
inlined from ‘ref_transaction_verify’ at refs.c:1132:9:
hash.h:262:9: warning: argument 2 null where non-null expected [-Wnonnull]
262 | memcpy(dst->hash, src->hash, GIT_MAX_RAWSZ);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from git-compat-util.h:177,
from cache.h:4,
from refs.c:5:
refs.c: In function ‘ref_transaction_verify’:
/usr/include/string.h:43:14: note: in a call to function ‘memcpy’ declared ‘nonnull’
43 | extern void *memcpy (void *__restrict __dest, const void *__restrict __src,
| ^~~~~~
That call to memcpy() is in a conditional block that requires
REF_HAVE_NEW to be set. But in ref_transaction_update(), we make sure it
isn't set coming in:
if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS)
BUG("illegal flags 0x%x passed to ref_transaction_update()", flags);
and then only set it if the variable isn't NULL:
flags |= (new_oid ? REF_HAVE_NEW : 0) | (old_oid ? REF_HAVE_OLD : 0);
So it should be impossible to reach that memcpy() with a NULL oid. But
for whatever reason, gcc doesn't accept that hitting the BUG() means we
won't go any further, even though it's marked with the noreturn
attribute. And the conditional is correct; ALLOWED_FLAGS doesn't contain
HAVE_NEW or HAVE_OLD, and you can even simplify it to check for those
flags explicitly and the compiler still complains.
We can work around this by just clearing the disallowed flags
explicitly. This should be a noop because of the BUG() check, but it
makes the compiler happy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, running 'git submodule deinit' on repos where the
submodule's '.git' is a directory, aborts with a message that is not
exactly user friendly.
Let's change this to instead warn the user that the .git/ directory
has been absorbed into the superproject.
The rest of the deinit function can operate as it already does with
new-style submodules.
In one test, we used to require "git submodule deinit" to fail even
with the "--force" option when the submodule's .git/ directory is not
absorbed. Adjust it to expect the operation to pass.
Suggested-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Mugdha Pattnaik <mugdhapattnaik@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git no longer has a default strategy for reconciling divergent branches,
because there's no way for Git to know which strategy is appropriate in
any particular situation.
The initially proposed version in [*], that eventually became
031e2f7a (pull: abort by default when fast-forwarding is not
possible, 2021-07-22), dropped this phrase from the message, but
it was left in the final version by accident.
* https://lore.kernel.org/git/20210627000855.530985-1-alexhenrie24@gmail.com/
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test `amending already signed commit` is using git checkout to
select a specific commit to amend. In case an earlier test fails and
leaves behind a dirty index/worktree this test would fail as well.
Using `checkout -f` will avoid interference by most other tests.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The user.signingKey config for ssh signing supports either a path to a
file containing the key or for the sake of convenience a literal string
with the ssh public key. To differentiate between those two cases we
check if the first few characters contain "ssh-" which is unlikely to be
the start of a path. ssh supports other key types which are not prefixed
with "ssh-" and will currently be treated as a file path and therefore
fail to load. To remedy this we move the prefix check into its own
function and introduce the prefix `key::` for literal ssh keys. This way
we don't need to add new key types when they become available. The
existing `ssh-` prefix is retained for compatibility with current user
configs but removed from the official documentation to discourage its
use.
Signed-off-by: Fabian Stelzer <fs@gigacodes.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If writing a trace2 message fails, we optionally warn the user of this
fact. However, in 0ee10fd (usage: add trace2 entry upon warning(),
2020-11-23), we added a trace entry to the warning() function. This
means that we can enter an infinite loop of failing trace2 writes and
warnings. Fix this by disabling the failing trace2 destination before
issuing the warning.
Additionally, trace2 sets a default SIGPIPE handler
(tr2main_signal_handler) when it is initialized. This handler generates
its own trace2 messages when a signal is received. If a trace2 write
fails due to a broken pipe, this handler will run and then cause another
failed write. Fix this by temporarily ignoring SIGPIPE while writing
trace2 messages. This is safe because the write will still fail, and we
will disable the failing destination.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a subsequent commit, we would like external-facing functions to be
able to accept "struct repository" and "struct branch" as a pair. This
is useful for functions like pushremote_for_branch(), which need to take
values from the remote_state and branch, even if branch == NULL.
However, a caller may supply an unrelated repository and branch, which
is not supported behavior.
To prevent misuse, add a die_on_missing_branch() helper function that
dies if a given branch is not from a given repository. Speed up the
existence check by replacing the branches list with a branches_hash
hashmap.
Like read_config(), die_on_missing_branch() is only called from
non-static functions; static functions are less prone to misuse because
they have strong conventions for keeping remote_state and branch in
sync.
Signed-off-by: Glen Choo <chooglen@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace all remaining references of the_repository->remote_state in
static functions with a struct remote_state parameter.
To do so, move read_config() calls to non-static functions and create a
family of static functions, "remotes_*", that behave like "repo_*", but
accept struct remote_state instead of struct repository. In the case
where a static function calls a non-static function, replace the
non-static function with its "remotes_*" equivalent.
Signed-off-by: Glen Choo <chooglen@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without changing external-facing functions, replace
the_repository->remote_state internally by adding a struct remote_state
parameter.
As a result, external-facing functions are still tied to the_repository,
but most static functions no longer reference
the_repository->remote_state. The exceptions are those that are used in
a way that depends on external-facing functions e.g. the callbacks to
remote_get_1().
Signed-off-by: Glen Choo <chooglen@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote.c does not works with non-the_repository because it stores its
state as static variables. To support non-the_repository, we can use a
per-repository struct for the remotes subsystem.
Prepare for this change by defining a struct remote_state that holds
the remotes subsystem state and move the static variables of remote.c
into the_repository->remote_state.
This introduces no behavioral or API changes.
Signed-off-by: Glen Choo <chooglen@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git push remote-name" (that is, with no refspec given on the command
line) should push the refspecs in remote.remote-name.push. There is no
test case that checks this behavior in detached HEAD, so add one.
Signed-off-by: Glen Choo <chooglen@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the histogram algorithm calls xdl_classify_record() it is no
longer necessary to use xdl_recmatch() to compare lines, it is
sufficient just to compare the hash values. This has a negligible
effect on performance.
Test HEAD~1 HEAD
-----------------------------------------------------------------------------
4000.1: log -3000 (baseline) 0.19(0.12+0.07) 0.18(0.14+0.04) -5.3%
4000.2: log --raw -3000 (tree-only) 0.98(0.81+0.16) 0.98(0.79+0.18) +0.0%
4000.3: log -p -3000 (Myers) 4.81(4.23+0.56) 4.80(4.26+0.53) -0.2%
4000.4: log -p -3000 --histogram 5.83(5.11+0.70) 5.82(5.15+0.65) -0.2%
4000.5: log -p -3000 --patience 5.31(4.61+0.69) 5.30(4.54+0.75) -0.2%
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rindex and ha are only used by xdl_cleanup_records() which is not
called by the histogram or patience algorithms. The perf test results
show a small reduction in run time but that is probably within the
noise.
Test HEAD^ HEAD
-----------------------------------------------------------------------------
4000.1: log -3000 (baseline) 0.19(0.17+0.02) 0.19(0.12+0.07) +0.0%
4000.2: log --raw -3000 (tree-only) 0.98(0.78+0.20) 0.98(0.81+0.16) +0.0%
4000.3: log -p -3000 (Myers) 4.81(4.15+0.64) 4.81(4.23+0.56) +0.0%
4000.4: log -p -3000 --histogram 5.87(5.19+0.66) 5.83(5.11+0.70) -0.7%
4000.5: log -p -3000 --patience 5.35(4.60+0.73) 5.31(4.61+0.69) -0.7%
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Histogram is the only diff algorithm not to call
xdl_classify_record(). xdl_classify_record() ensures that the hash
values of two strings that are not equal differ which means that it is
not necessary to use xdl_recmatch() when comparing lines, all that is
necessary is to compare the hash values. This gives a 7% reduction in
the runtime of "git log --patch" when using the histogram diff
algorithm.
Test HEAD^ HEAD
-----------------------------------------------------------------------------
4000.1: log -3000 (baseline) 0.18(0.14+0.04) 0.19(0.17+0.02) +5.6%
4000.2: log --raw -3000 (tree-only) 0.99(0.77+0.21) 0.98(0.78+0.20) -1.0%
4000.3: log -p -3000 (Myers) 4.84(4.31+0.51) 4.81(4.15+0.64) -0.6%
4000.4: log -p -3000 --histogram 6.34(5.86+0.46) 5.87(5.19+0.66) -7.4%
4000.5: log -p -3000 --patience 5.39(4.60+0.76) 5.35(4.60+0.73) -0.7%
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the tests in t5319 corrupts the checksum of the midx file by
writing a single 0xff over the final byte, and then confirms that we
detect the problem. This usually works fine, but would break if the
actual checksum ended with that same byte already.
It seems like this should happen in 1 out of 256 test runs, but it turns
out to be less often in practice. The contents of the midx are mostly
deterministic because it's based on the objects, and we remove most
sources of randomness by setting GIT_COMMITTER_DATE, etc. However,
there's still some randomness: some objects are duplicated between
packs, and the midx must decide which to use, which can be based on
timing.
So very occasionally we can end up with a real 0xff byte, and the test
fails. The most robust fix would be to read out the final byte and then
change it to something else (e.g., adding 1 mod 256). But that's awkward
to do in shell. Let's just blindly corrupt 10 bytes instead of 1, which
reduces our chances of an accidental noop to 1 in 2^80.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "checkout" command is one of the main sources of leaks in the test
suite, let's fix the common ones by not leaking from the "struct
branch_info".
Doing this is rather straightforward, albeit verbose, we need to
xstrdup() constant strings going into the struct, and free() the ones
we clobber as we go along.
This also means that we can delete previous partial leak fixes in this
area, i.e. the "path_to_free" accounting added by 96ec7b1e70 (Convert
resolve_ref+xstrdup to new resolve_refdup function, 2011-12-13).
There was some discussion about whether "we should retain the "const
char *" here and cast at free() time, or have it be a "char *". Since
this is not a public API with any sort of API boundary let's use
"char *", as is already being done for the "refname" member of the
same struct.
The tests to mark as passing were found with:
rm .prove; GIT_SKIP_TESTS=t0027 prove -j8 --state=save t[0-9]*.sh :: --immediate
# apply & compile this change
prove -j8 --state=failed :: --immediate
I.e. the ones that were newly passing when the --state=failed command
was run. I left out "t3040-subprojects-basic.sh" and
"t4131-apply-fake-ancestor.sh" to to optimization-level related
differences similar to the ones noted in[1], except that these would
be something the current 'linux-leaks' job would run into.
1. https://lore.kernel.org/git/cover-v3-0.6-00000000000-20211022T175227Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use size_t to match n when building the bitmask for checking whether a
rank is occupied, instead of the default signed int.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a formatting issue in "multi-pack-index.html", corresponding
to the nesting bulleted list of a wrong usage in "multi-pack-index.txt"
and this commit fix the problem.
In ASCIIDOC, it doesn't treat an indented character as the
beginning of a sub-list. If we want to write a nested bulleted list, we
could just use ASTERISK without any DASH like:
"
* Level 1 list item
** Level 2 list item
*** Level 3 list item
** Level 2 list item
* Level 1 list item
** Level 2 list item
* Level 1 list item
"
The DASH can be used for bulleted list too, But the DASH is suggested
only to be used as the marker for the first level because the DASH
doesn’t work well or a best practice for nested lists,
like (dash is as level 2 below):
"
* Level 1 list item
- Level 2 list item
* Level 1 list item
"
ASTERISK is recommanded to use because it works intuitively and clearly
("marker length = nesting level") in nested lists, but the DASH can't.
However, when you want to write a non-nested bulleted lists, DASH works
too, like:
"
- Level 1 list item
- Level 1 list item
- Level 1 list item
"
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current protocol EBNF allows command-request to end with the
capability list, if no command specific arguments follow, but the
protocol requires that after the capability list, there must be a
delim-pkt regardless of the number of command specific arguments. Fixed
the EBNF to match. Both JGit and libgit2's implementation has the
delim-pkt as mandatory. JGit's code is not publicly linkable, but
libgit2 is linked below[1]. As for currently implemented commands on v2
(ls-ref and fetch), the delim packet is already being passed through
[1]: https://github.com/libgit2/libgit2/blob/main/src/transports/git.c
Reported-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
http-fetch prints the URL after failing to fetch it. This can be
confusing to users (they cannot really do anything with it), and they
can share by accident a sensitive URL (e.g. with credentials) while
looking for help.
Redact the URL unless the GIT_TRACE_REDACT variable is set to false. This
mimics the redaction of other sensitive information in git, like the
Authorization header in HTTP.
Fix also capitalization of previous die() message (must start in
lowercase).
Signed-off-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some setups, packfile uris act as bearer token. It is not
recommended to expose them plainly in logs, although in special
circunstances (e.g. debug) it makes sense to write them.
Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
variable is set to false. This mimics the redacting of the Authorization
header in HTTP.
Signed-off-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
unpack_object_header_buffer() attempts to protect against overflowing
left shifts, but the limit of the shift amount should not be the size of
the variable being shifted. It should be the size minus the size of its
contents. Fix that accordingly.
This was noticed at $DAYJOB by a fuzzer running internally.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the parse_nodash_opt() function to use "enum
parse_opt_result". In 352e761388 (parse-options.[ch]: consistently
use "enum parse_opt_result", 2021-10-08) its only caller
parse_options_step() started using that return type, and the
get_value() which will be called and return from it uses the same
enum.
Let's do the same here so that this function always returns an "enum
parse_opt_result" value.
We could go for either PARSE_OPT_HELP (-2) or PARSE_OPT_ERROR (-1)
here. The reason we ended up with "-2" is that in code added in
07fe54db3c (parse-opt: do not print errors on unknown options, return
"-2" instead., 2008-06-23) we used that value in a meaningful way.
Then in 51a9949eda (parseopt: add PARSE_OPT_NODASH, 2009-05-07) the
use of "-2" was seemingly copy/pasted from parse_long_opt(), which was
the function immediately above the parse_nodash_opt() function added
in that commit.
Since we only care about whether the return value here is non-zero
let's use the more generic PARSE_OPT_ERROR.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before running the post-receive hook, status info is reported back to
the client. If a remote client exits before or during the status report,
receive-pack is killed by SIGPIPE and post-receive is never executed.
The post-receive hook is often used to send email notifications (see
contrib/hooks/post-receive-email), update bug trackers, start automatic
builds, etc. Not executing it after an interrupted yet "successful" push
can lead to inconsistencies.
Ignore SIGPIPE before reporting status to the client to increase the
chances of post-receive running if pre-receive was successful. This does
not guarantee 100% consistency but it should resist early disconnection
by the client.
Signed-off-by: Robin Jarry <robin@jarry.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently throw away any arguments given to "git jump merge". We
should instead pass them along to ls-files, since they're likely to be
pathspecs. This matches the behavior of "git jump diff", etc.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description that 0640 makes sure that the group members can read
the repository is correct, but calling that octal number a <umask>
is wrong. Let's call it <perm>, as the value is used to set the
permission bits.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous explanation was mixing the format with the identity of
the field.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each member of the pair is explained but they are not defined
beforehand.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
URL being an acronym, it deserves to be kept uppercase.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
That's how alternative options are expressed in general.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to CodingGuidelines, multi-word placeholders should use
hyphens as word separators.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This discerns user inputs from verbatim options in the synopsis.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "define_categories()" and "define_category_names()" functions
to take the already-parsed output of "category_list()" as an argument,
which brings our number of passes over "command-list.txt" from three
to two.
Then have "category_list()" itself take the output of "command_list()"
as an argument, bringing the number of times we parse the file to one.
Compared to the pre-image this speeds us up quite a bit:
$ git show HEAD~:generate-cmdlist.sh >generate-cmdlist.sh.old
$ hyperfine --warmup 10 -L v ,.old 'sh generate-cmdlist.sh{v} command-list.txt'
Benchmark #1: sh generate-cmdlist.sh command-list.txt
Time (mean ± σ): 22.9 ms ± 0.3 ms [User: 15.8 ms, System: 9.6 ms]
Range (min … max): 22.5 ms … 24.0 ms 125 runs
Benchmark #2: sh generate-cmdlist.sh.old command-list.txt
Time (mean ± σ): 30.1 ms ± 0.4 ms [User: 24.4 ms, System: 17.5 ms]
Range (min … max): 29.5 ms … 32.3 ms 96 runs
Summary
'sh generate-cmdlist.sh command-list.txt' ran
1.32 ± 0.02 times faster than 'sh generate-cmdlist.sh.old command-list.txt'
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the "grep" we run to exclude certain programs from the
generated output with a pure-shell loop that strips out the comments,
and sees if the "cmd" we're reading is on a list of excluded
programs. This uses a trick similar to test_have_prereq() in
test-lib-functions.sh.
On my *nix system this makes things quite a bit slower compared to
HEAD~:
o
'sh generate-cmdlist.sh.old command-list.txt' ran
1.56 ± 0.11 times faster than 'sh generate-cmdlist.sh command-list.txt'
18.00 ± 0.19 times faster than 'sh generate-cmdlist.sh.master command-list.txt'
But when I tried running generate-cmdlist.sh 100 times in CI I found
that it helped across the board even on OSX & Linux. I tried testing
it in CI with this ad-hoc few-liner:
for i in $(seq -w 0 11 | sort -nr)
do
git show HEAD~$i:generate-cmdlist.sh >generate-cmdlist-HEAD$i.sh &&
git add generate-cmdlist* &&
cp t/t0000-generate-cmdlist.sh t/t00$i-generate-cmdlist.sh || : &&
perl -pi -e "s/HEAD0/HEAD$i/g" t/t00$i-generate-cmdlist.sh &&
git add t/t00*.sh
done && git commit -m"generated it"
Here HEAD~02 and the t0002* file refers to this change, and HEAD~03
and t0003* file to the preceding commit, the relevant results were:
linux-gcc:
[12:05:33] t0002-generate-cmdlist.sh .. ok 14 ms ( 0.00 usr 0.00 sys + 3.64 cusr 3.09 csys = 6.73 CPU)
[12:05:30] t0003-generate-cmdlist.sh .. ok 32 ms ( 0.00 usr 0.00 sys + 2.66 cusr 1.81 csys = 4.47 CPU)
osx-gcc:
[11:58:04] t0002-generate-cmdlist.sh .. ok 80081 ms ( 0.02 usr 0.02 sys + 17.80 cusr 10.07 csys = 27.91 CPU)
[11:58:16] t0003-generate-cmdlist.sh .. ok 92127 ms ( 0.02 usr 0.01 sys + 22.54 cusr 14.27 csys = 36.84 CPU)
vs-test:
[12:03:14] t0002-generate-cmdlist.sh .. ok 30 s ( 0.02 usr 0.00 sys + 13.14 cusr 26.19 csys = 39.35 CPU)
[12:03:20] t0003-generate-cmdlist.sh .. ok 32 s ( 0.00 usr 0.02 sys + 13.25 cusr 26.10 csys = 39.37 CPU)
I.e. even on *nix running 100 of these in a loop was up to ~2x faster
in absolute runtime, I suspect it's due factors that are exacerbated
in the CI, e.g. much slower process startup due to some platform
limits, or a slower FS.
The "cut -d" change here is because we're not emitting the
40-character aligned output anymore, i.e. we'll get the output from
command_list() now, not an as-is line from command-list.txt.
This also makes the parsing more reliable, as we could tweak the
whitespace alignment without breaking this parser. Let's reword a
now-inaccurate comment in "command-list.txt" describing that previous
alignment limitation. We'll still need the "### command-list [...]"
line due to the "Documentation/cmd-list.perl" logic added in
11c6659d85 (command-list: prepare machinery for upcoming "common
groups" section, 2015-05-21).
There was a proposed change subsequent to this one[3] which continued
moving more logic into the "command_list() function, i.e. replaced the
"cut | tr | grep" chain in "category_list()" with an argument to
"command_list()".
That change might have had a bit of an effect, but not as much as the
preceding commit, so I decided to drop it. The relevant performance
numbers from it were:
linux-gcc:
[12:05:33] t0001-generate-cmdlist.sh .. ok 13 ms ( 0.00 usr 0.00 sys + 3.33 cusr 2.78 csys = 6.11 CPU)
[12:05:33] t0002-generate-cmdlist.sh .. ok 14 ms ( 0.00 usr 0.00 sys + 3.64 cusr 3.09 csys = 6.73 CPU)
osx-gcc:
[11:58:03] t0001-generate-cmdlist.sh .. ok 78416 ms ( 0.02 usr 0.01 sys + 11.78 cusr 6.22 csys = 18.03 CPU)
[11:58:04] t0002-generate-cmdlist.sh .. ok 80081 ms ( 0.02 usr 0.02 sys + 17.80 cusr 10.07 csys = 27.91 CPU)
vs-test:
[12:03:20] t0001-generate-cmdlist.sh .. ok 34 s ( 0.00 usr 0.03 sys + 12.42 cusr 19.55 csys = 32.00 CPU)
[12:03:14] t0002-generate-cmdlist.sh .. ok 30 s ( 0.02 usr 0.00 sys + 13.14 cusr 26.19 csys = 39.35 CPU)
As above HEAD~2 and t0002* are testing the code in this commit (and
the line is the same), but HEAD~1 and t0001* are testing that dropped
change in [3].
1. https://lore.kernel.org/git/cover-v2-00.10-00000000000-20211022T193027Z-avarab@gmail.com/
2. https://lore.kernel.org/git/patch-v2-08.10-83318d6c0da-20211022T193027Z-avarab@gmail.com/
3. https://lore.kernel.org/git/patch-v2-10.10-e10a43756d1-20211022T193027Z-avarab@gmail.com/
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the "sed" invocation in get_synopsis() with a pure-shell
version. This speeds up generate-cmdlist.sh significantly. Compared to
HEAD~ (old) and "master" we are, according to hyperfine(1):
'sh generate-cmdlist.sh command-list.txt' ran
12.69 ± 5.01 times faster than 'sh generate-cmdlist.sh.old command-list.txt'
18.34 ± 3.03 times faster than 'sh generate-cmdlist.sh.master command-list.txt'
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a preceding commit we changed the print_command_list() loop to use
printf's auto-repeat feature. Let's now get rid of get_category_line()
entirely by not sorting the categories.
This will change the output of the generated code from e.g.:
- { "git-apply", N_("Apply a patch to files and/or to the index"), 0 | CAT_complete | CAT_plumbingmanipulators },
To:
+ { "git-apply", N_("Apply a patch to files and/or to the index"), 0 | CAT_plumbingmanipulators | CAT_complete },
I.e. the categories are no longer sorted, but as they're OR'd together
it won't matter for the end result.
This speeds up the generate-cmdlist.sh a bit. Comparing HEAD~ (old)
and "master" to this code:
'sh generate-cmdlist.sh command-list.txt' ran
1.07 ± 0.33 times faster than 'sh generate-cmdlist.sh.old command-list.txt'
1.15 ± 0.36 times faster than 'sh generate-cmdlist.sh.master command-list.txt'
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is just a small code reduction. There is a small probability that
the new code breaks when the category list is empty. But that would be
noticed during the compile step.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This doesn't matter for performance, but let's not include the empty
lines in our sorting. This makes the intent of the code clearer.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This isn't for optimization as the get_categories() is a purely shell
function, but rather for ease of readability, let's just inline these
two lines. We'll be changing this code some more in subsequent commits
to make this worth it.
Rename the get_categories() function to get_category_line(), since
that's what it's doing now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function get_categories() is invoked in a loop over all commands.
As it runs several processes, this takes an awful lot of time on
Windows. To reduce the number of processes, move the process that
filters empty lines to the other invoker of the function, where it is
needed. The invocation of get_categories() in the loop does not need
the empty line filtered away because the result is word-split by the
shell, which eliminates the empty line automatically.
Furthermore, use sort -u instead of sort | uniq to remove yet another
process.
[Ævar: on Linux this seems to speed things up a bit, although with
hyperfine(1) the results are fuzzy enough to land within the
confidence interval]:
$ git show HEAD~:generate-cmdlist.sh >generate-cmdlist.sh.old
$ hyperfine --warmup 1 -L s ,.old -p 'make clean' 'sh generate-cmdlist.sh{s} command-list.txt'
Benchmark #1: sh generate-cmdlist.sh command-list.txt
Time (mean ± σ): 371.3 ms ± 64.2 ms [User: 430.4 ms, System: 72.5 ms]
Range (min … max): 320.5 ms … 517.7 ms 10 runs
Benchmark #2: sh generate-cmdlist.sh.old command-list.txt
Time (mean ± σ): 489.9 ms ± 185.4 ms [User: 724.7 ms, System: 141.3 ms]
Range (min … max): 346.0 ms … 885.3 ms 10 runs
Summary
'sh generate-cmdlist.sh command-list.txt' ran
1.32 ± 0.55 times faster than 'sh generate-cmdlist.sh.old command-list.txt'
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add " " before a "|" at the end of a line in generate-cmdlist.sh for
consistency with other code in the file. Some of the surrounding code
will be modified in subsequent commits.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We should keep these files sorted in the C locale, e.g. in the C
locale the order is:
git-check-mailmap
git-check-ref-format
git-checkout
But under en_US.UTF-8 it's:
git-check-mailmap
git-checkout
git-check-ref-format
In a subsequent commit I'll change generate-cmdlist.sh to use C sort
order, and without this change we'd be led to believe that that change
caused a meaningful change in the output, so let's do this as a
separate step, right now the generate-cmdlist.sh script just uses the
order found in this file.
Note that this refers to the sort order of the lines in
command-list.txt, a subsequent commit will also change how we treat
the sort order of the "category" fields, but that's unrelated to this
change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If prepare_bitmap_git() returns NULL (one easy-to-trigger cause being
that the repository does not have bitmaps at all), then we'll segfault
accessing bitmap_git->hashes:
$ t/helper/test-tool bitmap dump-hashes
Segmentation fault
We should treat this the same as a repository with bitmaps but no
name-hashes, and quietly produce an empty output. The later call to
free_bitmap_index() in the cleanup label is OK, as it treats a NULL
pointer as a noop.
This isn't a big deal in practice, as this function is intended for and
used only by test-tool. It's probably worth fixing to avoid confusion,
but not worth adding coverage for this to the test suite.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strftime() function has a non-standard "%s" extension, which prints
the number of seconds since the epoch. But the "struct tm" we get has
already been adjusted for a particular time zone; going back to an epoch
time requires knowing that zone offset. Since strftime() doesn't take
such an argument, round-tripping to a "struct tm" and back to the "%s"
format may produce the wrong value (off by tz_offset seconds).
Since we're already passing in the zone offset courtesy of c3fbf81a85
(strbuf: let strbuf_addftime handle %z and %Z itself, 2017-06-15), we
can use that same value to adjust our epoch seconds accordingly.
Note that the description above makes it sound like strftime()'s "%s" is
useless (and really, the issue is shared by mktime(), which is what
strftime() would use under the hood). But it gets the two cases for
which it's designed correct:
- the result of gmtime() will have a zero offset, so no adjustment is
necessary
- the result of localtime() will be offset by the local zone offset,
and mktime() and strftime() are defined to assume this offset when
converting back (there's actually some magic here; some
implementations record this in the "struct tm", but we can't
portably access or manipulate it. But they somehow "know" whether a
"struct tm" is from gmtime() or localtime()).
This latter point means that "format-local:%s" actually works correctly
already, because in that case we rely on the system routines due to
6eced3ec5e (date: use localtime() for "-local" time formats,
2017-06-15). Our problem comes when trying to show times in the author's
zone, as the system routines provide no mechanism for converting in
non-local zones. So in those cases we have a "struct tm" that came from
gmtime(), but has been manipulated according to our offset.
The tests cover the broken round-trip by formatting "%s" for a time in a
non-system timezone. We use the made-up "+1234" here, which has two
advantages. One, we know it won't ever be the real system zone (and so
we're actually testing a case that would break). And two, since it has a
minute component, we're testing the full decoding of the +HHMM zone into
a number of seconds. Likewise, we test the "-1234" variant to make sure
there aren't any sign mistakes.
There's one final test, which covers "format-local:%s". As noted, this
already passes, but it's important to check that we didn't regress this
case. In particular, the caller in show_date() is relying on localtime()
to have done the zone adjustment, independent of any tz_offset we
compute ourselves. These should match up, since our local_tzoffset() is
likewise built around localtime(). But it would be easy for a caller to
forget to pass in a correct tz_offset to strbuf_addftime(). Fortunately
show_date() does this correctly (it has to because of the existing
handling of %z), and the test continues to pass. So this one is just
future-proofing against a change in our assumptions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As described in https://trojansource.codes/trojan-source.pdf, it is
possible to abuse directional formatting (a feature of Unicode) to
deceive human readers into interpreting code differently from compilers.
For example, an "if ()" expression could be enclosed in a comment, but
rendered as if it was outside of that comment. In effect, this could
fool a reviewer into misinterpreting the code flow as benign when it is
not.
It is highly unlikely that Git's source code wants to contain such
directional formatting in the first place, so let's just disallow it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce the logical variable GIT_DEFAULT_BRANCH which represents the
the default branch name that will be used by "git init".
Currently this variable is equivalent to
git config init.defaultbranch || 'master'
This however will break if at one point the default branch is changed as
indicated by `default_branch_name_advice` in `refs.c`.
By providing this command ahead of time users of git can make their
code forward-compatible.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The filter system allows for alterations to file contents when they're
moved between the database and the worktree. We already made sure that
it is possible for smudge filters to produce contents that are larger
than `unsigned long` can represent (which matters on systems where
`unsigned long` is narrower than `size_t`, most notably 64-bit Windows).
Now we make sure that clean filters can _consume_ contents that are
larger than that.
Note that this commit only allows clean filters' _input_ to be larger
than can be represented by `unsigned long`.
This change makes only a very minute dent into the much larger project
to teach Git to use `size_t` instead of `unsigned long` wherever
appropriate.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matt Cooper <vtbassmatt@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This introduces an additional guard for platforms where `unsigned long`
and `size_t` are not of the same size. If the size of an object in the
database would overflow `unsigned long`, instead we now exit with an
error.
A complete fix will have to update _many_ other functions throughout the
codebase to use `size_t` instead of `unsigned long`. It will have to be
implemented at some stage.
This commit puts in a stop-gap for the time being.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matt Cooper <vtbassmatt@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is mixed use of size_t and unsigned long to deal with sizes in the
codebase. Recall that Windows defines unsigned long as 32 bits even on
64-bit platforms, meaning that converting size_t to unsigned long narrows
the range. This mostly doesn't cause a problem since Git rarely deals
with files larger than 2^32 bytes.
But adjunct systems such as Git LFS, which use smudge/clean filters to
keep huge files out of the repository, may have huge file contents passed
through some of the functions in entry.c and convert.c. On Windows, this
results in a truncated file being written to the workdir. I traced this to
one specific use of unsigned long in write_entry (and a similar instance
in write_pc_item_to_fd for parallel checkout). That appeared to be for
the call to read_blob_entry, which expects a pointer to unsigned long.
By altering the signature of read_blob_entry to expect a size_t,
write_entry can be switched to use size_t internally (which all of its
callers and most of its callees already used). To avoid touching dozens of
additional files, read_blob_entry uses a local unsigned long to call a
chain of functions which aren't prepared to accept size_t.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matt Cooper <vtbassmatt@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The filter system allows for alterations to file contents when they're
added to the database or working tree. ("Smudge" when moving to the
working tree; "clean" when moving to the database.) This is used
natively to handle CRLF to LF conversions. It's also employed by Git-LFS
to replace large files from the working tree with small tracking files
in the repo and vice versa.
Git reads the entire smudged file into memory to convert it into a
"clean" form to be used in-core. While this is inefficient, there's a
more insidious problem on some platforms due to inconsistency between
using unsigned long and size_t for the same type of data (size of a file
in bytes). On most 64-bit platforms, unsigned long is 64 bits, and
size_t is typedef'd to unsigned long. On Windows, however, unsigned long
is only 32 bits (and therefore on 64-bit Windows, size_t is typedef'd to
unsigned long long in order to be 64 bits).
Practically speaking, this means 64-bit Windows users of Git-LFS can't
handle files larger than 2^32 bytes. Other 64-bit platforms don't suffer
this limitation.
This commit introduces a test exposing the issue; future commits make it
pass. The test simulates the way Git-LFS works by having a tiny file
checked into the repository and expanding it to a huge file on checkout.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matt Cooper <vtbassmatt@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow tests that assume a 64-bit `size_t` to be skipped in 32-bit
platforms and regardless of the size of `long`.
This imitates the `LONG_IS_64BIT` prerequisite.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this developer's tests, producing one gigabyte worth of NULs in a
busy loop that writes out individual bytes, unbuffered, took ~27sec.
Writing chunked 256kB buffers instead only took ~0.6sec
This matters because we are about to introduce a pair of test cases that
want to be able to produce 5GB of NULs, and we cannot use `/dev/zero`
because of the HP NonStop platform's lack of support for that device.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
d5cfd142ec (tests: teach the test-tool to generate NUL bytes and
use it, 2019-02-14), add a way to generate zeroes in a portable
way without using /dev/zero (needed by HP NonStop), but uses a
long variable that is limited to 2^31 in Windows.
Use instead a (POSIX/C99) intmax_t that is at least 64bit wide
in 64-bit Windows to use in a future test.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*fast-import*" as passing when git is
compiled with SANITIZE=leak. They'll now be listed as running under
the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks"
CI target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*config*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*status*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*clone*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*add*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*diff*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*apply*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*notes*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*update-index*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*rev-parse*" as passing when git is
compiled with SANITIZE=leak. They'll now be listed as running under
the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks"
CI target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark some tests that match "*rev-list*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As in 7ff24785cb (leak tests: mark some misc tests as passing with
SANITIZE=leak, 2021-10-12) continue marking various miscellaneous
tests as passing when git is compiled with SANITIZE=leak. They'll now
be listed as running under the "GIT_TEST_PASSING_SANITIZE_LEAK=true"
test mode (the "linux-leaks" CI target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark all but one tests that match "*gettext*" as passing when git is compiled
with SANITIZE=leak. They'll now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
In the case of "t0202-gettext-perl.sh" this isn't very meaningful as
most of the work is on the Perl side, but let's mark it anyway.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark a test that was recently added in e031e9719d (test-mergesort:
add test subcommand, 2021-10-01) as passing with SANITIZE=leak. It
will now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "t1002-read-tree-m-u-2way.sh" test has passed under SANITIZE=leak
since 04988c8d18 (unpack-trees: introduce preserve_ignored to
unpack_trees_options, 2021-09-27) was combined with
e5a917fcf4 (unpack-trees: don't leak memory in
verify_clean_subdirectory(), 2021-10-07), but as both were in-flight
at the time neither could mark it as passing.
It will now be listed as running under the
"GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI
target).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The %(describe) placeholder by default, like `git describe`, uses a
seven-character abbreviated commit object name. This may not be
sufficient to fully describe all commits in a given repository,
resulting in a placeholder replacement changing its length because the
repository grew in size. This could cause the output of git-archive to
change.
Add the --abbrev option to `git describe` to the placeholder interface
in order to provide tools to the user for fine-tuning project defaults
and ensure reproducible archives.
One alternative would be to just always specify --abbrev=40 but this may
be a bit too biased...
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The %(describe) placeholder by default, like `git describe`, only
supports annotated tags. However, some people do use lightweight tags
for releases, and would like to describe those anyway. The command line
tool has an option to support this.
Teach the placeholder to support this as well.
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It contains option arguments only, not options. We would like to add
option support here too, but to do that we need to distinguish between
different types of options.
Lay out the groundwork for distinguishing between bools, strings, etc.
and move the central logic (validating values and pushing new arguments
to *args) into the successful match, because that will be fairly
conditional on what type of argument is being parsed.
Signed-off-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This compatilibity implementation has been returning a wrong type,
ever since 731043fd (Add compat/unsetenv.c ., 2006-01-25) added to
the system, yet nobody noticed it in the past 16 years, presumably
because no code checks failures in their unsetenv() calls. Sigh.
For now, make it always succeed.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In several places, headers need to be included or else the code won't
compile. Since this is the first object walk, it would be nice to
include them in the tutorial to make it easier to follow.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two errors in the example code caused compilation failures due to
a missing semicolon as well as initialization with an empty struct.
This commit fixes that to make the MyFirstObjectWalk tutorial easier to
follow.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "GIT_TEST_FSYNC" environment variable now exists for
disabling fsync() even on packfiles and other "critical" data.
Running "make test -j8 NO_SVN_TESTS=1" on a noisy 8-core system
on an HDD, test runtime drops from ~4 minutes down to ~3 minutes.
Using "GIT_TEST_FSYNC=1" re-enables fsync() for comparison
purposes.
SVN interopability tests are minimally affected since SVN will
still use fsync in various places.
This will also be useful for 3rd-party tools which create
throwaway git repositories of temporary data, but remains
undocumented for end users.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function free_bitmap_index() is somewhat lax in what it frees. There
are two notable examples:
- While it does call kh_destroy_oid_map on the "bitmaps" map, which
maps commit OIDs to their corresponding bitmaps, the bitmaps
themselves are not freed. Note here that we recycle already-freed
ewah_bitmaps into a pool, but these are handled correctly by
ewah_pool_free().
- We never bother to free the extended index's "positions" map, which
we always allocate in load_bitmap().
Fix both of these.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
test_bitmap_walk() is used to implement `git rev-list --test-bitmap`,
which compares the result of the on-disk bitmaps with ones generated
on-the-fly during a revision walk.
In fa95666a40 (pack-bitmap.c: harden 'test_bitmap_walk()' to check type
bitmaps, 2021-08-24), we hardened those tests to also check the four
special type-level bitmaps, but never freed those bitmaps. We should
have, since each required an allocation when we EWAH-decompressed them.
Free those, plugging that leak, and also free the base (the scratch-pad
bitmap), too.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To ask for the name of a MIDX and its corresponding .rev file, callers
invoke get_midx_filename() and get_midx_rev_filename(), respectively.
These both invoke xstrfmt(), allocating a chunk of memory which must be
freed later on.
This makes callers in pack-bitmap.c somewhat awkward. Specifically,
midx_bitmap_filename(), which is implemented like:
return xstrfmt("%s-%s.bitmap",
get_midx_filename(midx->object_dir),
hash_to_hex(get_midx_checksum(midx)));
this leaks the second argument to xstrfmt(), which itself was allocated
with xstrfmt(). This caller could assign both the result of
get_midx_filename() and the outer xstrfmt() to a temporary variable,
remembering to free() the former before returning. But that involves a
wasteful copy.
Instead, get_midx_filename() and get_midx_rev_filename() take a strbuf
as an output parameter. This way midx_bitmap_filename() can manipulate
and pass around a temporary buffer which it detaches back to its caller.
That allows us to implement the function without copying or open-coding
get_midx_filename() in a way that doesn't leak.
Update the other callers of get_midx_filename() and
get_midx_rev_filename() accordingly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `multi-pack-index` builtin dynamically allocates an array of
command-line option for each of its separate modes by calling
add_common_options() to concatante the common options with sub-command
specific ones.
Because this operation allocates a new array, we have to be careful to
remember to free it. We already do this in the repack and write
sub-commands, but verify and expire don't. Rectify this by calling
FREE_AND_NULL as the other modes do.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git repack` invokes a handful of child processes: one to write the
actual pack, and optionally ones to repack promisor objects and update
the MIDX.
Most of these are freed automatically by calling `start_command()` (which
invokes it on error) and `finish_command()` which calls it
automatically.
But repack_promisor_objects() can initialize a child_process, populate
its array of arguments, and then return from the function before even
calling start_command().
Make sure that the prepared list of arguments is freed by calling
child_process_clear() ourselves to avoid leaking memory along this path.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unused 'ps' argument was a left-over from original copy-paste of
stash_patch(). Removed.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The placeholders represent atoms of tokens and must not be
aggregates.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"reset" was previously treated as a standalone special color name
representing `\e[m`. Now, it can apply to other color properties,
allowing exact specifications without implicit attribute inheritance.
For example, "reset green" now renders `\e[;32m`, which is interpreted
as "reset everything; then set foreground to green". This means the
background and other attributes are also reset to their defaults.
Previously, this was impossible to represent in a single color:
"reset" could be specified alone, or a color with attributes, but some
thing like clearing a background color were impossible.
There is a separate change that introduces the "default" color name to
assist with that, but even then, the above could only to be represented
by explicitly disabling each of the attributes:
green default no-bold no-dim no-italic no-ul no-blink no-reverse no-strike
Signed-off-by: Robert Estelle <robertestelle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name "default" can now be used in foreground or background colors,
and means to use the terminal's default color, discarding any
explicitly-set color without affecting the other attributes. On many
modern terminals, this is *not* the same as specifying "white" or
"black".
Although attributes could previously be cleared like "no-bold", there
had not been a similar mechanism available for colors, other than a full
"reset", which cannot currently be combined with other settings.
Note that this is *not* the same as the existing name "normal", which is
a no-op placeholder to permit setting the background without changing
the foreground. (i.e. what is currently called "normal" might have been
more descriptively named "inherit", "none", "pass" or similar).
Signed-off-by: Robert Estelle <robertestelle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The colors black and white where conspicuously missing from the color
constants. Although they are not currently used in the codebase, having
them included makes it easier to visually verify the ANSI codes, and to
distinguish them explicitly from "GIT_COLOR_DEFAULT" in a subsequent
change.
Signed-off-by: Robert Estelle <robertestelle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-send-email(1) does not mention that "git format-patch" options are
accepted. Augment SYNOPSIS and DESCRIPTION to mention it.
Update git-send-email.perl USAGE to be consistent with
git-send-email(1).
Signed-off-by: Thiago Perrotta <tbperrotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git send-email --git-completion-helper" only prints "format-patch"
flags. Make it print "send-email" flags as well, extracting them
programmatically from its three existing "GetOptions".
Introduce a "uniq" subroutine, otherwise --cc-cover, --to-cover and
other flags would show up twice. In addition, deduplicate flags common
to both "send-email" and "format-patch", like --from.
Remove extraneous flags: --h and --git-completion-helper.
Add trailing "=" to options that expect an argument, inline with
the format-patch implementation.
Add a completion test for "send-email --validate", a send-email flag.
Signed-off-by: Thiago Perrotta <tbperrotta@gmail.com>
Based-on-patch-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When constructing arguments to pass to setup_revision(), pack-objects
only frees the memory used by that array after calling
get_object_list().
Ensure that we call strvec_clear() whether or not we use the arguments
array by cleaning up whenever we exit the function (and rewriting one
early return to jump to a label which frees the memory and then
returns).
We could avoid setting this array up altogether unless we are in the
if-else block that calls get_object_list(), but setting up the argument
array is intermingled with lots of other side-effects, e.g.:
if (exclude_promisor_objects) {
use_internal_rev_list = 1;
fetch_if_missing = 0;
strvec_push(&rp, "--exclude-promisor-objects");
}
So it would be awkward to check exclude_promisor_objects twice: first to
set use_internal_rev_list and fetch_if_missing, and then again above
get_object_list() to push the relevant argument onto the array.
Instead, leave the array's construction alone and make sure to free it
unconditionally.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling `read_midx_file()` to show information about a MIDX or list
the objects contained within it we fail to call `close_midx()`, leaking
the memory allocated to store that MIDX.
Fix this by calling `close_midx()` before exiting the function. We can
drop the "early" return when `show_objects` is non-zero, since the next
instruction is also a return.
(We could just as easily put a `cleanup` label here as with previous
patches. But the only other time we terminate the function early is
when we fail to load a MIDX in the first place. `close_midx()` does
handle a NULL argument, but the extra complexity is likely not
warranted).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function midx.c:verify_midx_file() allocates a MIDX struct by
calling load_multi_pack_index(). But when cleaning up, it calls free()
without freeing any resources associated with the MIDX.
Call the more appropriate close_midx() which does free those resources,
which causes t5319.3 to pass when Git is compiled with SANITIZE=leak.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of `git reset --hard` being able to use the sparse index
without expanding it, replace the command in `sparse-index is expanded and
converted back` with `git reset -- folder1/a`. This command will need to
expand the index to work properly, even after integrating the rest of
`reset` with sparse index.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change `update_index_from_diff` to set `skip-worktree` when applicable for
new index entries. When `git reset --mixed <tree-ish>` is run, entries in
the index with differences between the pre-reset HEAD and reset <tree-ish>
are identified and handled with `update_index_from_diff`. For each file, a
new cache entry in inserted into the index, created from the <tree-ish> side
of the reset (without changing the working tree). However, the newly-created
entry must have `skip-worktree` explicitly set in either of the following
scenarios:
1. the file is in the current index and has `skip-worktree` set
2. the file is not in the current index but is outside of a defined sparse
checkout definition
Not setting the `skip-worktree` bit leads to likely-undesirable results for
a user. It causes `skip-worktree` settings to disappear on the
"diff"-containing files (but *only* the diff-containing files), leading to
those files now showing modifications in `git status`. For example, when
running `git reset --mixed` in a sparse checkout, some file entries outside
of sparse checkout could show up as deleted, despite the user never deleting
anything (and not wanting them on-disk anyway).
Additionally, add a test to `t7102` to ensure `skip-worktree` is preserved
in a basic `git reset --mixed` scenario and update a failure-documenting
test from 19a0acc (t1092: test interesting sparse-checkout scenarios,
2021-01-23) with new expected behavior.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's encourage first-time contributors to tell us what commit they
based their work on with the format-patch invocation. As the
example already forks from origin/master and branch.autosetupmerge
by default records the upstream when the psuh branch was created, we
can use --base=auto for this. Also, mention that the range of
commits can simply be given with `@{u}` if they are on the `psuh`
branch already.
As we are getting one more option on the command line, and spending
one paragraph each to explain them, let's reformat that part of the
description as a bulleted list.
Helped-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The v2 porcelain format is very convenient for obtaining a lot of
information about the current state of the repo, but does not contain
any info about the stash. git status already accepts --show-stash but
it's silently ignored when --porcelain=v2 is given.
Let's add a simple line to print the number of stash entries but in a
format similar in style to the rest of the format.
Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the counting of stash entries contained in one simple function as
it will be used in the next commit.
Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the sane_grep() shell function in git-sh-setup. The two reasons
for why it existed don't apply anymore:
1. It was added due to GNU grep supporting GREP_OPTIONS. See
e1622bfcba (Protect scripted Porcelains from GREP_OPTIONS insanity,
2009-11-23).
Newer versions of GNU grep ignore that, but even on older versions
its existence won't matter, none of these sane_grep() uses care
about grep's output, they're merely using it to check if a string
exists in a file or stream. We also don't care about the "LC_ALL=C"
that "sane_grep" was using, these greps for fixed or ASCII strings
will behave the same under any locale.
2. The SANE_TEXT_GREP added in 71b401032b (sane_grep: pass "-a" if
grep accepts it, 2016-03-08) isn't needed either, none of these grep
uses deal with binary data.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The is_zero_oid() function in git-submodule.sh has not been used since
e83e3333b5 (submodule: port submodule subcommand 'summary' from shell
to C, 2020-08-13), so we can remove it, and the sane_egrep() function,
dead is_zero_oid() was the only function which still referenced it.
Unlike some other functions in git-sh-setup.sh, this function has not
been documented in git-sh-setup(1), so per [1] it should be OK to
remove it. I'm still unclear about the future of some of the other
functions[2], but any questions in that area should not apply here.
1. https://lore.kernel.org/git/xmqqr1dtgnn8.fsf@gitster.g/
1. https://lore.kernel.org/git/87tuiwjfvi.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove a check for whether mod_perl is a supported mode of gitweb.cgi
added in a51d37c1df (Add git-instaweb, instantly browse the working
repo with gitweb, 2006-07-01).
The reason for the check was to support users who had a newer version
of git and an older version of gitweb, it was then subsequently
adjusted for changes in the script in f0e588dffc (git-instaweb: fix
mod_perl detection for apache2, 2009-08-08).
It's a fair bet that nobody's running a git from 2021 and gitweb from
pre-2007 anymore, so we can unconditionally assume that this will be
supported by gitweb.cgi.
This allows a subsequent commit to remove the sane_grep() wrapper,
this change is split up from that since this is the only case where
the "grep" invocation could be removed entirely.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop including $(NO_CURL) in $(SCRIPT_DEFINES). The "@NO_CURL@"
replacement added in 6c5c62f340 (Print an error if cloning a http
repo and NO_CURL is set, 2006-02-15) has not been referenced by
anything in-tree since 49eb8d39c7 (Remove contrib/examples/*,
2018-03-25).
That commit removed the reference from contrib/examples/*, but this
@@NO_CURL@@ hasn't been used since git-pull.sh was the primary entry
point for "git pull".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the $(GIT_VERSION) from $(SCRIPT_DEFINES). Now every time HEAD
changes in a development copy we don't need to re-build the scripts
and script libraries.
This has not been needed since 2b9391bc67 (Makefile: do not replace
@@GIT_VERSION@@ in shell scripts, 2012-06-20). On my setup this
changes the re-making of 44 targets in a development copy where moved
HEAD to 27.
The $(GIT_VERSION) was seemingly left here by mistake or omission. We
didn't need it since 2b9391bc67, but in the later
e4dd89ab98 (Makefile: update scripts when build-time parameters
change, 2012-06-20) it was added to SCRIPT_DEFINES.
The two were part of the same series of patches, and given the summary
in [1] and [2] it looks like this was probably a case of some earlier
version of a later patch being combined with an updated earlier patch.
1. https://lore.kernel.org/git/20120619232231.GA6328@sigill.intra.peff.net/
2. https://lore.kernel.org/git/20120619232453.GB6496@sigill.intra.peff.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "GIT-SCRIPT-DEFINES" was added in e4dd89ab98 (Makefile: update
scripts when build-time parameters change, 2012-06-20) the rules for
generating the scripts themselves were moved further away from the
"cmd_munge_script" added in 46bac90458 (Do not install shell
libraries executable, 2010-01-31).
Let's move these around so that the variables and defines needed by
given targets immediately precede them. This is not needed for any
subsequent changes to work, but makes the code consistent with how
GIT-PERL-DEFINES is structured.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to read the contents of a MIDX, we initialize a chunkfile
structure which can read the table of contents and assign pointers into
different sections of the file for us.
We do call free(), since the chunkfile struct is heap allocated, but not
the more appropriate free_chunkfile(), which also frees memory that the
structure itself owns.
Call that instead to avoid leaking memory in this function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The for-each-ref family of commands invoke parsers immediately when
it sees each --sort=<atom> option, and die before even seeing the
other options on the command line when the <atom> is unrecognised.
Instead, accumulate them in a string list, and have them parsed into
a ref_sorting structure after the command line parsing is done. As
a consequence, "git branch --sort=bogus -h" used to fail to give the
brief help, which arguably may have been a feature, now does so,
which is more consistent with how other options work.
The patch is smaller than the actual extent of the "damage" to the
codebase, thanks to the fact that the original code consistently
used OPT_REF_SORT() macro to handle command line options. We only
needed to replace the variable used for the list, and implementation
of the callback function used in the macro.
The old rule was for the users of the API to:
- Declare ref_sorting and ref_sorting_tail variables;
- OPT_REF_SORT() macro will instantiate ref_sorting instance (which
may barf and die) and append it to the tail;
- Append to the tail each ref_sorting read from the configuration
by parsing in the config callback (which may barf and die);
- See if ref_sorting is null and use ref_sorting_default() instead.
Now the rule is not all that different but is simpler:
- Declare ref_sorting_options string list.
- OPT_REF_SORT() macro will append it to the string list;
- Append to the string list the sort key read from the
configuration;
- call ref_sorting_options() to turn the string list to ref_sorting
structure (which also deals with the default value).
As side effects, this change also cleans up a few issues:
- 95be717c (parse_opt_ref_sorting: always use with NONEG flag,
2019-03-20) muses that "git for-each-ref --no-sort" should simply
clear the sort keys accumulated so far; it now does.
- The implementation detail of "struct ref_sorting" and the helper
function parse_ref_sorting() can now be private to the ref-filter
API implementation.
- If you set branch.sort to a bogus value, the any "git branch"
invocation, not only the listing mode, would abort with the
original code; now it doesn't
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ab/ref-filter-leakfix:
branch: use ref_sorting_release()
ref-filter API user: add and use a ref_sorting_release()
tag: use a "goto cleanup" pattern, leak less memory
Stash only the changes that are staged.
This mode allows to easily stash-out for later reuse some changes
unrelated to the current work in progress.
Unlike 'stash push --patch', --staged supports use of any tool to
select the changes to stash-out, including, but not limited to 'git
add --interactive'.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the transitory refs_werrres_ref_unsafe() function to
refs_resolve_ref_unsafe(), now that all callers of the old function
have learned to pass in a "failure_errno" parameter.
The coccinelle semantic patch added in the preceding commit works, but
I couldn't figure out how to get spatch(1) to re-flow these argument
lists (and sometimes make lines way too long), so this rename was done
with:
perl -pi -e 's/refs_werrres_ref_unsafe/refs_resolve_ref_unsafe/g' \
$(git grep -l refs_werrres_ref_unsafe -- '*.c')
But after that "make contrib/coccinelle/refs.cocci.patch" comes up
empty, so the result would have been the same. Let's remove that
transitory semantic patch file, we won't need to retain it for any
other in-flight changes, refs_werrres_ref_unsafe() only existed within
this patch series.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preceding commits all callers of refs_resolve_ref_unsafe() were
migrated to the transitory refs_werrres_ref_unsafe() function.
As a first step in getting rid of it let's remove the old function
from the public API (it went unused in a preceding commit).
We then provide both a coccinelle rule to do the rename, and a macro
to avoid breaking the existing callers.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In run_transaction_hook() we've checked errno since 6754159767 (refs:
implement reference transaction hook, 2020-06-19), let's reset errno
afterwards to make sure nobody using refs.c directly or indirectly
relies on it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The use of these two is rather trivial, and it's easy to see none of
their callers care about errno. So let's move them from
refs_resolve_ref_unsafe() to refs_resolve_ref_unsafe_with_errno(),
these were the last two callers, so we can get rid of that wrapper
function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the resolve_ref_unsafe() wrapper function to use the underlying
refs_werrres_ref_unsafe() directly.
From a reading of the callers I determined that the only one who cared
about errno was a sequencer.c caller added in e47c6cafcb (commit:
move print_commit_summary() to libgit, 2017-11-24), I'm migrating it
to using refs_werrres_ref_unsafe() directly.
This adds another "set errno" instance, but in this case it's OK and
idiomatic. We are setting it just before calling die_errno(). We could
have some hypothetical die_errno_var(&saved_errno, ...) here, but I
don't think it's worth it. The problem with errno is subtle action at
distance, not this sort of thing. We already use this pattern in a
couple of places in wrapper.c
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move refs_ref_exists from the legacy refs_resolve_ref_unsafe() to the
new refs_werrres_ref_unsafe(). I have read its callers and determined
that they don't care about errno being set, in particular:
git grep -W -w -e refs_ref_exists -e ref_exists
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move refs_resolve_refdup() from the legacy refs_resolve_ref_unsafe()
to the new refs_werrres_ref_unsafe(). I have read its callers and
determined that they don't care about errno being set.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cmd_resolve_ref() function has always ignored errno on failure,
but let's do so explicitly when using the refs_resolve_ref_unsafe()
function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are only handful of callers of find_shared_symref(), none of
whom care about errno, so let's migrate to the non-errno-propagating
version of refs_resolve_ref_unsafe() and explicitly ignore errno here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The static add_head_info() function is only used indirectly by callers
of get_worktrees(), none of whom care about errno, and even if they
did having the faked-up one from refs_resolve_ref_unsafe() would only
confuse them if they used die_errno() et al. So let's explicitly
ignore it here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
None of the callers of rename_ref() and copy_ref() care about errno,
and as seen in the context here we already emit our own non-errno
using error() in the case where we'd use it.
So let's have it explicitly ignore errno, and do the same in
commit_ref_update(), which is only used within other code in
files_copy_or_rename_ref() itself which doesn't care about errno
either.
It might actually be sensible to have the callers use errno if the
failure was filesystem-specific, and with the upcoming reftable
backend we don't want to rely on that sort of thing, so let's keep
ignoring that for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the refs_resolve_ref_unsafe() invoked in loose_fill_ref_dir()
to a form that ignores errno. The only eventual caller of this
function is create_ref_cache(), whose callers in turn don't have their
failure depend on any errno set here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I have carefully read the upstream callers of resolve_gitlink_ref()
and determined that they don't care about errno.
So let's move away from the errno-setting refs_resolve_ref_unsafe()
wrapper to refs_werrres_ref_unsafe(), and explicitly ignore the errno
it sets for us.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the refs_read_ref_full() wrapper in favor of migrating various
refs.c API users to the underlying refs_werrres_ref_unsafe() function.
A careful reading of these callers shows that the callers of this
function did not care about "errno", by moving away from the
refs_resolve_ref_unsafe() wrapper we can be sure that nothing relies
on it anymore.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In lock_ref_oid_basic() we'll happily lock a reference that doesn't
exist yet. That's normal, and is how references are initially born,
but we don't need to retain checks here in lock_ref_oid_basic() about
the state of the ref, when what we're checking is either checked
already, or something we're about to discover by trying to lock the
ref with raceproof_create_file().
The one exception is the caller in files_reflog_expire(), who passes
us a "type" to find out if the reference is a symref or not. We can
move the that logic over to that caller, which can now defer its
discovery of whether or not the ref is a symref until it's needed. In
the preceding commit an exhaustive regression test was added for that
case in a new test in "t1417-reflog-updateref.sh".
The improved diagnostics here were added in
5b2d8d6f21 (lock_ref_sha1_basic(): improve diagnostics for ref D/F
conflicts, 2015-05-11), and then much of the surrounding code went
away recently in my 245fbba46d (refs/files: remove unused "errno ==
EISDIR" code, 2021-08-23).
The refs_resolve_ref_unsafe() code being removed here looks like it
should be tasked with doing that, but it's actually redundant to other
code.
The reason for that is as noted in 245fbba46d this once widely used
function now only has a handful of callers left, which all handle this
case themselves.
To the extent that we're racy between their check and ours removing
this check actually improves the situation, as we'll be doing fewer
things between the not-under-lock initial check and acquiring the
lock.
Why this is OK for all the remaining callers of lock_ref_oid_basic()
is noted below. There are only two of those callers:
* "git branch -[cm] <oldbranch> <newbranch>":
In files_copy_or_rename_ref() we'll call this when we copy or rename
refs via rename_ref() and copy_ref(). but only after we've checked
if the refname exists already via its own call to
refs_resolve_ref_unsafe() and refs_rename_ref_available().
As the updated comment to the latter here notes neither of those are
actually needed. If we delete not only this code but also
refs_rename_ref_available() we'll do just fine, we'll just emit a
less friendly error message if e.g. "git branch -m A B/C" would have
a D/F conflict with a "B" file.
Actually we'd probably die before that in case reflogs for the
branch existed, i.e. when the try to rename() or copy_file() the
relevant reflog, since if we've got a D/F conflict with a branch
name we'll probably also have the same with its reflogs (but not
necessarily, we might have reflogs, but it might not).
As some #leftoverbits that code seems buggy to me, i.e. the reflog
"protocol" should be to get a lock on the main ref, and then perform
ref and/or reflog operations. That code dates back to
c976d415e5 (git-branch: add options and tests for branch renaming,
2006-11-28) and probably pre-dated the solidifying of that
convention. But in any case, that edge case is not our bug or
problem right now.
* "git reflog expire <ref>":
In files_reflog_expire() we'll call this without previous ref
existence checking in files-backend.c, but that code is in turn
called by code that's just finished checking if the refname whose
reflog we're expiring exists.
See ae35e16cd4 (reflog expire: don't lock reflogs using previously
seen OID, 2021-08-23) for the current state of that code, and
5e6f003ca8 (reflog_expire(): ignore --updateref for symbolic
references, 2015-03-03) for the code we'd break if we only did a
"update = !!ref" here, which is covered by the aforementioned
regression test in "t1417-reflog-updateref.sh".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests that cover blindspots in "git reflog delete --updateref"
behavior. Before this change removing the "type & REF_ISSYMREF" check
added in 5e6f003ca8 (reflog_expire(): ignore --updateref for symbolic
references, 2015-03-03) would not fail any tests.
The "--updateref" option was added in 55f1056537 (git-reflog: add
option --updateref to write the last reflog sha1 into the ref,
2008-02-22) for use in git-stash.sh, see e25d5f9c82 (git-stash: add
new 'drop' subcommand, 2008-02-22).
Even though the regression test I need is just the "C" case here,
let's test all these combinations for good measure. I started out
doing these as a for-loop, but I think this is more readable.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the refs_rename_ref_available() function into
"refs/files-backend.c". It is file-backend specific.
This function was added in 5fe7d825da (refs.c: pass a list of names
to skip to is_refname_available, 2014-05-01) as rename_ref_available()
and was only ever used in this one file-backend specific codepath. So
let's move it there.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the parse_loose_ref_contents() function to stop setting "errno"
and failure, and to instead pass up a "failure_errno" via a
parameter. This requires changing its callers to do the same.
The EINVAL error from parse_loose_ref_contents is used in files-backend
to create a custom error message.
In untangling this we discovered a tricky edge case. The
refs_read_special_head() function was relying on
parse_loose_ref_contents() setting EINVAL.
By converting it to use "saved_errno" we can migrate away from "errno"
in this part of the code entirely, and do away with an existing
"save_errno" pattern, its only purpose was to not clobber the "errno"
we previously needed at the end of files_read_raw_ref().
Let's assert that we can do that by not having files_read_raw_ref()
itself operate on *failure_errno in addition to passing it on. Instead
we'll assert that if we return non-zero we actually do set errno, thus
assuring ourselves and callers that they can trust the resulting
"failure_errno".
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a "failure_errno" to refs_read_raw_ref(), his allows
refs_werrres_ref_unsafe() to pass along its "failure_errno", as a
first step before its own callers are migrated to pass it further up
the chain.
We are leaving out out the refs_read_special_head() in
refs_read_raw_ref() for now, as noted in a subsequent commit moving it
to "failure_errno" will require some special consideration.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new refs_werrres_ref_unsafe() function, which is like
refs_resolve_ref_unsafe() except that it explicitly saves away the
"errno" to a passed-in parameter, the refs_resolve_ref_unsafe() then
becomes a wrapper for it.
In subsequent commits we'll migrate code over to it, before finally
making "refs_resolve_ref_unsafe()" with an "errno" parameter the
canonical version, so this this function exists only so that we can
incrementally migrate callers, it will be going away in a subsequent
commit.
As the added comment notes has a rather tortured name to be the same
length as "refs_resolve_ref_unsafe", to avoid churn as we won't need
to re-indent the argument lists, similarly the documentation and
structure of it in refs.h is designed to minimize a diff in a
subsequent commit, where that documentation will be added to the new
refs_resolve_ref_unsafe().
At the end of this migration the "meaningful errno" TODO item left in
76d70dc0c6 (refs.c: make resolve_ref_unsafe set errno to something
meaningful on error, 2014-06-20) will be resolved.
As can be seen from the use of refs_read_raw_ref() we'll also need to
convert some functions that the new refs_werrres_ref_unsafe() itself
calls to take this "failure_errno". That will be done in subsequent
commits.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test for "git branch" to cover the case where .git/refs is
symlinked. To check availability, refs_verify_refname_available() will
run refs_read_raw_ref() on each prefix, leading to a read() from
.git/refs (which is a directory).
It would probably be more robust to re-issue the lstat() as a normal
stat(), in which case, we would fall back to the directory case, but
for now let's just test for the existing behavior as-is. This test
covers a regression in a commit that only ever made it to "next", see
[1].
1. http://lore.kernel.org/git/pull.1068.git.git.1629203489546.gitgitgadget@gmail.com
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing a URL to normalize it, we allow hostnames to contain only
dot (".") or dash ("-"), plus brackets and colons for IPv6 literals.
This matches the old URL standard in RFC 1738, which says:
host = hostname | hostnumber
hostname = *[ domainlabel "." ] toplabel
domainlabel = alphadigit | alphadigit *[ alphadigit | "-" ] alphadigit
But this was later updated by RFC 3986, which is more liberal:
host = IP-literal / IPv4address / reg-name
reg-name = *( unreserved / pct-encoded / sub-delims )
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
While names with underscore in them are not common and possibly violate
some DNS rules, they do work in practice, and we will happily contact
them over http://, git://, or ssh://. It seems odd to ignore them for
purposes of URL matching, especially when the URL RFC seems to allow
them.
There shouldn't be any downside here. It's not a syntactically
significant character in a URL, so we won't be confused about parsing;
we'd have simply rejected such a URL previously (the test here checks
the url code directly, but the obvious user-visible effect would be
failing to match credential.http://foo_bar.example.com.helper, or
similar config in http.<url>.*).
Arguably we'd want to allow tilde ("~") here, too. There's likewise
probably no downside, but I didn't add it simply because it seems like
an even less likely character to appear in a hostname.
Reported-by: Alex Waite <alex@waite.eu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This command dumps individual tables or a stack of of tables.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
provide a command-line utility for inspecting individual tables, and
inspecting a complete ref database
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The packed/loose format has restrictions on refnames: a and a/b cannot
coexist. This limitation does not apply to reftable per se, but must be
maintained for interoperability. This code adds validation routines to
abort transactions that are trying to add invalid names.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds an abstract, read-only interface to the ref database.
This primitive is used to construct the read view of the ref database
(the read view is constructed by merging several *.ref files). It also
provides the mechanism to provide a unified view of the refs in the main
repository and the per-worktree refs.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is needed to create a merged view multiple reftables
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With support for reading and writing files in place, we can construct files (in
memory) and attempt to read them back.
Because some sections of the format are optional (eg. indices, log entries), we
have to exercise this code using multiple sizes of input data
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This supports reading a single reftable file.
The commit introduces an abstract iterator type, which captures the usecases
both of reading individual refs, and iterating over a segment of the ref
namespace.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable format includes support for an (OID => ref) map. This map can speed
up visibility and reachability checks. In particular, various operations along
the fetch/push path within Gerrit have ben sped up by using this structure.
The map is constructed with help of a binary tree. Object IDs are hashes, so
they are uniformly distributed. Hence, the tree does not attempt forced
rebalancing.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable format is structured as a sequence of block. Within a block,
records are prefix compressed, with an index of offsets for fully expand keys to
enable binary search within blocks.
This commit provides the logic to read and write these blocks.
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will be needed for reading reflog blocks in reftable.
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable format is structured as a sequence of blocks, and each block
contains a sequence of prefix-compressed key-value records. There are 4 types of
records, and they have similarities in how they must be handled. This is
achieved by introducing a polymorphic 'record' type that encapsulates ref, log,
index and object records.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable format is usually used with files for storage. However, we abstract
away this using the blocksource data structure. This has two advantages:
* log blocks are zlib compressed, and handling them is simplified if we can
discard byte segments from within the block layer.
* for unittests, it is useful to read and write in-memory. The blocksource
allows us to abstract the data away from on-disk files.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable/ directory is structured as a library, so it cannot
crash on misuse. Instead, it returns an error code.
In addition to signaling errors, the error code can be used to signal
conditions from lower levels of the library to be handled by higher
levels of the library. For example, in a transaction we might
legitimately write an empty reftable file, but in that case, we want to
shortcut the transaction.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The objective of this code is to be usable as a C library, so it can be reused
in libgit2.
This is currently using a BSD license as it is the liberal license I could find,
but this could be changed to whatever fits the stated goal above.
This code is currently imported from github.com/hanwen/reftable. Once this code
lands in git.git, the C code will be removed from github.com/hanwen/reftable,
and the git.git code will be the source of truth.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename and invert value of `is_missing` to `is_in_reset_tree` to make the
variable more descriptive of what it represents.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will simplify referencing them from code that is not deeply integrated with
Git, in particular, the reftable library.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:47:43 -07:00
1327 changed files with 120158 additions and 98129 deletions
Reference logs, or "reflogs", record when the tips of branches and
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.