Fix a minor regression that some compiler might notice.
* 'jk/function-pointer-mismatches-fix' (early part):
fsck: use enum object_type for fsck_walk callback
We switched the function interface for fsck callbacks in a1aad71601
(fsck.h: use "enum object_type" instead of "int", 2021-03-28). However,
we accidentally flipped the type back to "int" as part of 0b4e9013f1
(fsck: mark unused parameters in various fsck callbacks, 2023-07-03).
The mistake happened because that commit was written before a1aad71601
and rebased forward, and I screwed up while resolving the conflict.
Curiously, the compiler does not warn about this mismatch, at least not
when using gcc and clang on Linux (nor in any of our CI environments).
Based on 28abf260a5 (builtin/fsck.c: don't conflate "int" and "enum" in
callback, 2021-06-01), I'd guess that this would cause the AIX xlc
compiler to complain. I noticed because clang-18's UBSan now identifies
mis-matched function calls at runtime, and does complain of this case
when running the test suite.
I'm not entirely clear on whether this mismatch is a problem in
practice. Compilers are certainly free to make enums smaller than "int"
if they don't need the bits, but I suspect that they have to promote
back to int for function calls (though I didn't dig in the standard, and
I won't be surprised if I'm simply wrong and the real-world impact would
depend on the ABI).
Regardless, switching it back to enum is obviously the right thing to do
here; the switch to "int" was simply a mistake.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Typo/grammofix to documentation added during this cycle.
* tl/notes-separator:
notes doc: tidy up `--no-stripspace` paragraph
notes doc: split up run-on sentences
With `--stdin`, we read *from* standard input, not *for*.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commit 00bf685975 (show-ref doc: update for internal consistency,
2023-05-19) switched from double quotes to backticks around our {caret}
macro, we started rendering "{caret}" literally. Fix this by replacing
by a "^" character.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Where we document the `--no-stripspace` option, remove a superfluous
"For" to fix the grammar. Mark option names and command names using
`backticks` to set them in monospace.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commit c4e2aa7d45 (notes.c: introduce "--[no-]stripspace" option,
2023-05-27) mentioned the new `--no-stripspace` in the documentation for
`-m` and `-F`, it created run-on sentences. It also used slightly
different language in the two sections for no apparent reason. Split the
sentences in two to improve readability, and while touching the two
sites, make them more similar.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of github.com:git/git: (34 commits)
Git 2.42-rc2
t4053: avoid writing to unopened pipe
t4053: avoid race when killing background processes
Git 2.42-rc1
git maintenance: avoid console window in scheduled tasks on Windows
win32: add a helper to run `git.exe` without a foreground window
t9001: remove excessive GIT_SEND_EMAIL_NOTTY=1
mv: handle lstat() failure correctly
parse-options: disallow negating OPTION_SET_INT 0
repack: free geometry struct
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
t0040: declare non-tab indentation to be okay in this script
advice: handle "rebase" in error_resolve_conflict()
A few more topics before -rc1
mailmap: change primary address for Glen Choo
gitignore: ignore clangd .cache directory
docs: update when `git bisect visualize` uses `gitk`
compat/mingw: implement a native locate_in_PATH()
run-command: conditionally define locate_in_PATH()
...
Windows updates.
* ds/maintenance-on-windows-fix:
git maintenance: avoid console window in scheduled tasks on Windows
win32: add a helper to run `git.exe` without a foreground window
Adjust to newer Term::ReadLine to prevent it from breaking
the interactive prompt code in send-email.
* jk/send-email-with-new-readline:
send-email: avoid creating more than one Term::ReadLine object
send-email: drop FakeTerm hack
Developer support to detect meaningless combination of options.
* rs/parse-opt-forbid-set-int-0-without-noneg:
parse-options: disallow negating OPTION_SET_INT 0
This fixes an occasional hang I see when running t4053 with
--verbose-log using dash.
Commit 1e3f26542a (diff --no-index: support reading from named pipes,
2023-07-05) added a test that "diff --no-index" will complain when
comparing a named pipe and a directory. The minimum we need to test this
is to mkfifo the pipe, and then run "git diff --no-index pipe some_dir".
But the test does one thing more: it spawns a background shell process
that opens the pipe for writing, like this:
{
(>pipe) &
} &&
This extra writer _could_ be useful if Git misbehaves and tries to open
the pipe for reading. Without the writer, Git would block indefinitely
and the test would never end. But since we do not have such a bug, Git
does not open the pipe and it is the writing process which will block
indefinitely, since there are no readers. The test addresses this by
running "kill $!" in a test_when_finished block. Since the writer should
be blocking forever, this kill command will reliably find it waiting.
However, this seems to be somewhat racy, in that the writing process
sometimes hangs around even after the "kill". In a normal run of the
test script without options, this doesn't have any effect; the
main test script completes anyway. But with --verbose-log, we spawn a
"tee" process that reads the script output, and it won't end until all
descriptors pointing to its input pipe are closed. And the background
process that is hanging around still has its stderr, etc, pointed into
that pipe.
You can reproduce the situation like this:
cd t
./t4053-diff-no-index.sh --verbose-log --stress
Let that run for a few minutes, and then you'll find that some of the
runs have hung. For example, at 11:53, I ran:
$ ps xk start o pid,start,command | grep tee | head
713459 11:48:06 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-9.out
713527 11:48:06 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-15.out
719434 11:48:07 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-1.out
728117 11:48:08 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-5.out
738738 11:48:09 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-31.out
739457 11:48:09 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-27.out
744432 11:48:10 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-21.out
744471 11:48:10 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-29.out
761961 11:48:12 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-0.out
812299 11:48:19 tee -a /home/peff/compile/git/t/test-results/t4053-diff-no-index.stress-8.out
All of these have been hung for several minutes. We can investigate one
and see that it's waiting to get EOF on its input:
$ strace -p 713459
strace: Process 713459 attached
read(0,
^C
Who else has that descriptor open?
$ lsof -a -p 713459 -d 0 +E
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
tee 713459 peff 0r FIFO 0,13 0t0 3943636 pipe 719203,sh,5w 719203,sh,7w 719203,sh,12w 719203,sh,13w
sh 719203 peff 5w FIFO 0,13 0t0 3943636 pipe 713459,tee,0r 719203,sh,7w 719203,sh,12w 719203,sh,13w
sh 719203 peff 7w FIFO 0,13 0t0 3943636 pipe 713459,tee,0r 719203,sh,5w 719203,sh,12w 719203,sh,13w
sh 719203 peff 12w FIFO 0,13 0t0 3943636 pipe 713459,tee,0r 719203,sh,5w 719203,sh,7w 719203,sh,13w
sh 719203 peff 13w FIFO 0,13 0t0 3943636 pipe 713459,tee,0r 719203,sh,5w 719203,sh,7w 719203,sh,12w
It's a shell, presumably a subshell spawned by the main script. Though
it may seem odd, having the same descriptor open several times is not
unreasonable (they're all basically the original stdout/stderr of the
script that has been copied). And they should all close when the process
exits. So what's it doing? Curiously, it will exit as soon as we strace
it:
$ strace -s 64 -p 719203
strace: Process 719203 attached
openat(AT_FDCWD, "pipe", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 ENOENT (No such file or directory)
write(2, "./t4053-diff-no-index.sh: 7: eval: ", 35) = 35
write(2, "cannot create pipe: Directory nonexistent", 41) = 41
write(2, "\n", 1) = 1
exit_group(2) = ?
+++ exited with 2 +++
I think what happens is this:
- it is blocking in the openat() call for the pipe, as we expect (so
this is definitely the backgrounded subshell mentioned above)
- strace sends signals (probably STOP/CONT); those cause the kernel to
stop blocking, but libc will restart the system call automatically
- by this time, the "pipe" fifo is gone, so we'll actually try to
create a regular file. But of course the surrounding directory is
gone, too! So we get ENOENT, and then exit as normal.
So the blocking is something we expect to happen. But what we didn't
expect is for the process to still exist at all! It should have been
killed earlier when the parent process called "kill", but it wasn't. And
we can't catch the race at this point, because it happened much earlier.
One can guess, though, that there is some race with the shell setting up
the signal handling in the backgrounded subshell, and possibly blocking
or ignoring signals at the time that the "kill" is received. Curiously,
the race does not seem to happen if I use "bash" instead of "dash", so
presumably bash's setup here is more atomic.
One fix might be to try killing the subshell more aggressively, either
using SIGKILL, or looping on kill/wait. But that seems complex and
likely to introduce new problems/races. Instead, we can observe that the
writer is not needed at all. Git will notice the pipe via stat() before
it is ever opened. So we can simply drop the writer subshell entirely.
If we ever changed Git to open the path and fstat() it, this would
result in the test hanging. But we're not likely to do that. After all,
we have to stat() paths to see if they are openable at all (e.g., it
could be a directory), so this seems like a low risk. And anybody who
does make such a change will immediately see the issue, as Git would
hang consistently.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test 'diff --no-index reads from pipes' starts a couple of
background processes that write to the pipes that are passed to "diff
--no-index". If the test passes then we expect these processes to exit
as all their output will have been read. However if the test fails
then we want to make sure they do not hang about on the users machine
and the test remembers they should be killed by calling
test_when_finished "! kill $!"
after each background process is created. Unfortunately there is a
race where test_when_finished may run before the background process
exits even when all its output has been read resulting in the kill
command succeeding which causes the test to fail. Fix this by ignoring
the exit status of the kill command. If the diff is successful we
could instead wait for the background process to exit and check their
status but that feels like it is testing the platform's printf
implementation rather than git's code.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase -i" with a series of squash/fixup, when one of the
steps stopped in conflicts and ended up getting skipped, did not
handle the accumulated commit log messages, which has been
corrected.
* pw/rebase-skip-commit-message-fix:
rebase --skip: fix commit message clean up when skipping squash
"git bisect visualize" stopped running "gitk" on Git for Windows
when the command was reimplemented in C around Git 2.34 timeframe.
This has been corrected.
* ma/locate-in-path-for-windows:
docs: update when `git bisect visualize` uses `gitk`
compat/mingw: implement a native locate_in_PATH()
run-command: conditionally define locate_in_PATH()
Exclude "." from the set of characters to be removed from the
beginning and the end of the human-readable name.
* bc/ident-dot-is-no-longer-crud-letter:
ident: don't consider '.' a crud
Adjust to OpenSSL 3+, which deprecates its SHA-1 functions based on
its traditional API, by using its EVP API instead.
* ew/hash-with-openssl-evp:
avoid SHA-1 functions deprecated in OpenSSL 3+
sha256: avoid functions deprecated in OpenSSL 3+
We just introduced a helper to avoid showing a console window when the
scheduled task runs `git.exe`. Let's actually use it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, there are two kinds of executables, console ones and
non-console ones. Git's executables are all console ones.
When launching the former e.g. in a scheduled task, a CMD window pops
up. This is not what we want for the tasks installed via the `git
maintenance` command.
To work around this, let's introduce `headless-git.exe`, which is a
non-console program that does _not_ pop up any window. All it does is to
re-launch `git.exe`, suppressing that console window, passing through
all command-line arguments as-are.
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Helped-by: Yuyi Wang <Strawberry_Str@hotmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was added by 3ece9bf0f9 (send-email: clear the $message_id after
validation, 2023-05-17) for no apparent reason, as this is required only
in cases when git's stdin is (must be) redirected, which isn't the case
here.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When moving a directory onto another with `git mv` various checks are
performed. One of of these validates that the destination is not existing.
When calling `lstat` on the destination path and it fails as the path
doesn't exist, some environments seem to overwrite the passed in
`stat` memory nonetheless (I observed this issue on debian 12 of x86_64,
running on OrbStack on ARM, emulated with Rosetta).
This would affect the code that followed as it would still acccess a now
modified `st` structure, which now seems to contain uninitialized memory.
`S_ISDIR(st_dir_mode)` would then typically return false causing the code
to run into a bad case.
The fix avoids overwriting the existing `st` structure, providing an
alternative that exists only for that purpose.
Note that this patch minimizes complexity instead of stack-frame size.
Signed-off-by: Sebastian Thiel <sebastian.thiel@icloud.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An option of type OPTION_SET_INT can be defined to set its variable to
zero. It's negated variant will do the same, though, which is
confusing. Several such options were fixed by disabling negation,
changing the value to set or using a different option type:
991c552916 (ls-tree: fix --no-full-name, 2023-07-18)
e12cb98e1e (branch: reject "--no-all" and "--no-remotes" early, 2023-07-18)
68cbb20e73 (show-branch: reject --[no-](topo|date)-order, 2023-07-19)
3821eb6c3d (reset: reject --no-(mixed|soft|hard|merge|keep) option, 2023-07-19)
36f76d2a25 (pack-objects: fix --no-quiet, 2023-07-21)
3a5f308741 (pack-objects: fix --no-keep-true-parents, 2023-07-21)
c95ae3ff9c (describe: fix --no-exact-match, 2023-07-21)
d089a06421 (bundle: use OPT_PASSTHRU_ARGV, 2023-07-29)
Check for such options that allow negation in parse_options_check() and
report them to find future cases quicker.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the program is ending, we call clear_pack_geometry() to free any
resources in the pack_geometry struct. But the struct itself is
allocated on the heap, and leak-checkers will complain about the
resulting small leak.
This one was marked by Coverity as a "new" leak, though it has existed
since 0fabafd0b9 (builtin/repack.c: add '--geometric' option,
2021-02-22). This might be because recent unrelated changes in the file
confused it about what is new and what is not. But regardless, it is
worth addressing.
We can fix it easily by free-ing the struct. We'll convert our "clear"
function to "free", since the allocation happens in the matching init()
function (though since there is only one call to each, and the struct is
local to this file, it's mostly academic).
Another option would be to put the struct on the stack rather than the
heap. However, this gets tricky, as we check the pointer against NULL in
several places to decide whether we're in geometric mode.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Every time git-send-email calls its ask() function to prompt the user,
we call term(), which instantiates a new Term::ReadLine object. But in
v1.46 of Term::ReadLine::Gnu (which provides the Term::ReadLine
interface on some platforms), its constructor refuses to create a second
instance[1]. So on systems with that version of the module, most
git-send-email instances will fail (as we usually prompt for both "to"
and "in-reply-to" unless the user provided them on the command line).
We can fix this by keeping a single instance variable and returning it
for each call to term(). In perl 5.10 and up, we could do that with a
"state" variable. But since we only require 5.008, we'll do it the
old-fashioned way, with a lexical "my" in its own scope.
Note that the tests in t9001 detect this problem as-is, since the
failure mode is for the program to die. But let's also beef up the
"Prompting works" test to check that it correctly handles multiple
inputs (if we had chosen to keep our FakeTerm hack in the previous
commit, then the failure mode would be incorrectly ignoring prompts
after the first).
[1] For discussion of why multiple instances are forbidden, see:
https://github.com/hirooih/perl-trg/issues/16
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back in 280242d1cc (send-email: do not barf when Term::ReadLine does not
like your terminal, 2006-07-02), we added a fallback for when
Term::ReadLine's constructor failed: we'd have a FakeTerm object
instead, which would then die if anybody actually tried to call
readline() on it. Since we instantiated the $term variable at program
startup, we needed this workaround to let the program run in modes when
we did not prompt the user.
But later, in f4dc9432fd (send-email: lazily load modules for a big
speedup, 2021-05-28), we started loading Term::ReadLine lazily only when
ask() is called. So at that point we know we're trying to prompt the
user, and we can just die if ReadLine instantiation fails, rather than
making this fake object to lazily delay showing the error.
This should be OK even if there is no tty (e.g., we're in a cron job),
because Term::ReadLine will return a stub object in that case whose "IN"
and "OUT" functions return undef. And since 5906f54e47 (send-email:
don't attempt to prompt if tty is closed, 2009-03-31), we check for that
case and skip prompting.
And we can be sure that FakeTerm was not kicking in for such a
situation, because it has actually been broken since that commit! It
does not define "IN" or "OUT" methods, so perl would barf with an error.
If FakeTerm was in use, we were neither honoring what 5906f54e47 tried
to do, nor producing the readable message that 280242d1cc intended.
So we're better off just dropping FakeTerm entirely, and letting the
error reported by constructing Term::ReadLine through.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By necessity, this script needs to verify that certain Git output
matches expectations, including text indented with spaces instead of
tabs.
Most recently, such a check was introduced in 448abbba63 (short help:
allow multi-line opthelp, 2023-07-18) which is reported by `git diff
--check 448abbba6347^!` as having whitespace issues.
Let's not complain about this because it is intentional.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes sure that we get a properly translated message rather than
inserting the command (which we failed to translate) into a generic
fallback message.
The function is called indirectly via die_resolve_conflict() with fixed
strings, and directly with the string obtained via action_name(), which
in turn returns a string from a fixed set. Hence we know that the now
covered set of strings is exhausitive, and will therefore BUG() out when
encountering an unexpected string. We also know that all covered strings
are actually used.
Arguably, the above suggests that it would be cleaner to pass the
command as an enum in the first place, but that's left for another time.
Signed-off-by: Oswald Buddenhagen <oswald.buddenhagen@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leakfixes.
* ew/sha256-gcrypt-leak-fixes:
sha256/gcrypt: die on gcry_md_open failures
sha256/gcrypt: fix memory leak with SHA-256 repos
sha256/gcrypt: fix build with SANITIZE=leak
Tone down the warning on SHA-256 repositories being an experimental
curiosity. We do not have support for them to interoperate with
traditional SHA-1 repositories, but at this point, we do not plan
to make breaking changes to SHA-256 repositories and there is no
longer need for such a strongly phrased warning.
* am/doc-sha256:
doc: sha256 is no longer experimental
In at least some versions of clangd, including version 15 in Ubuntu
23.04, a directory, .cache, is written in the root of the repository
with index information about the files in the repository. Since clangd
is the most common language server protocol (LSP) implementation for C,
and we already support it using the GENERATE_COMPILATION_DATABASE flags
to make it functional, it's likely many users are using or will want to
use it.
As a result, ignore the ".cache" directory to help avoid users
accidentally committing the data.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git branch -f X" to repoint the branch X said that X was "checked
out" in another worktree, even when branch X was not and instead
being bisected or rebased. The message was reworded to say the
branch was "in use".
* jc/branch-in-use-error-message:
branch: update the message to refuse touching a branch in-use
"git blame --contents=file" has been taught to work in a bare
repository.
* hy/blame-in-bare-with-contents:
blame: allow --contents to work with bare repo
Command line parser fix, and a small parse-options API update.
* jc/parse-options-short-help:
short help: allow a gap smaller than USAGE_GAP
remote: simplify "remote add --tags" help text
short help: allow multi-line opthelp
Clarify how to pick a starting point for a new topic in the
SubmittingPatches document.
* la/doc-choose-starting-point-fixup:
SubmittingPatches: use of older maintenance tracks is an exception
SubmittingPatches: explain why 'next' and above are inappropriate base
SubmittingPatches: choice of base for fixing an older maintenance track
Rewrite the description of giving a custom command to the
submodule.<name>.update configuration variable.
* pv/doc-submodule-update-settings:
doc: highlight that .gitmodules does not support !command
Fix tests with unportable regex patterns.
* ja/worktree-orphan-fix:
t2400: rewrite regex to avoid unintentional PCRE
builtin/worktree.c: convert tab in advice to space
t2400: drop no-op `--sq` from rev-parse call
The implementation of "get_sha1_hex()" that reads a hexadecimal
string that spells a full object name has been extended to cope
with any hash function used in the repository, but the "sha1" in
its name survived. Rename it to get_hash_hex(), a name that is
more consistent within its friends like get_hash_hex_algop().
* jc/retire-get-sha1-hex:
hex: retire get_sha1_hex()
Clarify how to choose the starting point for a new topic in
developer guidance document.
* la/doc-choose-starting-point:
SubmittingPatches: simplify guidance for choosing a starting point
SubmittingPatches: emphasize need to communicate non-default starting points
SubmittingPatches: de-emphasize branches as starting points
SubmittingPatches: discuss subsystems separately from git.git
SubmittingPatches: reword awkward phrasing
This check has involved more environment variables than just `DISPLAY` since
508e84a790 (bisect view: check for MinGW32 and MacOSX in addition to X11,
2008-02-14), so let's update the documentation accordingly.
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
since 5e1f28d (bisect--helper: reimplement `bisect_visualize()` shell
function in C, 2021-09-13) `git bisect visualize` uses exists_in_PATH()
to check wether it should call `gitk`, but exists_in_PATH() relies on
locate_in_PATH() which currently only understands POSIX-ish PATH variables
(a list of paths, separated by colons) on native Windows executables
we encounter Windows PATH variables (a list of paths that often contain
drive letters (and thus colons), separated by semicolons). Luckily we do
already have a function that can lookup executables on windows PATHs:
path_lookup(). Implement a small replacement for the existing
locate_in_PATH() based on path_lookup().
Reported-by: Louis Strous <Louis.Strous@intellimagic.com>
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit doesn't change any behaviour by itself, but allows us to easily
define compat replacements for locate_in_PATH(). It prepares us for the next
commit that adds a native Windows implementation of locate_in_PATH().
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During a series of "fixup" and/or "squash" commands, the interactive
rebase accumulates a commit message from all the commits that are being
squashed together. If one of the commits has conflicts when it is picked
and the user chooses to skip that commit then we need to remove that
commit's message from accumulated messages. To do this 15ef69314d
(rebase --skip: clean up commit message after a failed fixup/squash,
2018-04-27) updated commit_staged_changes() to reset the accumulated
message to the commit message of HEAD (which does not contain the
message from the skipped commit) when the last command was "fixup" or
"squash" and there are no staged changes. Unfortunately the code to do
this contains two bugs.
(1) If parse_head() fails we pass an invalid pointer to
unuse_commit_buffer().
(2) The reconstructed message uses the entire commit buffer from HEAD
including the headers, rather than just the commit message.
The first issue is fixed by splitting up the "if" condition into several
statements each with its own error handling. The second issue is fixed
by finding the start of the commit message within the commit buffer
using find_commit_subject().
The existing test added by 15ef69314d is modified to show the effect of
this bug. The bug is triggered when skipping the first command in the
chain (as the test does before this commit) but the effect is hidden
because opts->current_fixup_count is set to zero which leads
update_squash_messages() to recreate the squash message file from
scratch overwriting the bad message created by
commit_staged_changes(). The test is also updated to explicitly check
the commit messages rather than relying on grep to ensure they do not
contain any stray commit headers.
To check the commit message the function test_commit_message() is moved
from t3437-rebase-fixup-options.sh to test-lib.sh. As the function is
now publicly available it is updated to provide better error detection
and avoid overwriting the commonly used files "actual" and "expect".
Support for reading the expected commit message from stdin is also
added.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we process a user's name (as in user.name), we strip all
leading and trailing crud from it. Right now, we consider a dot
a crud character, and strip it off.
However, this is unsuitable for many personal names because humans
frequently have abbreviated suffixes, such as "Jr." or "Sr." at the end
of their names, and this corrupts them. Some other users may wish to
use an abbreviated name or initial, which will pose a problem especially
in cultures that write the family name first, followed by the personal
name.
Since the current approach causes lots of practical problems, let's
avoid it by no longer considering a dot to be crud.
Note that "." in the name forces the entire name to be quoted to
please mailers, but stripping "." only at the beginning and the end
does not help a name with "." in the middle (like "brian m. carlson")
so this change will not make it much worse. A name like "Given
Family, Jr." that did not have to be quoted now would need to be, in
order to be placed on the e-mail headers, though.
This is based on a weather-balloon patch by Jeff King sent in Aug 2021
https://lore.kernel.org/git/YSKm8Q8nyTavQaox@coredump.intra.peff.net/
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git branch --list --format=<format>" and friends are taught
a new "%(describe)" placeholder.
* ks/ref-filter-describe:
ref-filter: add new "describe" atom
ref-filter: add multiple-option parsing functions
When the user edits "rebase -i" todo file so that it starts with a
"fixup", which would make it invalid, the command truncated the
rest of the file before giving an error and returning the control
back to the user. Stop truncating to make it easier to correct
such a malformed todo file.
* ah/sequencer-rewrite-todo-fix:
sequencer: finish parsing the todo list despite an invalid first line
Instead of inventing a custom counter variables for debugging,
use existing trace2 facility in the fsync customization codepath.
* bb/use-trace2-counters-for-fsync-stats:
wrapper: use trace2 counters to collect fsync stats
"./configure --with-expat=no" did not work as a way to refuse use
of the expat library on a system with the library installed, which
has been corrected.
* ah/autoconf-fixes:
configure.ac: always save NO_ICONV to config.status
configure.ac: don't overwrite NO_CURL option
configure.ac: don't overwrite NO_EXPAT option
Code simplification.
* jc/tree-walk-drop-base-offset:
tree-walk: drop unused base_offset from do_match()
tree-walk: lose base_offset that is never used in tree_entry_interesting
OpenSSL 3+ deprecates the SHA1_Init, SHA1_Update, and SHA1_Final
functions, leading to errors when building with `DEVELOPER=1'.
Use the newer EVP_* API with OpenSSL 3+ (only) despite being more
error-prone and less efficient due to heap allocations.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
OpenSSL 3+ deprecates the SHA256_Init, SHA256_Update, and SHA256_Final
functions, leading to errors when building with `DEVELOPER=1'.
Use the newer EVP_* API with OpenSSL 3+ despite being more
error-prone and less efficient due to heap allocations.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove scary wording that basically stops people using sha256
repositories not because of interoperability issues with sha1
repositories, but from fear that their work will suddenly become
incompatible in some future version of git.
We should be clear that currently sha256 repositories will not work with
sha1 repositories but stop the scary words.
Signed-off-by: Adam Majer <adamm@zombino.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`gcry_md_open' allocates memory and must (like all allocation
functions) be checked for failure.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`gcry_md_open' needs to be paired with `gcry_md_close' to ensure
resources are released. Since our internal APIs don't have
separate close/release callbacks, sticking it into the finalization
callback seems appropriate.
Building with SANITIZE=leak and running `git fsck' on a SHA-256
repository no longer reports leaks.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Non-static functions cause `undefined reference' errors when
building with `SANITIZE=leak' due to the lack of prototypes.
Mark all these functions as `static inline' as we do in
sha256/nettle.h to avoid the need to maintain prototypes.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git bundle" passes the progress control options to "git pack-objects"
by parsing and then recreating them explicitly. Simplify that process
by using OPT_PASSTHRU_ARGV instead.
This also fixes --no-quiet, which has been doing the same as --quiet
since its introduction by 79862b6b77 (bundle-create: progress output
control, 2019-11-10) because it had been defined using OPT_SET_INT with
a value of 0, which sets 0 when negated as well.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Finding mistakes in and improving your own patches is a good idea,
but doing so too quickly is being inconsiderate to reviewers who
have just seen the initial iteration and taking their time to review
it. Encourage new developers to perform such a self review before
they send out their patches, not after. After sending a patch that
they immediately found mistakes in, they are welcome to comment on
them, mentioning what and how they plan to improve them in an
updated version, before sending out their updates.
Helped-by: Torsten Bögershausen <tboegi@web.de>
Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Command line parser fixes.
* jc/parse-options-show-branch:
show-branch: reject --[no-](topo|date)-order
show-branch: --no-sparse should give dense output
While we could technically fix each and every bug on top of the
commit that introduced it, it is not necessarily practical. For
trivial and low-value bugfixes, it often is simpler and sufficient
to just fix it in the current maintenance track, leaving the bug
unfixed in the older maintenance tracks.
Demote the "use older maintenance track to fix old bugs" as a side
note, and explain that the choice is used only in exceptional cases.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'next' branch is primarily meant to be a testing ground to make
sure that topics that are reasonably well done work well together.
Building a new work on it would mean everything that was already in
'next' must have graduated to 'master' before the new work can also
be merged to 'master', and that is why we do not encourage basing
new work on 'next'.
Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace all cases of `\s` with ` ` as it is not part of POSIX BRE or ERE
and therefore not all versions of grep handle it.
For the same reason all cases of `\S` are replaced with `[^ ]`. It is
not an exact replacement but it is close enough for this use case.
Also, do not write `\+` in BRE and expect it to mean 1 or more;
it is a GNU extension that may not work everywhere.
Remove `.*` from the end of a pattern that is not right-anchored.
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When working on an high-value bugfix that must be given to ancient
maintenance tracks, a starting point that is older than `maint` may
have to be chosen.
Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bugfix for fc01a5d2 (submodule update documentation: don't repeat
ourselves, 2016-12-27).
The `custom command` and `none` options are described as sharing the
same limitations, but one is allowed in .gitmodules and the other is
not.
Rewrite the description for custom commands to be more precise,
and make it easier for readers to notice that custom commands cannot
be used in the .gitmodules file.
Signed-off-by: Petar Vutov <pvutov@imap.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git tag --list --points-at X" showed tags that directly refers to
object X, but did not list a tag that points at such a tag, which
has been corrected.
* jk/nested-points-at:
ref-filter: simplify return type of match_points_at
ref-filter: avoid parsing non-tags in match_points_at()
ref-filter: avoid parsing tagged objects in match_points_at()
ref-filter: handle nested tags in --points-at option
Mark-up unused parameters in the code so that we can eventually
enable -Wunused-parameter by default.
* jk/unused-parameter:
t/helper: mark unused callback void data parameters
tag: mark unused parameters in each_tag_name_fn callbacks
rev-parse: mark unused parameter in for_each_abbrev callback
replace: mark unused parameter in each_mergetag_fn callback
replace: mark unused parameter in ref callback
merge-tree: mark unused parameter in traverse callback
fsck: mark unused parameters in various fsck callbacks
revisions: drop unused "opt" parameter in "tweak" callbacks
count-objects: mark unused parameter in alternates callback
am: mark unused keep_cr parameters
http-push: mark unused parameter in xml callback
http: mark unused parameters in curl callbacks
do_for_each_ref_helper(): mark unused repository parameter
test-ref-store: drop unimplemented reflog-expire command
Names of MinGW header files are spelled in mixed case in some
source files, but the build host can be using case sensitive
filesystem with header files with their name spelled in all
lowercase.
* mh/mingw-case-sensitive-build:
mingw: use lowercase includes for some Windows headers
Various offset computation in the code that accesses the packfiles
and other data in the object layer has been hardened against
arithmetic overflow, especially on 32-bit systems.
* tb/object-access-overflow-protection:
commit-graph.c: prevent overflow in `verify_commit_graph()`
commit-graph.c: prevent overflow in `write_commit_graph()`
commit-graph.c: prevent overflow in `merge_commit_graph()`
commit-graph.c: prevent overflow in `split_graph_merge_strategy()`
commit-graph.c: prevent overflow in `load_tree_for_commit()`
commit-graph.c: prevent overflow in `fill_commit_in_graph()`
commit-graph.c: prevent overflow in `fill_commit_graph_info()`
commit-graph.c: prevent overflow in `load_oid_from_graph()`
commit-graph.c: prevent overflow in add_graph_to_chain()
commit-graph.c: prevent overflow in `write_commit_graph_file()`
pack-bitmap.c: ensure that eindex lookups don't overflow
midx.c: prevent overflow in `fill_included_packs_batch()`
midx.c: prevent overflow in `write_midx_internal()`
midx.c: store `nr`, `alloc` variables as `size_t`'s
midx.c: prevent overflow in `nth_midxed_offset()`
midx.c: prevent overflow in `nth_midxed_object_oid()`
midx.c: use `size_t`'s for fanout nr and alloc
packfile.c: use checked arithmetic in `nth_packed_object_offset()`
packfile.c: prevent overflow in `load_idx()`
packfile.c: prevent overflow in `nth_packed_object_id()`
Help newbies by suggesting that there are cases where force-pushing
is a valid and sensible thing to update a branch at a remote
repository, rather than reconciling with merge/rebase.
* ah/advise-force-pushing:
push: don't imply that integration is always required before pushing
remote: don't imply that integration is always required before pushing
wt-status: don't show divergence advice when committing
The naming convention around get_sha1_hex() and its friends is
awkward these days, after "struct object_id" was introduced.
There are three public functions around this area:
* get_sha1_hex() - use the implied the_hash_algo, fill uchar *
* get_oid_hex() - use the implied the_hash_algo, fill oid *
* get_oid_hex_algop() - use the passed algop, fill oid *
Between the latter two, the "_algop" suffix signals whether the
the_hash_algo is used as the implied algorithm or the caller should
pass an algorithm explicitly. That is very much understandable and
is a good convention.
Between the former two, however, the "SHA1" vs "OID" in the names
differentiate in what type of variable the result is stored.
We could argue that it makes sense to use "SHA1" to mean "flat byte
buffer" to honor the historical practice in the days before "struct
object_id" was invented, but the natural fourth friend of the above
group would take an algop and fill a flat byte buffer, and it would
be strange to name it get_sha1_hex_algop(). Do we use the passed in
algo, or are we limited to SHA-1 ;-)?
In fact, such a function exists, albeit as a private helper function
used by the implementation of these functions, and is named a lot
more sensibly: get_hash_hex_algop().
Correct the misnomer of get_sha1_hex() and use "hash", instead of
"sha1", as "flat byte buffer that stores binary (as opposed to
hexadecimal) representation of the hash".
The four (2x2) friends now become:
* get_hash_hex() - use the implied the_hash_algo, fill uchar *
* get_oid_hex() - use the implied the_hash_algo, fill oid *
* get_hash_hex_algop() - use the passed algop, fill uchar *
* get_oid_hex_algop() - use the passed algop, fill oid *
As there are only two remaining calls to get_sha1_hex() in the
codebase right now, the blast radious of this change is fairly
small.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a previous commit, we introduced a sub-shell in the implementation of
`graph_git_behavior()`, in order to allow us to pass `-C "$DIR"`
directly to the git processes spawned by `graph_git_two_modes()`.
Now that its callers are always operating from the "$TRASH_DIRECTORY"
instead of one of its sub-directories, we can drop the inner sub-shell,
as it is no longer required.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as the last commit, avoid top-level directory
changes in the last remaining commit-graph related test, t5328.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid changing the current working directory from outside of a sub-shell
during the tests in t5318.
Each test has mostly straightforward changes, either:
- Removing the top-level `cd "$TRASH_DIRECTORY/full"`, which is
unnecessary after ensuring that other tests don't change their
working directory outside of a sub-shell.
- Changing any Git invocations which want to be in a sub-directory by
either (a) adding a "-C $DIR" argument, or (b) moving the whole test
into a sub-shell.
While we're here, remove any explicit "git config core.commitGraph true"
invocations which were designed to enable use of the commit-graph. These
are unnecessary following 31b1de6a09 (commit-graph: turn on
commit-graph by default, 2019-08-13).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `graph_git_behavior()` helper asserts that a number of common Git
operations (such as `git log --oneline`, `git log --topo-order`, etc.)
produce identical output regardless of whether or not a commit-graph is
in use.
This helper takes as its second argument the location (relative to the
`$TRASH_DIRECTORY`) of the Git repostiory under test. In order to run
each of its commands within that repository, it first changes into that
directory, without the use of a sub-shell.
This pollutes future tests which expect to be run in the top-level
`$TRASH_DIRECTORY` as usual. We could wrap `graph_git_behavior()` in a
sub-shell, like:
graph_git_behavior() {
# ...
(
cd "$TRASH_DIRECTORY/$DIR" &&
graph_git_two_modesl
)
}
, but since we're invoking git directly, we can pass along a "-C $DIR"
when "$DIR" is non-empty.
Note, however, that until the remaining callers are cleaned up to avoid
changing working directories outside of a sub-shell, that we need to
ensure that we are operating in the top-level $TRASH_DIRECTORY. The
inner-subshell will go away in a future commit once it is no longer
necessary.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `graph_read_expect()` function is used to ensure that the output of
the "read-graph" test helper matches certain parameters (e.g., how many
commits are in the graph, which chunks were written, etc.).
It expects the Git repository being tested to be at the current working
directory. However, a handful of t5318 tests use different repositories
stored in sub-directories. To work around this, several tests in t5318
change into the relevant repository outside of a sub-shell, altering the
context for the rest of the suite.
Prepare to remove these globally-scoped directory changes by teaching
`graph_read_expect()` to take an optional "-C dir" to specify where the
repository containing the commit-graph being tested is.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Duplicate the logic of %(describe) and friends from pretty to
ref-filter. In the future, this change helps in unifying both the
formats as ref-filter will be able to do everything that pretty is doing
and we can have a single interface.
The new atom "describe" and its friends are equivalent to the existing
pretty formats with the same name.
Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functions
match_placeholder_arg_value()
match_placeholder_bool_arg()
were added in pretty 4f732e0fd7 (pretty: allow %(trailers) options
with explicit value, 2019-01-29) to parse multiple options in an
argument to --pretty. For example,
git log --pretty="%(trailers:key=Signed-Off-By,separator=%x2C )"
will output all the trailers matching the key and seperates them by
a comma followed by a space per commit.
Add similar functions,
match_atom_arg_value()
match_atom_bool_arg()
in ref-filter.
There is no atom yet that can use these functions in ref-filter, but we
are going to add a new %(describe) atom in a subsequent commit where we
parse options like tags=<bool-value> or match=<pattern> given to it.
Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before the todo list is edited it is rewritten to shorten the OIDs of
the commits being picked and to append advice about editing the list.
The exact advice depends on whether the todo list is being edited for
the first time or not. After the todo list has been edited it is
rewritten to lengthen the OIDs of the commits being picked and to remove
the advice. If the edited list cannot be parsed then this last step is
skipped.
Prior to db81e50724 (rebase-interactive: use todo_list_write_to_file()
in edit_todo_list(), 2019-03-05) if the existing todo list could not be
parsed then the initial rewrite was skipped as well. This had the
unfortunate consequence that if the list could not be parsed after the
initial edit the advice given to the user was wrong when they re-edited
the list. This change relied on todo_list_parse_insn_buffer() returning
the whole todo list even when it cannot be parsed. Unfortunately if the
list starts with a "fixup" command then it will be truncated and the
remaining lines are lost. Fix this by continuing to parse after an
initial "fixup" commit as we do when we see any other invalid line.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
[jc: removed an apparently unneeded subshell around the test body]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git branch -f" command can refuse to force-update a branch that
is used by another worktree. The original rationale for this
behaviour was that updating a branch that is checked out in another
worktree, without making a matching change to the index and the
working tree files in that worktree, will lead to a very confused
user. "git diff HEAD" will no longer give a useful patch, because
HEAD is a commit unrelated to what the index and the working tree in
the worktree were based on, for example.
These days, the same mechanism also protects branches that are being
rebased or bisected, and the same machanism is expected to be the
right place to add more checks, when we decide to protect branches
undergoing other kinds of operations. We however forgot to rethink
the messaging, which originally said that we are refusing to touch
the branch because it is "checked out" elsewhere, when d2ba271a
(branch: check for bisects and rebases, 2022-06-14) started to
protect branches that are being rebased or bisected.
The spirit of the check has always been that we do not want to
disrupt the use of the same branch in other worktrees. Let's reword
the message slightly to say that the branch is "used by" another
worktree, instead of "checked out".
We could teach the branch.c:prepare_checked_out_branches() function
to remember why it decided that a particular branch needs protecting
(i.e. was it because it was checked out? being bisected? something
else?) in addition to which worktree the branch was in use, and use
that in the error message to say "you cannot force update this
branch because it is being bisected in the worktree X", etc., but it
is dubious that such extra complexity is worth it. The message
already tells which directory the worktree in question is, and it
should be just a "chdir" away for the user to find out what state it
is in, if the user felt curious enough. So let's not go there yet.
Helped-by: Josh Sref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enumerating refs in the packed-refs file, while excluding refs that
match certain patterns, has been optimized.
* tb/refs-exclusion-and-packed-refs:
ls-refs.c: avoid enumerating hidden refs where possible
upload-pack.c: avoid enumerating hidden refs where possible
builtin/receive-pack.c: avoid enumerating hidden references
refs.h: implement `hidden_refs_to_excludes()`
refs.h: let `for_each_namespaced_ref()` take excluded patterns
revision.h: store hidden refs in a `strvec`
refs/packed-backend.c: add trace2 counters for jump list
refs/packed-backend.c: implement jump lists to avoid excluded pattern(s)
refs/packed-backend.c: refactor `find_reference_location()`
refs: plumb `exclude_patterns` argument throughout
builtin/for-each-ref.c: add `--exclude` option
ref-filter.c: parameterize match functions over patterns
ref-filter: add `ref_filter_clear()`
ref-filter: clear reachable list pointers after freeing
ref-filter.h: provide `REF_FILTER_INIT`
refs.c: rename `ref_filter`
Since 99fb6e04cb (pack-objects: convert to use parse_options(),
2012-02-01) git pack-objects has accepted the option --no-quiet, but it
does the same as --quiet. That's because it's defined using OPT_SET_INT
with a value of 0, which sets 0 when negated, too.
Make --no-quiet equivalent to --progress and ignore it if --all-progress
was given.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 99fb6e04cb (pack-objects: convert to use parse_options(),
2012-02-01) git pack-objects has accepted --no-keep-true-parents, but
this option does the same as --keep-true-parents. That's because it's
defined using OPT_SET_INT with a value of 0, which sets 0 when negated
as well.
Turn --no-keep-true-parents into the opposite of --keep-true-parents by
using OPT_BOOL and storing the option's status directly in a variable
named "grafts_keep_true_parents" instead of in negative form in
"grafts_replace_parents".
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 2c33f75754 (Teach git-describe --exact-match to avoid expensive
tag searches, 2008-02-24) git describe accepts --no-exact-match, but it
does the same as --exact-match, an alias for --candidates=0. That's
because it's defined using OPT_SET_INT with a value of 0, which sets 0
when negated as well.
Let --no-exact-match set the number of candidates to the default value
instead. Users that need a more specific lack of exactitude can specify
their preferred value using --candidates, as before.
The "--no-exact-match" option was not covered in the tests, so let's
add a few. Also add a case where --exact-match option is used on a
commit that cannot be described without distance from tags and make
sure the command fails.
Signed-off-by: René Scharfe <l.s.r@web.de>
[jc: added trivial tests]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --contents option can be used with git blame to blame the file
as if it had the contents from the specified file. Since 1a3119ed
(blame: allow --contents to work with non-HEAD commit, 2023-03-24),
the --contents option can work with non-HEAD commit. However, if you
try to use --contents in a bare repository, you get the following
error:
fatal: this operation must be run in a work tree
This is because before trying to generate a fake working tree
commit, we always call setup_work_tree(). But in a bare repo,
working tree is not available. The call to setup_work_tree is used
to prepare the reading of the blamed file in the working tree, which
isn't necessary if we are reading the contents from the specific
file instead of the file in the working tree.
Add a check in setup_scoreboard to skip setup_work_tree if we are
reading from the file specified in --contents.
This enables us to use --contents in a bare repo. This is a nice
addition on top of 1a3119ed, having a working tree to use --contents
is optional.
Add test for the --contents option with bare repo to the
annotate-tests.sh test script.
Signed-off-by: Han Young <hanyang.tony@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As mentioned in the thread starting at [1], trace2 counters should be
used to count events instead of ad-hoc static variables.
Convert the two fsync static variables to trace2 counters, reducing the
coupling between wrapper.c and the trace2 subsystem. Adjust t/t5351 to
match the trace2 counter output format.
The counters are not per-thread because the ones being replaced also
were not.
[1] https://lore.kernel.org/git/20230627195251.1973421-2-calvinwan@google.com/
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git reset --no-mixed" behaved exactly like "git reset --mixed",
which was nonsense.
If there were only two kinds, e.g. "mixed" vs "separate", it might
have made sense to make "git reset --no-mixed" behave identically to
"git reset --separate" and vice-versa, but because we have many
types of reset, let's just forbid "--no-mixed" and negated form of
other types.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git show-branch --no-topo-order" behaved exactly the same way as
"git show-branch --topo-order" did, which was nonsense. This was
because we choose between topo- and date- by setting a variable to
either REV_SORT_IN_GRAPH_ORDER or REV_SORT_BY_COMMIT_DATE with
OPT_SET_INT() and REV_SORT_IN_GRAPH_ORDER happens to be 0. The
OPT_SET_INT() macro assigns 0 to the target variable in respose to
the negated form of its option.
"--no-date-order" by luck behaves identically to "--topo-order"
exactly for the same reason, and it sort-of makes sense right now,
but the "sort-of makes sense" will quickly break down once we add a
third way to sort. Not-A may be B when there are only two choices
between A and B, but once your choices become among A, B, and C,
not-A does not mean B.
Just mark these two ordering options to reject negation, and add a
test, which was missing. "git show-branch --no-reflog" is also
unnegatable, so throw in a test for that while we are at it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the trace2 counter mechanism was added in 81071626ba (trace2: add
global counter mechanism, 2022-10-24), the name of the file where new
counters are added was misspelled in a comment.
Use the correct file name.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The parse-options API responds to "git cmd -h" by listing the option
flag (padded to the USAGE_OPTS_WIDTH column), followed by USAGE_GAP
(set to 2) whitespaces, followed by the help text. If the flags
part does not fit within the USAGE_OPTS_WIDTH, the help text is given
on its own line. Imagine that "@" below depicts the USAGE_OPTS_WIDTH'th
column, and "#" are for the usage help text, the output may look
like this:
@@@@@@@@@@@@@ ########################################
-f description of the flag '-f' comes here
--short=<num> description of the flag '--short'
--very-long-option=<number>
description of the flag '--very-long-option'
This is all good and nice in principle, but it becomes awkward when
the flags part is just one column over the limit and forces a line
break. See the description of the "--almost" option below:
@@@@@@@@@@@@@ ########################################
-f description of the flag '-f' comes here
--short=<num> description of the flag '--short'
--almost=<num>
description of the flag '--almost'
--very-long-option=<number>
description of the flag '--very-long-option'
If we allow shrinking the gap to a single whitespace only in such a
case, we would instead get:
@@@@@@@@@@@@@ ########################################
-f description of the flag '-f' comes here
--short=<num> description of the flag '--short'
--almost=<num> description of the flag '--almost'
--very-long-option=<number>
description of the flag '--very-long-option'
and the boundary between the flags and their descriptions does not
become any harder to see, while saving precious vertical screen real
estate.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The help text for the --tags option was split into two option[]
entries, which was a hacky way to give two lines of help text (the
second entry did not have either short or long help, and there was
no way to invoke its entry---it was there only for the help text).
As we now support multi-line text in the option help, let's make
the second line of the help a proper second line and remove the
hacky second entry.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "-h" triggers the short-help in a command that implements its
option parsing using the parse-options API, the option help text is
shown with a single fprintf() as a long line. When the text is
multi-line, the second and subsequent lines are not left padded,
that breaks the alignment across options.
Borrowing the idea from the advice API where its hint strings are
shown with (localized) "hint:" prefix, let's internally split the
(localized) help text into lines, and showing the first line, pad
the remaining lines to align.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In case 'configure --with-iconv=no' is used, NO_ICONV is not saved to
config.status and thus git is built with iconv support.
Always save NO_ICONV to config.status to honor what user selected
during configure step.
Signed-off-by: Andreas Herrmann <aherrmann@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even if 'configure --with-curl=no' was run, curl support is used,
because library detection overwrites it. Avoid this overwrite.
Configure should obey what the user has specified.
Signed-off-by: Andreas Herrmann <aherrmann@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even if 'configure --with-expat=no' was run, expat support is used,
because library detection overwrites it. Avoid this overwrite.
Configure should obey what the user has specified.
Signed-off-by: Andreas Herrmann <aherrmann@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git show-branch --no-sparse" behaved exactly the same way as "git
show-branch --sparse", which did not make any sense. This was
because it used a variable "dense" initialized to 1 by default to
give "non sparse" behaviour, and OPT_SET_INT() to set the varilable
to 0 in response to the "--sparse" option. Unfortunately,
OPT_SET_INT() sets 0 to the given variable when the option is
negated.
Flip the polarity of the variable "dense" by renaming it to "sparse"
and initializing it to 0, and have OPT_SET_INT() set the variable to
1 when "--sparse" is given. This way, "--no-sparse" would set 0 to
the variable and would give us the "dense" behaviour.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now we have introduced OPT_IPVERSION(), tweak its implementation so
that "git clone", "git fetch", and "git push" reject the negated
form of "Use only IP version N" options.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command line option parsing for "git clone", "git fetch", and
"git push" have duplicated implementations of parsing "--ipv4" and
"--ipv6" options, by having two OPT_SET_INT() for "ipv4" and "ipv6".
Introduce a new OPT_IPVERSION() macro and use it in these three
commands.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the command line parser for "git branch --all" forgets to use
PARSE_OPT_NONEG, it accepted "git branch --no-all", and then passed
a nonsense value to the underlying machinery, leading to a fatal
error "filter_refs: invalid type". The "--remotes" option had
exactly the same issue.
Catch the unsupported options early in the option parser.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Command line options "--keep-cr" and its negation trigger
OPT_SET_INT_F(PARSE_OPT_NONEG) to set a variable to 1 and 0
respectively. Using OPT_SET_INT() to implement the positive variant
that sets the variable to 1 without specifying PARSE_OPT_NONEG gives
us the negative variant to set it to 0 for free.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the "PATTERN FORMAT" section, all the other pattern elements are
shown as `monospace` literals inside "double quoted" strings. Do
the same for the explanation of a slash to make it consistent.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 61fdbcf98b (ls-tree: migrate to parse-options, 2009-11-13) git
ls-tree has accepted the option --no-full-name, but it does the same
as --full-name, contrary to convention. That's because it's defined
using OPT_SET_INT with a value of 0, where the negative variant sets
0 as well.
Turn --no-full-name into the opposite of --full-name by using OPT_BOOL
instead and storing the option's status directly in a variable named
"full_name" instead of in negated form in "chomp_prefix".
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fsck --no-progress" still spewed noise from the commit-graph
subsystem, which has been corrected.
* tb/fsck-no-progress:
commit-graph.c: avoid duplicated progress output during `verify`
commit-graph.c: pass progress to `verify_one_commit_graph()`
commit-graph.c: iteratively verify commit-graph chains
commit-graph.c: extract `verify_one_commit_graph()`
fsck: suppress MIDX output with `--no-progress`
fsck: suppress commit-graph output with `--no-progress`
The recent change to "git repack" made it react less nicely when a
leftover .idx file that no longer has the corresponding .pack file
in the repository, which has been corrected.
* tb/repack-cleanup:
builtin/repack.c: avoid dir traversal in `collect_pack_filenames()`
builtin/repack.c: only repack `.pack`s that exist
Among four examples, only this one used "double quoted" sample
patterns, but all others marked up the patterns in `monospace`.
Signed-off-by: Johan Ruokangas <johan@latehours.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We return the oid that matched, but the sole caller only cares whether
we matched anything at all. This is mostly academic, since there's only
one caller, but the lifetime of the returned pointer is not immediately
clear. Sometimes it points to an oid in a tag struct, which should live
forever. And sometimes to the oid passed in, which only lives as long as
the each_ref_fn callback we're called from.
Simplify this to a boolean return which is more direct and obvious. As a
bonus, this lets us avoid the weird pattern of overwriting our "oid"
parameter in the loop (since we now only refer to the tagged oid one
time, and can just inline the call to get it).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When handling --points-at, we have to try to peel each ref to see if
it's a tag that points at a requested oid. We start this process by
calling parse_object() on the oid pointed to by each ref.
The cost of parsing each object adds up, especially in an output that
doesn't otherwise need to open the objects at all. Ideally we'd use
peel_iterated_oid() here, which uses the cached information in the
packed-refs file. But we can't, because our --points-at must match not
only the fully peeled value, but any interim values (so if tag A points
to tag B which points to commit C, we should match --points-at=B, but
peel_iterated_oid() will only tell us about C).
So the best we can do (absent changes to the packed-refs peel traits) is
to avoid parsing non-tags. The obvious way to do that is to call
oid_object_info() to check the type before parsing. But there are a few
gotchas there, like checking if the object has already been parsed.
Instead we can just tell parse_object() that we are OK skipping the hash
check, which lets it turn on several optimizations. Commits can be
loaded via the commit graph (so it's both fast and we have the benefit
of the parsed data if we need it later at the output stage). Blobs are
not loaded at all. Trees are still loaded, but it's rather rare to have
a ref point directly to a tree (and since this is just an optimization,
kicking in 99% of the time is OK).
Even though we're paying for an extra lookup, the cost to avoid parsing
the non-tags is a net benefit. In my git.git repository with 941 tags
and 1440 other refs pointing to commits, this significantly cuts the
runtime:
Benchmark 1: ./git.old for-each-ref --points-at=HEAD
Time (mean ± σ): 26.8 ms ± 0.5 ms [User: 24.5 ms, System: 2.2 ms]
Range (min … max): 25.9 ms … 29.2 ms 107 runs
Benchmark 2: ./git.new for-each-ref --points-at=HEAD
Time (mean ± σ): 9.1 ms ± 0.3 ms [User: 6.8 ms, System: 2.2 ms]
Range (min … max): 8.6 ms … 10.2 ms 308 runs
Summary
'./git.new for-each-ref --points-at=HEAD' ran
2.96 ± 0.10 times faster than './git.old for-each-ref --points-at=HEAD'
In a repository that is mostly annotated tags, we'd expect less
improvement (we might still skip a few object loads, but that's balanced
by the extra lookups). In my clone of linux.git, which has 782 tags and
3 branches, the run-time is about the same (it's actually ~1% faster on
average after this patch, but that's within the run-to-run noise).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we peel tags to check if they match a --points-at oid, we
recursively parse the tagged object to see if it is also a tag. But
since the tag itself tells us the type of the object it points to (and
even gives us the appropriate object struct via its "tagged" member), we
can use that directly.
We do still have to make sure to call parse_tag() before looking at each
tag. This is redundant for the outermost tag (since we did call
parse_object() to find its type), but that's OK; parse_tag() is smart
enough to make this a noop when the tag has already been parsed.
In my clone of linux.git, with 782 tags (and only 3 non-tags), this
yields a significant speedup (bringing us back where we were before the
commit before this one started recursively dereferencing tags):
Benchmark 1: ./git.old for-each-ref --points-at=HEAD --format="%(refname)"
Time (mean ± σ): 20.3 ms ± 0.5 ms [User: 11.1 ms, System: 9.1 ms]
Range (min … max): 19.6 ms … 21.5 ms 141 runs
Benchmark 2: ./git.new for-each-ref --points-at=HEAD --format="%(refname)"
Time (mean ± σ): 11.4 ms ± 0.2 ms [User: 6.3 ms, System: 5.0 ms]
Range (min … max): 11.0 ms … 12.2 ms 250 runs
Summary
'./git.new for-each-ref --points-at=HEAD --format="%(refname)"' ran
1.79 ± 0.05 times faster than './git.old for-each-ref --points-at=HEAD --format="%(refname)"'
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tags are dereferenced until reaching a different object type to handle
nested tags, e.g. on checkout. In contrast, "git tag --points-at=..."
fails to list such nested tags because only one level of indirection is
obtained in filter_refs(). Implement the recursive dereferencing for the
"--points-at" option when filtering refs to unify the behaviour.
Signed-off-by: Jan Klötzke <jan@kloetzke.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git ls-files '(attr:X)D/'" that triggers the common prefix
optimization codepath failed to read from "D/.gitattributes",
which has been corrected.
* jc/pathspec-match-with-common-prefix:
dir: match "attr" pathspec magic with correct paths
t6135: attr magic with path pattern
Further shuffling of declarations across header files to streamline
file dependencies.
* cw/compat-util-header-cleanup:
git-compat-util: move alloc macros to git-compat-util.h
treewide: remove unnecessary includes for wrapper.h
kwset: move translation table from ctype
sane-ctype.h: create header for sane-ctype macros
git-compat-util: move wrapper.c funcs to its header
git-compat-util: move strbuf.c funcs to its header
Code snippets in a tutorial document no longer compiled after
recent header shuffling, which have been corrected.
* vd/adjust-mfow-doc-to-updated-headers:
docs: add necessary headers to Documentation/MFOW.txt
"git diff --no-index" learned to read from named pipes as if they
were regular files, to allow "git diff <(process) <(substitution)"
some shells support.
* pw/diff-no-index-from-named-pipes:
diff --no-index: support reading from named pipes
t4054: test diff --no-index with stdin
diff --no-index: die on error reading stdin
diff --no-index: refuse to compare stdin to a directory
Use the now common skip_prefix() cascade instead of a case statement to
parse the strftime(3) format in strbuf_addftime(). skip_prefix() parses
the "fmt" pointer and advances it appropriately, making additional
pointer arithmetic unnecessary. The resulting code is more compact and
consistent with most other strbuf_expand_step() loops.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a test introduced by 26c9c03f0a (ref-filter: add new "signature"
atom, 2023-06-04) the file named "file" is added by a setup step that
requires GPG and modified by a second setup step that requires GPGSSH.
Systems lacking the first prerequisite skip the initial setup step and
then "git commit -a" in the second one doesn't find the modified file.
Add it explicitly.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"imap-send" codepaths got cleaned up to get rid of unused
parameters.
* jk/imap-send-unused-variable-cleanup:
imap-send: drop unused fields from imap_cmd_cb
imap-send: drop unused parameter from imap_cmd_cb callback
imap-send: use server conf argument in setup_curl()
"git bugreport" tests did not test what it wanted to test, which
has been corrected.
* ma/t0091-fixup:
t0091-bugreport.sh: actually verify some content of report
The "git for-each-ref" family of commands learned placeholders
related to GPG signature verification.
* ks/ref-filter-signature:
ref-filter: add new "signature" atom
t/lib-gpg: introduce new prereq GPG2
Background: The guidance to "base your work on the oldest branch that
your change is relevant to" was added in d0c26f0f56 (SubmittingPatches:
Add new section about what to base work on, 2010-04-19). That commit
also added the bullet points which describe the scenarios where one
would use one of the following named branches: "maint", "master",
"next", and "seen" ("pu" in the original as that was the name of this
branch before it was renamed, per 828197de8f (docs: adjust for the
recent rename of `pu` to `seen`, 2020-06-25)). The guidance was probably
taken from existing similar language introduced in the "Merge upwards"
section of gitworkflows in f948dd8992 (Documentation: add manpage about
workflows, 2008-10-19).
Summary: This change simplifies the guidance by pointing users to just
"maint" or "master". But it also gives an explanation of why that is
preferred and what is meant by preferring "older" branches (which might
be confusing to some because "old" here is meant in relative terms
between these named branches, not in terms of the age of the branches
themselves). We also add an example to illustrate why it would be a bad
idea to use "next" as a starting point, which may not be so obvious to
new contributors.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The phrase
and unless it targets the `master` branch (which is the default),
mark your patches as such.
is tightly packed with several things happening in just two lines of
text. It also feels like it is not that important because of the terse
treatment. This is a problem because (1) it has the potential to confuse
new contributors, and (2) it may be glossed over for those skimming the
docs.
Emphasize and elaborate on this guidance by promoting it to its own
separate paragraph.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It could be that a suitable branch does not exist, so instead just use
the phrase "starting point". Technically speaking the starting point
would be a commit (not a branch) anyway.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The discussion around subsystems disrupts the flow of discussion in the
surrounding area, which only deals with starting points used for the
git.git project. So move this bullet point out to the end.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I noticed this test was producing output like
```
t4002-diff-basic.sh: test_expect_successdiff can read from stdin: not found
```
which is rather odd. Investigation shows an error of shell syntax:
foo'abc' is the same as fooabc to the shell. Perhaps obviously, this is
not a valid command for the test.
I am surprised this doesn't count as an error in the test, but that
accounts for it going unnoticed.
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as previous commits, ensure that we don't overflow
when trying to read an OID out of an existing commit-graph during
verification.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as previous commits, ensure that we don't overflow
when trying to read an existing OID while writing a new commit-graph.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When merging two commit graphs, ensure that we don't attempt to merge
two graphs which, when combined, have more total commits than the 32-bit
unsigned maximum.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as previous commits, ensure that we don't overflow
when choosing how to split and merge different layers of the
commit-graph.
In particular, avoid a potential overflow between `size_mult` and
`num_commits`, as well as a potential overflow between the number of
commits currently in the merged graph, and the number of commits in the
graph about to be merged.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as previous commits, ensure that we don't overflow
when computing an offset into the commit_data chunk when the (relative)
graph position exceeds 2^32-1/GRAPH_DATA_WIDTH.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as previous commits, ensure that we don't overflow
when the lex_index of the commit we are trying to fill out exceeds
2^32-1/(g->hash_len+16).
The other hunk touched in this patch is not susceptible to overflow,
since an explicit cast is made to a 64-bit unsigned value. For clarity
and consistency with the rest of the commits in this series, avoid a
tricky to reason about cast, and use `st_mult()` directly.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as previous commits, ensure that we don't overflow
in a few spots within `fill_commit_graph_info()`:
- First, when computing an offset into the commit data chunk, which
can occur when the `lex_index` of the item we're looking up exceeds
2^32-1/GRAPH_DATA_WIDTH.
- A similar issue when computing the generation date offset for
commits with `lex_index` greater than 2^32-1/4. Note that in
practice this will never overflow, since the left-hand operand is
from calling `sizeof(...)` and is thus already a `size_t`. But wrap
that in an `st_mult()` to make it clear that we intend to perform
this computation using 64-bit operands.
- Finally, a nearly identical issue as above when computing an offset
into the `generation_data_overflow` chunk.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as previous commits, ensure that we don't overflow
when trying to compute an offset into the `chunk_oid_lookup` table when
the `lex_index` of the item we're trying to look up exceeds
`2^32-1/g->hash_len`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph uses a fanout table with 4-byte entries to store the
number of commits at each shard of the commit-graph. So it is OK to have
a commit graph with as many as 2^32-1 stored commits. But we risk
overflowing any computation which may exceed the 32-bit (unsigned)
maximum when those computations are (incorrectly) performed using 32-bit
operands.
There are a couple of spots in `add_graph_to_chain()` where we could
potentially overflow the result:
- First, when comparing the list of existing entries in the
commit-graph chain. It is unlikely that this should ever overflow,
since it would require having roughly 2^32-1/g->hash_len
commit-graphs in the chain. But let's guard that computation with a
`st_mult()` just to be safe.
- Second, when computing the number of commits in the graph added to
the front of the chain. This value is also a 32-bit unsigned, but we
should make sure that it does not grow beyond the maximum value.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a commit-graph, we use the chunk-format API to write out
each individual chunk of the commit-graph. Each chunk of the
commit-graph is tracked via a call to `add_chunk()`, along with the
expected size of that chunk.
Similar to an earlier commit which handled the identical issue in the
MIDX machinery, guard against overflow when dealing with a commit-graph
with a large number of entries to avoid corrupting the contents of the
commit-graph itself.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a bitmap is used to answer some reachability query, it creates a
pseudo-bitmap called the "extended index" on top of any existing bitmaps
to store objects that are relevant to the query, but not mentioned in
the bitmap.
When looking up the ith object in the extended index in a bitmap, it is
common to write something like:
bitmap_get(result, i + bitmap_num_objects(bitmap_git))
, indicating that we want the ith object following all other objects
mentioned in the bitmap_git.
Since the type of `i` and the return type of `bitmap_num_objects()` are
both `uint32_t`s, But if there are either a large number of objects in
the bitmap, or a large number of objects in the extended index (or
both), this addition can overflow when the sum is greater than 2^32-1.
Having that large of a bitmap position is entirely acceptable, but we
need to ensure that the computed bitmap position for that object is
performed using 64-bits and doesn't overflow.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as in previous commits, avoid an integer overflow
when computing the expected size of a MIDX.
(Note that this is also OK as-is, since `p->pack_size` is an `off_t`, so
this computation should already be done as 64-bit integers. But again,
let's use `st_mult()` to make this fact clear).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a MIDX, we use the chunk-format API to write out each
individual chunk of the MIDX. Each chunk of the MIDX is tracked via a
call to `add_chunk()`, along with the expected size of that chunk.
Guard against overflow when dealing with a MIDX with a large number of
entries (and consequently, large chunks within the MIDX file itself) to
avoid corrupting the contents of the MIDX itself.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the `write_midx_context` structure, we use two `uint32_t`'s to track
the length and allocated size of the packs, and one `uint32_t` to track
the number of objects in the MIDX.
In practice, having these be 32-bit unsigned values shouldn't cause any
problems since we are unlikely to have that many objects or packs in any
real-world repository. But these values should be `size_t`'s, so change
their type to reflect that.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as previous patches, avoid an overflow when looking
up object offsets in the MIDX's large offset table by guarding the
computation via `st_mult()`.
This instance is also OK as-is, since the left operand is the result of
`sizeof(...)`, which is already a `size_t`. But use `st_mult()` instead
here to make it explicit that this computation is to be performed using
64-bit unsigned integers.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as previous commits, avoid overflow when looking up
an object's OID in a MIDX when its position is greater than
`2^32-1/m->hash_len`.
As usual, it is perfectly OK for a MIDX to have as many as 2^32-1
objects (since we use 32-bit fields to count the number of objects at
each fanout layer). But if we have more than `2^32-1/m->hash_len` number
of objects, we will incorrectly perform the computation using 32-bit
integers, overflowing the result.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `midx_fanout` struct is used to keep track of a set of OIDs
corresponding to each layer of the MIDX's fanout table. It stores an
array of entries, along with the number of entries in the table, and the
allocated size of the array.
Both `nr` and `alloc` are stored as 32-bit unsigned integers. In
practice, this should never cause any problems, since most packs have
far fewer than 2^32-1 objects.
But storing these as `size_t`'s is more appropriate, and prevents us
from accidentally overflowing some result when multiplying or adding to
either of these values. Update these struct members to be `size_t`'s as
appropriate.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as the previous commits, ensure that we use
`st_add()` or `st_mult()` when computing values that may overflow the
32-bit unsigned limit.
Note that in each of these instances, we prevent 32-bit overflow
already since we have explicit casts to `size_t`.
So this code is OK as-is, but let's clarify it by using the `st_xyz()`
helpers to make it obvious that we are performing the relevant
computations using 64 bits.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prevent an overflow when locating a pack's CRC offset when the number
of packed items is greater than 2^32-1/hashsz by guarding the
computation with an `st_mult()`.
Note that to avoid truncating the result, the `crc_offset` member must
itself become a `size_t`. The only usage of this variable (besides the
assignment in `load_idx()`) is in `read_v2_anomalous_offsets()` in the
index-pack code. There we use the `crc_offset` as a pointer offset, so
we are already equipped to handle the type change.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many callback interfaces have an extra void data parameter, but we don't
always need it (especially for dumping functions like the ones in test
helpers). Mark them as unused to avoid -Wunused-parameter warnings.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We iterate over the set of input tag names using callbacks. But not all
operations need the same inputs, so some parameters go unused (but of
course not the same ones for each operation). Mark the unused ones to
avoid -Wunused-parameter warnings.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't need to use the "data" parameter in this instance. Let's mark
it to avoid -Wunused-parameter warnings.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't look at the "commit" parameter to our callback, as our
"mergetag_data" pointer contains the original name "ref", which we use
instead. But we can't get rid of it, since other for_each_mergetag
callbacks do use it. Let's mark the parameter to avoid
-Wunused-parameter warnings.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't look at the "flags" parameter, which is natural for something
that is just printing the contents of the replace refs. But let's mark
it to appease -Wunused-parameter.
This probably should have been part of 63e14ee2d6 (refs: mark unused
each_ref_fn parameters, 2022-08-19), but I missed it as this one is a
repo_each_ref_fn, which takes an extra repository argument.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our threeway_callback() does not bother to look at its "n" parameter. It
is static in this file and used only by trivial_merge_trees(), which
always passes 3 trees (hence the name "threeway"). It also does not look
at "dirmask". This is OK, as it handles directories specifically by
looking at the mode bits.
Other traverse_info callbacks need these, so we can't get drop them from
the interface. But let's annotate these ones to avoid complaints from
-Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few callback functions which are used with the fsck code,
but it's natural that not all callbacks need all parameters. For
reporting, even something as obvious as "the oid of the object which had
a problem" is not always used, as some callers are only checking a
single object in the first place. And for both reporting and walking,
things like void data pointers and the fsck_options aren't always
necessary.
But since each such parameter is used by _some_ callback, we have to
keep them in the interface. Mark the unused ones in specific callbacks
to avoid triggering -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The setup_revision_opt struct has a "tweak" function pointer, which can
be used to adjust parameters after setup_revisions() parses arguments,
but before it finalizes setup. In addition to the rev_info struct, the
callback receives a pointer to the setup_revision_opt, as well.
But none of the existing callbacks looks at the extra "opt" parameter,
leading to -Wunused-parameter warnings.
We could mark it as UNUSED, but instead let's remove it entirely. It's
conceivable that it could be useful for a callback to have access to the
"opt" struct. But in the 13 years that this mechanism has existed,
nobody has used it. So let's just drop it in the name of simplifying.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Callbacks to for_each_altodb() get a void data pointer, but we don't
need it here. Mark it as unused to silence -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing the input, we have a "keep_cr" parameter to tell us how to
handle line endings. But this doesn't apply to stgit or hg patches
(which are not mailbox formats where we have to worry about that), so we
ignore the parameter entirely in those functions.
Let's mark these as unused so that -Wunused-parameter does not complain
about them.
Note that we could just drop these parameters entirely. They are
necessary to conform to the mail_conv_fn interface used by
split_mail_conv(), but these two callbacks are the only ones used with
that function. The other formats (which _do_ care about keep_cr) use
split_mail_mbox(). But it's conceivable that we'd eventually add another
format that does care about this option, so let's leave it as part of
the generic interface.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The xml_start_tag() function is passed the expat library's
XML_SetElementHandler() function, so it has to conform to the
expected interface. But we don't actually care about the attributes
list. Mark it so that -Wunused-parameter does not complain.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions are all used as callbacks for curl, so they have to
conform to a particular interface. But they don't need all of their
parameters:
- fwrite_null() throws away the input, so it doesn't look at most
parameters
- fwrite_wwwauth() in theory could take the auth struct in its void
pointer, but instead we just access it as the global http_auth
(matching the rest of the code in this file)
- curl_trace() always writes via the trace mechanism, so it doesn't
need its void pointer to know where to send things. Likewise, it
ignores the CURL parameter, since nothing we trace requires querying
the handle.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function gets a repository parameter because it's a callback for
do_for_each_repo_ref_iterator(). But it's just a wrapper that passes
along each call to a regular each_ref_fn callback, and the latter
doesn't accept a repository argument.
Probably in the long run all of the each_ref_fn callbacks should get a
repository parameter, too. But changing that now would require updates
all over the code base. Until that happens, let's annotate this wrapper
callback to quiet the compiler's -Wunused-parameter warning.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reflog-expire command has been unimplemented since it was added in
80f2a6097c (t/helper: add test-ref-store to test ref-store functions,
2017-03-26). This causes -Wunused-parameter to complain, since the
function just calls die() without looking at its arguments.
We could mark these as UNUSED to silence the warning. But let's just
drop the function. It has no callers in the test suite and is not doing
anything useful, beyond perhaps reminding us that it's something we
_could_ be testing.
But since the bulk of the work in adding such tests would be the shell
bits that actually examine the reflog state before and after expiration,
this is not even a useful step in that direction. Somebody who wants to
do that work later can easily add this function back.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two messages were introduced in 8ba221e245 (bundle: output hash
information in 'verify', 2022-03-22) and 105c6f14ad (bundle: parse
filter capability, 2022-03-09) but never for translation.
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a narrow but common case, the user is the only author of a branch and
doesn't mind overwriting the corresponding branch on the remote. This
workflow is especially common on GitHub, GitLab, and Gerrit, which keep
a permanent record of every version of a branch that is pushed while a
pull request is open for that branch. On those platforms, force-pushing
is encouraged and is analogous to emailing a new version of a patchset.
When giving advice about divergent branches, tell the user about
`git pull`, but don't unconditionally instruct the user to do it. A less
prescriptive message will help prevent users from thinking that they are
required to create an integrated history instead of simply replacing the
previous history. Also, don't put `git pull` in an awkward
parenthetical, because `git pull` can always be used to reconcile
branches and is the normal way to do so.
Due to the difficulty of knowing which command for force-pushing is best
suited to the user's situation, no specific advice is given about
force-pushing. Instead, the user is directed to the Git documentation to
read about possible ways forward that do not involve integration.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a narrow but common case, the user is the only author of a branch and
doesn't mind overwriting the corresponding branch on the remote. This
workflow is especially common on GitHub, GitLab, and Gerrit, which keep
a permanent record of every version of a branch that is pushed while a
pull request is open for that branch. On those platforms, force-pushing
is encouraged and is analogous to emailing a new version of a patchset.
When giving advice about divergent branches, tell the user about
`git pull`, but don't unconditionally instruct the user to do it. A less
prescriptive message will help prevent users from thinking that they are
required to create an integrated history instead of simply replacing the
previous history. Likewise, don't imply that `git pull` is only for
merging.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the user is in the middle of making a commit, they are not yet at
the point where they are ready to think about integrating their local
branch with the corresponding remote branch or force-pushing over the
remote branch. Don't include advice on how to deal with divergent
branches in the commit template, to avoid giving the impression that the
divergence needs to be dealt with immediately. Similar advice will be
printed when it is most relevant, that is, if the user does try to push
without first reconciling the two branches.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 37fec86a83 (packfile: abstract away hash constant values,
2018-05-02), `nth_packed_object_id()` started using the variable
`the_hash_algo->rawsz` instead of a fixed constant when trying to
compute an offset into the ".idx" file for some object position.
This can lead to surprising truncation when looking for an object
towards the end of a large enough pack, like the following:
(gdb) p hashsz
$1 = 20
(gdb) p n
$2 = 215043814
(gdb) p hashsz * n
$3 = 5908984
, which is a debugger session broken on a known-bad call to the
`nth_packed_object_id()` function.
This behavior predates 37fec86a83, and is original to the v2 index
format, via: 74e34e1fca (sha1_file.c: learn about index version 2,
2007-04-09).
This is due to §6.4.4.1 of the C99 standard, which states that an
untyped integer constant will take the first type in which the value can
be accurately represented, among `int`, `long int`, and `long long int`.
Since 20 can be represented as an `int`, and `n` is a 32-bit unsigned
integer, the resulting computation is defined by §6.3.1.8, and the
(signed) integer value representing `n` is converted to an unsigned
type, meaning that `20 * n` (for `n` having type `uint32_t`) is
equivalent to a multiplication between two unsigned 32-bit integers.
When multiplying a sufficiently large `n`, the resulting value can
exceed 2^32-1, wrapping around and producing an invalid result. Let's
follow the example in f86f769550 (compute pack .idx byte offsets using
size_t, 2020-11-13) and replace this computation with `st_mult()`, which
will ensure that the computation is done using 64-bits.
While here, guard the corresponding computation for packs with v1
indexes, too. Though the likelihood of seeing a bug there is much
smaller, since (a) v1 indexes are generated far less frequently than v2
indexes, and (b) they all correspond to packs no larger than 2 GiB, so
having enough objects to trigger this overflow is unlikely if not
impossible.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When repacking, the function `collect_pack_filenames()` is responsible
for collecting the set of existing packs in the repository, and
partitioning them into "kept" (if the pack has a ".keep" file or was
given via `--keep-pack`) and "nonkept" (otherwise) lists.
This function comes from the original C port of git-repack.sh from back
in a1bbc6c017 (repack: rewrite the shell script in C, 2013-09-15),
where it first appears as `get_non_kept_pack_filenames()`. At the time,
the implementation was a fairly direct translation from the relevant
portion of git-repack.sh, which looped over the results of
find "$PACKDIR" -type f -name '*.pack'
either ignoring the pack as kept, or adding it to the list of existing
packs.
So the choice to directly translate this function in terms of
`readdir()` in a1bbc6c017 made sense. At the time, it was possible to
refine the C version in terms of packed_git structs, but was never done.
However, manually enumerating a repository's packs via `readdir()` is
confusing and error-prone. It leads to frustrating inconsistencies
between which packs Git considers to be part of a repository (i.e.,
could be found in the list of packs from `get_all_packs()`), and which
packs `collect_pack_filenames()` considers to meet the same criteria.
This bit us in 73320e49ad (builtin/repack.c: only collect fully-formed
packs, 2023-06-07), and again in the previous commit.
Prevent these issues from biting us in the future by implementing the
`collect_pack_filenames()` function by looping over an array of pointers
to `packed_git` structs, ensuring that we use the same criteria to
determine the set of available packs.
One gotcha here is that we have to ignore non-local packs, since the
original version of `collect_pack_filenames()` only looks at the local
pack directory to collect existing packs.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 73320e49ad (builtin/repack.c: only collect fully-formed packs,
2023-06-07), we switched the check for which packs to collect by
starting at the .idx files and looking for matching .pack files. This
avoids trying to repack pack-files that have not had their pack-indexes
installed yet.
However, it does cause maintenance to halt if we find the (problematic,
but not insurmountable) case of a .idx file without a corresponding
.pack file. In an environment where packfile maintenance is a critical
function, such a hard stop is costly and requires human intervention to
resolve (by deleting the .idx file).
This was not the case before. We successfully repacked through this
scenario until the recent change to scan for .idx files.
Further, if we are actually in a case where objects are missing, we
detect this at a different point during the reachability walk.
In other cases, Git prepares its list of packfiles by scanning .idx
files and then only adds it to the packfile list if the corresponding
.pack file exists. It even does so without a warning! (See
add_packed_git() in packfile.c for details.)
This case is much less likely to occur than the failures seen before
73320e49ad. Packfiles are "installed" by writing the .pack file before
the .idx and that process can be interrupted. Packfiles _should_ be
deleted by deleting the .idx first, followed by the .pack file, but
unlink_pack_path() does not do this: it deletes the .pack _first_,
allowing a window where this process could be interrupted. We leave the
consideration of changing this order as a separate concern. Knowing that
this condition is possible from interrupted Git processes and not other
tools lends some weight that Git should be more flexible around this
scenario.
Add a check to see if the .pack file exists before adding it to the list
for repacking. This will stop a number of maintenance failures seen in
production but fixed by deleting the .idx files.
This brings us closer to the case before 73320e49ad in that 'git
repack' will not fail when there is an orphaned .idx file, at least, not
due to the way we scan for packfiles. In the case that the .pack file
was erroneously deleted without copies of its objects in other installed
packfiles, then 'git repack' will fail due to the reachable object walk.
This does resolve the case where automated repacks will no longer be
halted on this case. The tests in t7700 show both these successful
scenarios and the case of failing if the .pack was truly required.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar fashion as in previous commits, teach `ls-refs` to avoid
enumerating hidden references where possible.
As before, this is linux.git with one hidden reference per commit.
$ hyperfine -L v ,.compile 'git{v} -c protocol.version=2 ls-remote .'
Benchmark 1: git -c protocol.version=2 ls-remote .
Time (mean ± σ): 89.8 ms ± 0.6 ms [User: 84.3 ms, System: 5.7 ms]
Range (min … max): 88.8 ms … 91.3 ms 32 runs
Benchmark 2: git.compile -c protocol.version=2 ls-remote .
Time (mean ± σ): 6.5 ms ± 0.1 ms [User: 2.4 ms, System: 4.3 ms]
Range (min … max): 6.2 ms … 8.3 ms 397 runs
Summary
'git.compile -c protocol.version=2 ls-remote .' ran
13.85 ± 0.33 times faster than 'git -c protocol.version=2 ls-remote .'
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar fashion as a previous commit, teach `upload-pack` to avoid
enumerating hidden references where possible.
Note, however, that there are certain cases where cannot avoid
enumerating even hidden references, in particular when either of:
- `uploadpack.allowTipSHA1InWant`, or
- `uploadpack.allowReachableSHA1InWant`
are set, corresponding to `ALLOW_TIP_SHA1` and `ALLOW_REACHABLE_SHA1`,
respectively.
When either of these bits are set, upload-pack's `is_our_ref()` function
needs to consider the `HIDDEN_REF` bit of the referent's object flags.
So we must visit all references, including the hidden ones, in order to
mark their referents with the `HIDDEN_REF` bit.
When neither `ALLOW_TIP_SHA1` nor `ALLOW_REACHABLE_SHA1` are set, the
`is_our_ref()` function considers only the `OUR_REF` bit, and not the
`HIDDEN_REF` one. `OUR_REF` is applied via `mark_our_ref()`, and only
to objects at the tips of non-hidden references, so we do not need to
visit hidden references in this case.
When neither of those bits are set, `upload-pack` can potentially avoid
enumerating a large number of references. In the same example as a
previous commit (linux.git with one hidden reference per commit,
"refs/pull/N"):
$ printf 0000 >in
$ hyperfine --warmup=1 \
'git -c transfer.hideRefs=refs/pull upload-pack . <in' \
'git.compile -c transfer.hideRefs=refs/pull -c uploadpack.allowTipSHA1InWant upload-pack . <in' \
'git.compile -c transfer.hideRefs=refs/pull upload-pack . <in'
Benchmark 1: git -c transfer.hideRefs=refs/pull upload-pack . <in
Time (mean ± σ): 406.9 ms ± 1.1 ms [User: 357.3 ms, System: 49.5 ms]
Range (min … max): 405.7 ms … 409.2 ms 10 runs
Benchmark 2: git.compile -c transfer.hideRefs=refs/pull -c uploadpack.allowTipSHA1InWant upload-pack . <in
Time (mean ± σ): 406.5 ms ± 1.3 ms [User: 356.5 ms, System: 49.9 ms]
Range (min … max): 404.6 ms … 408.8 ms 10 runs
Benchmark 3: git.compile -c transfer.hideRefs=refs/pull upload-pack . <in
Time (mean ± σ): 4.7 ms ± 0.2 ms [User: 0.7 ms, System: 3.9 ms]
Range (min … max): 4.3 ms … 6.1 ms 472 runs
Summary
'git.compile -c transfer.hideRefs=refs/pull upload-pack . <in' ran
86.62 ± 4.33 times faster than 'git.compile -c transfer.hideRefs=refs/pull -c uploadpack.allowTipSHA1InWant upload-pack . <in'
86.70 ± 4.33 times faster than 'git -c transfer.hideRefs=refs/pull upload-pack . <in'
As above, we must visit every reference when
uploadPack.allowTipSHA1InWant is set. But when it is unset, we can visit
far fewer references.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that `refs_for_each_fullref_in()` has the ability to avoid
enumerating references matching certain pattern(s), use that to avoid
visiting hidden refs when constructing the ref advertisement via
receive-pack.
Note that since this exclusion is best-effort, we still need
`show_ref_cb()` to check whether or not each reference is hidden or not
before including it in the advertisement.
As was the case when applying this same optimization to `upload-pack`,
`receive-pack`'s reference advertisement phase can proceed much quicker
by avoiding enumerating references that will not be part of the
advertisement.
(Below, we're still using linux.git with one hidden refs/pull/N ref per
commit):
$ hyperfine -L v ,.compile 'git{v} -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git'
Benchmark 1: git -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git
Time (mean ± σ): 89.1 ms ± 1.7 ms [User: 82.0 ms, System: 7.0 ms]
Range (min … max): 87.7 ms … 95.5 ms 31 runs
Benchmark 2: git.compile -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git
Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 0.5 ms, System: 3.9 ms]
Range (min … max): 4.1 ms … 5.6 ms 508 runs
Summary
'git.compile -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git' ran
20.00 ± 1.05 times faster than 'git -c transfer.hideRefs=refs/pull receive-pack --advertise-refs .git'
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In subsequent commits, we'll teach `receive-pack` and `upload-pack` to
use the new jump list feature in the packed-refs iterator by ignoring
references which are mentioned via its respective hideRefs lists.
However, the packed-ref jump lists cannot handle un-hiding rules (that
begin with '!'), or namespace comparisons (that begin with '^'). Add a
convenience function to the refs.h API to detect when either of these
conditions are met, and returns an appropriate value to pass as excluded
patterns.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future commit will want to call `for_each_namespaced_ref()` with
a list of excluded patterns.
We could introduce a variant of that function, say,
`for_each_namespaced_ref_exclude()` which takes the extra parameter, and
reimplement the original function in terms of that. But all but one
caller (in `http-backend.c`) will supply the new parameter, so add the
new parameter to `for_each_namespaced_ref()` itself instead of
introducing a new function.
For now, supply NULL for the list of excluded patterns at all callers to
avoid changing behavior, which we will do in a future change.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In subsequent commits, it will be convenient to have a 'const char **'
of hidden refs (matching `transfer.hiderefs`, `uploadpack.hideRefs`,
etc.), instead of a `string_list`.
Convert spots throughout the tree that store the list of hidden refs
from a `string_list` to a `strvec`.
Note that in `parse_hide_refs_config()` there is an ugly const-cast used
to avoid an extra copy of each value before trimming any trailing slash
characters. This could instead be written as:
ref = xstrdup(value);
len = strlen(ref);
while (len && ref[len - 1] == '/')
ref[--len] = '\0';
strvec_push(hide_refs, ref);
free(ref);
but the double-copy (once when calling `xstrdup()`, and another via
`strvec_push()`) is wasteful.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit added low-level tests to ensure that the packed-refs
iterator did not enumerate excluded sections of the refspace.
However, there was no guarantee that these sections weren't being
visited, only that they were being suppressed from the output. To harden
these tests, add a trace2 counter which tracks the number of regions
skipped by the packed-refs iterator, and assert on its value.
Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When iterating through the `packed-refs` file in order to answer a query
like:
$ git for-each-ref --exclude=refs/__hidden__
it would be useful to avoid walking over all of the entries in
`refs/__hidden__/*` when possible, since we know that the ref-filter
code is going to throw them away anyways.
In certain circumstances, doing so is possible. The algorithm for doing
so is as follows:
- For each excluded pattern, find the first record that matches it,
and the first record that *doesn't* match it (i.e. the location
you'd next want to consider when excluding that pattern).
- Sort the set of excluded regions from the previous step in ascending
order of the first location within the `packed-refs` file that
matches.
- Clean up the results from the previous step: discard empty regions,
and combine adjacent regions. The set of regions which remains is
referred to as the "jump list", and never contains any references
which should be included in the result set.
Then when iterating through the `packed-refs` file, if `iter->pos` is
ever contained in one of the regions from the previous steps, advance
`iter->pos` past the end of that region, and continue enumeration.
Note that we only perform this optimization when none of the excluded
pattern(s) have special meta-characters in them. For a pattern like
"refs/foo[ac]", the excluded regions ("refs/fooa", "refs/fooc", and
everything underneath them) are not connected. A future implementation
that handles this case may split the character class (pretending as if
two patterns were excluded: "refs/fooa", and "refs/fooc").
There are a few other gotchas worth considering. First, note that the
jump list is sorted, so once we jump past a region, we can avoid
considering it (or any regions preceding it) again. The member
`jump_pos` is used to track the first next-possible region to jump
through.
Second, note that the jump list is best-effort, since we do not handle
loose references, and because of the meta-character issue above. The
jump list may not skip past all references which won't appear in the
results, but will never skip over a reference which does appear in the
result set.
In repositories with a large number of hidden references, the speed-up
can be significant. Tests here are done with a copy of linux.git with a
reference "refs/pull/N" pointing at every commit, as in:
$ git rev-list HEAD | awk '{ print "create refs/pull/" NR " " $0 }' |
git update-ref --stdin
$ git pack-refs --all
, it is significantly faster to have `for-each-ref` jump over the
excluded references, as opposed to filtering them out after the fact:
$ hyperfine \
'git for-each-ref --format="%(objectname) %(refname)" | grep -vE "^[0-9a-f]{40} refs/pull/"' \
'git.prev for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"' \
'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"'
Benchmark 1: git for-each-ref --format="%(objectname) %(refname)" | grep -vE "^[0-9a-f]{40} refs/pull/"
Time (mean ± σ): 798.1 ms ± 3.3 ms [User: 687.6 ms, System: 146.4 ms]
Range (min … max): 794.5 ms … 805.5 ms 10 runs
Benchmark 2: git.prev for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"
Time (mean ± σ): 98.9 ms ± 1.4 ms [User: 93.1 ms, System: 5.7 ms]
Range (min … max): 97.0 ms … 104.0 ms 29 runs
Benchmark 3: git.compile for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"
Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 0.7 ms, System: 3.8 ms]
Range (min … max): 4.1 ms … 5.8 ms 524 runs
Summary
'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"' ran
21.87 ± 1.05 times faster than 'git.prev for-each-ref --format="%(objectname) %(refname)" --exclude="refs/pull"'
176.52 ± 8.19 times faster than 'git for-each-ref --format="%(objectname) %(refname)" | grep -vE "^[0-9a-f]{40} refs/pull/"'
(Comparing stock git and this patch isn't quite fair, since an earlier
commit in this series adds a naive implementation of the `--exclude`
option. `git.prev` is built from the previous commit and includes this
naive implementation).
Using the jump list is fairly straightforward (see the changes to
`refs/packed-backend.c::next_record()`), but constructing the list is
not. To ensure that the construction is correct, add a new suite of
tests in t1419 covering various corner cases (overlapping regions,
partially overlapping regions, adjacent regions, etc.).
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `find_reference_location()` is used to perform a
binary search-like function over the contents of a repository's
`$GIT_DIR/packed-refs` file.
The search it implements is unlike a standard binary search in that the
records it searches over are not of a fixed width, so the comparison
must locate the end of a record before comparing it.
Extract the core routine of `find_reference_location()` in order to
implement a function in the following patch which will find the first
location in the `packed-refs` file that *doesn't* match the given
pattern.
The behavior of `find_reference_location()` is unchanged.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The subsequent patch will want to access an optional `excluded_patterns`
array within `refs/packed-backend.c` that will cull out certain
references matching any of the given patterns on a best-effort basis.
To do so, the refs subsystem needs to be updated to pass this value
across a number of different locations.
Prepare for a future patch by introducing this plumbing now, passing
NULLs at top-level APIs in order to make that patch less noisy and more
easily readable.
Signed-off-by: Taylor Blau <me@ttaylorr.co>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using `for-each-ref`, it is sometimes convenient for the caller to
be able to exclude certain parts of the references.
For example, if there are many `refs/__hidden__/*` references, the
caller may want to emit all references *except* the hidden ones.
Currently, the only way to do this is to post-process the output, like:
$ git for-each-ref --format='%(refname)' | grep -v '^refs/hidden/'
Which is do-able, but requires processing a potentially large quantity
of references.
Teach `git for-each-ref` a new `--exclude=<pattern>` option, which
excludes references from the results if they match one or more excluded
patterns.
This patch provides a naive implementation where the `ref_filter` still
sees all references (including ones that it will discard) and is left to
check whether each reference matches any excluded pattern(s) before
emitting them.
By culling out references we know the caller doesn't care about, we can
avoid allocating memory for their storage, as well as spending time
sorting the output (among other things). Even the naive implementation
provides a significant speed-up on a modified copy of linux.git (that
has a hidden ref pointing at each commit):
$ hyperfine \
'git.compile for-each-ref --format="%(objectname) %(refname)" | grep -vE "[0-9a-f]{40} refs/pull/"' \
'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude refs/pull/'
Benchmark 1: git.compile for-each-ref --format="%(objectname) %(refname)" | grep -vE "[0-9a-f]{40} refs/pull/"
Time (mean ± σ): 820.1 ms ± 2.0 ms [User: 703.7 ms, System: 152.0 ms]
Range (min … max): 817.7 ms … 823.3 ms 10 runs
Benchmark 2: git.compile for-each-ref --format="%(objectname) %(refname)" --exclude refs/pull/
Time (mean ± σ): 106.6 ms ± 1.1 ms [User: 99.4 ms, System: 7.1 ms]
Range (min … max): 104.7 ms … 109.1 ms 27 runs
Summary
'git.compile for-each-ref --format="%(objectname) %(refname)" --exclude refs/pull/' ran
7.69 ± 0.08 times faster than 'git.compile for-each-ref --format="%(objectname) %(refname)" | grep -vE "[0-9a-f]{40} refs/pull/"'
Subsequent patches will improve on this by avoiding visiting excluded
sections of the `packed-refs` file in certain cases.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`match_pattern()` and `match_name_as_path()` both take a `struct
ref_filter *`, and then store a stack variable `patterns` pointing at
`filter->patterns`.
The subsequent patch will add a new array of patterns to match over (the
excluded patterns, via a new `git for-each-ref --exclude` option),
treating the return value of these functions differently depending on
which patterns are being used to match.
Tweak `match_pattern()` and `match_name_as_path()` to take an array of
patterns to prepare for passing either in.
Once we start passing either in, `match_pattern()` will have little to
do with a particular `struct ref_filter *` instance. To clarify this,
drop it from the argument list, and replace it with the only bit of the
`ref_filter` that we care about (`filter->ignore_case`).
Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We did not bother to clean up at all in `git branch` or `git tag`, and
`git for-each-ref` only cleans up a couple of members.
Add and call `ref_filter_clear()` when cleaning up a `struct
ref_filter`. Running this patch (without any test changes) indicates a
couple of now leak-free tests. This was found by running:
$ make SANITIZE=leak
$ make -C t GIT_TEST_PASSING_SANITIZE_LEAK=check GIT_TEST_OPTS=--immediate
(Note that the `reachable_from` and `unreachable_from` lists should be
cleaned as they are used. So this is just covering any case where we
might bail before running the reachability check.)
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `reach_filter()`, we pop all commits from the reachable lists,
leaving them empty. But because we're operating on a list pointer that
was passed by value, the original `filter.reachable_from` and
`filter.unreachable_from` pointers are left dangling.
As is the case with the previous commit, nobody touches either of these
fields after calling `reach_filter()`, so leaving them dangling is OK.
But this future proofs against dangerous situations.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Provide a sane initialization value for `struct ref_filter`, which in a
subsequent patch will be used to initialize a new field.
In the meantime, ensure that the `ref_filter` struct used in the
test-helper's `cmd__reach()` is zero-initialized. The lack of
initialization is OK, since `commit_contains()` only looks at the single
`with_commit_tag_algo` field that *is* initialized directly above.
So this does not fix a bug, but rather prevents one from biting us in
the future.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The refs machinery has its own implementation of a `ref_filter` (used by
`for-each-ref`), which is distinct from the `ref-filter.h` API (also
used by `for-each-ref`, among other things).
Rename the one within refs.c to more clearly indicate its purpose.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Link to community list of credential helpers. This is useful information
for users.
Describe how OAuth credential helpers work. OAuth is a user-friendly
alternative to personal access tokens and SSH keys. Reduced setup cost
makes it easier for users to contribute to projects across multiple
forges.
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `git commit-graph verify` was taught how to verify commit-graph
chains in 3da4b609bb (commit-graph: verify chains with --shallow mode,
2019-06-18), it produced one line of progress per layer of the
commit-graph chain.
$ git.compile commit-graph verify
Verifying commits in commit graph: 100% (4356/4356), done.
Verifying commits in commit graph: 100% (131912/131912), done.
This could be somewhat confusing to users, who may wonder why there are
multiple occurrences of "Verifying commits in commit graph".
There are likely good arguments on whether or not there should be
one line of progress output per commit-graph layer. On the one hand, the
existing output shows us verifying each individual layer of the chain.
But on the other hand, the fact that a commit-graph may be stored among
multiple layers is an implementation detail that the caller need not be
aware of.
Clarify this by showing a single progress meter regardless of the number
of layers in the commit-graph chain. After this patch, the output
reflects the logical contents of a commit-graph chain, instead of
showing one line of output per commit-graph layer:
$ git.compile commit-graph verify
Verifying commits in commit graph: 100% (136268/136268), done.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the final step to prepare for consolidating the output of `git
commit-graph verify`. Instead of having each call to
`verify_one_commit_graph()` initialize its own progress struct, have the
caller pass one in instead.
This patch does not alter the output of `git commit-graph verify`, but
the next commit will consolidate the output.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have a function which can verify a single layer of a
commit-graph chain, implement `verify_commit_graph()` in terms of
iterating over commit-graphs along their `->base_graph` pointers.
This further prepares us to consolidate the progress output of `git
commit-graph verify`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the `verify_commit_graph()` function was extended to support
commit-graph chains via 3da4b609bb (commit-graph: verify chains with
--shallow mode, 2019-06-18), it did so by recursively calling itself on
each layer of the commit-graph chain.
In practice this poses no issues, since commit-graph chains do not loop,
and there are few enough of them that adding additional frames to the
stack is not a problem.
A future commit will consolidate the progress output from `git
commit-graph verify` when verifying chained commit-graphs to print a
single line instead of one progress meter per commit-graph layer.
Prepare for this by extracting a routine to verify a single layer of a
commit-graph.
Note that `verify_commit_graph()` is still recursive after this patch,
but this will change in the subsequent patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar spirit as the previous commit, address a bug where `git
fsck` produces output when calling `git multi-pack-index verify` even
when invoked with `--no-progress`.
$ git.compile fsck --connectivity-only --no-progress --no-dangling
Verifying OID order in multi-pack-index: 100% (605677/605677), done.
Sorting objects by packfile: 100% (605678/605678), done.
Verifying object offsets: 100% (605678/605678), done.
The three lines produced by `git fsck` come from `git multi-pack-index
verify`, but should be squelched due to `--no-progress`.
The MIDX machinery learned to generate these progress messages as early
as 430efb8a74 (midx: add progress indicators in multi-pack-index
verify, 2019-03-21), but did not respect `--progress` or `--no-progress`
until ad60096d1c (midx: honor the MIDX_PROGRESS flag in
verify_midx_file, 2019-10-21).
But the `git multi-pack-index verify` step was added to fsck in
66ec0390e7 (fsck: verify multi-pack-index, 2018-09-13), pre-dating any
of the above patches.
Pass `--[no-]progress` as appropriate to ensure that we don't produce
output when told not to.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since e0fd51e1d7 (fsck: verify commit-graph, 2018-06-27), `fsck` runs
`git commit-graph verify` to check the integrity of any commit-graph(s).
Originally, the `git commit-graph verify` step would always print to
stdout/stderr, regardless of whether or not `fsck` was invoked with
`--[no-]progress` or not. But in 7371612255 (commit-graph: add
--[no-]progress to write and verify, 2019-08-26), the commit-graph
machinery learned the `--[no-]progress` option, though `fsck` was not
updated to pass this new flag (or not).
This led to seeing output from running `git fsck`, even with
`--no-progress` on repositories that have a commit-graph:
$ git.compile fsck --connectivity-only --no-progress --no-dangling
Verifying commits in commit graph: 100% (4356/4356), done.
Verifying commits in commit graph: 100% (131912/131912), done.
Ensure that `fsck` passes `--[no-]progress` as appropriate when calling
`git commit-graph verify`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The match_pathspec_item() function takes "prefix" value, allowing a
caller to chop off the common leading prefix of pathspec pattern
strings from the path and only use the remainder of the path to
match the pathspec patterns (after chopping the same leading prefix
of them, of course).
This "common leading prefix" optimization has two main features:
* discard the entries in the in-core index that are outside of the
common leading prefix; if you are doing "ls-files one/a one/b",
we know all matches must be from "one/", so first the code
discards all entries outside the "one/" directory from the
in-core index. This allows us to work on a smaller dataset.
* allow skipping the comparison of the leading bytes when matching
pathspec with path. When "ls-files" finds the path "one/a/1" in
the in-core index given "one/a" and "one/b" as the pathspec,
knowing that common leading prefix "one/" was found lets the
pathspec matchinery not to bother comparing "one/" part, and
allows it to feed "a/1" down, as long as the pathspec element
"one/a" gets corresponding adjustment to "a".
When the "attr" pathspec magic is in effect, however, the current
code breaks down.
The attributes, other than the ones that are built-in and the ones
that come from the $GIT_DIR/info/attributes file and the top-level
.gitattributes file, are lazily read from the filesystem on-demand,
as we encounter each path and ask if it matches the pathspec. For
example, if you say "git ls-files "(attr:label)sub/" in a repository
with a file "sub/file" that is given the 'label' attribute in
"sub/.gitattributes":
* The common prefix optimization finds that "sub/" is the common
prefix and prunes the in-core index so that it has only entries
inside that directory. This is desirable.
* The code then walks the in-core index, finds "sub/file", and
eventually asks do_match_pathspec() if it matches the given
pathspec.
* do_match_pathspec() calls match_pathspec_item() _after_ stripping
the common prefix "sub/" from the path, giving it "file", plus
the length of the common prefix (4-bytes), so that the pathspec
element "(attr:label)sub/" can be treated as if it were "(attr:label)".
The last one is what breaks the match in the current code, as the
pathspec subsystem ends up asking the attribute subsystem to find
the attribute attached to the path "file". We need to ask about the
attributes on "sub/file" when calling match_pathspec_attrs(); this
can be done by looking at "prefix" bytes before the beginning of
"name", which is the same trick already used by another piece of the
code in the same match_pathspec_item() function.
Unfortunately this was not discovered so far because the code works
with slightly different arguments, e.g.
$ git ls-files "(attr:label)sub"
$ git ls-files "(attr:label)sub/" "no/such/dir/"
would have reported "sub/file" as a path with the 'label' attribute
just fine, because neither would trigger the common prefix
optimization.
Reported-by: Matthew Hughes <mhughes@uw.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few places failed to differenciate the case where the index is
truly empty (nothing added) and we haven't yet read from the
on-disk index file, which have been corrected.
* js/empty-index-fixes:
commit -a -m: allow the top-level tree to become empty again
split-index: accept that a base index can be empty
do_read_index(): always mark index as initialized unless erroring out
The strbuf_expand_step() loop in userformat_find_requirements() iterates
through the percent signs in the string "fmt", but we're not interested
in its effect on the strbuf "dummy". Use strchr(3) instead and get rid
of the strbuf that we no longer need.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
hex2chr() takes care not to run over the end of a NUL-terminated string.
It's used in packet_length(), but both callers of that function pass a
four-byte buffer, making NUL-checks unnecessary. packet_length() could
accidentally be used with a pointer to a buffer of unknown size at new
call-sites, though, and the compiler wouldn't complain.
Add a size parameter plus check, and remove the NUL-checks by calling
hexval() directly. This trades three NUL checks against one size check
and the ability to report the use of a short buffer at runtime.
If any of the four bytes is NUL or -- more generally -- not a
hexadecimal digit, then packet_length() still returns a negative value.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tree-walk.c:do_match() function takes base_offset but just like
tree_entry_interesting() we dealt with earlier, nobody passes a
value other than 0 in it. Get rid of the parameter to avoid having
to worry about potential bugs lurking unexercised.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tree_entry_interesting() function takes base_offset, allowing
its callers to potentially pass a non-zero number to skip the early
part of the path string.
The feature is never exercised and we do not even know what bugs are
lurking there, as all callers pass 0 to the parameter.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test coverage on attribute magic combined with path pattern
was a bit thin. Let's add a few and make sure "(attr:X)sub" and
"(attr:X)sub/" behave the same.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test for equality with no_flush, which has enough negation already. Get
rid of the unnecessary parentheses while at it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git ls-tree has two prefixes: The one handed to cmd_ls_tree(), i.e. the
current subdirectory in the repository (if any) and the "display" prefix
used by the show_tree_*() functions. The option --full-name clears the
last one, i.e. it shows full paths, and --full-tree clears both, i.e. it
acts as if the command was started in the root of the repository.
The show_tree_*() functions use the ls_tree_options members chomp_prefix
and ls_tree_prefix to determine their prefix values. Calculate it once
in cmd_ls_tree() instead, once the main prefix value is finalized.
This allows chomp_prefix to become a local variable. Stop using
strlen(3) to determine its initial value -- we only care whether we got
a non-empty string, not precisely how long it is.
Rename ls_tree_prefix to prefix to demonstrate that we converted all
users and because the ls_tree_ part is no longer necessary since
030a3d5d9e (ls-tree: use a "struct options", 2023-01-12) turned it from
a global variable to a struct member.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reduce reliance on a global state in the config reading API.
* gc/config-context:
config: pass source to config_parser_event_fn_t
config: add kvi.path, use it to evaluate includes
config.c: remove config_reader from configsets
config: pass kvi to die_bad_number()
trace2: plumb config kvi
config.c: pass ctx with CLI config
config: pass ctx with config files
config.c: pass ctx in configsets
config: add ctx arg to config_fn_t
urlmatch.h: use config_fn_t type
config: inline git_color_default_config
During a cherry-pick or revert session that works on multiple
commits, "git status" did not give correct information, which has
been corrected.
* jk/cherry-pick-revert-status:
fix cherry-pick/revert status when doing multiple commits
"git apply" punts when it is fed too large a patch input; the error
message it gives when it happens has been clarified.
* pw/apply-too-large:
apply: improve error messages when reading patch
'git notes append' was taught '--separator' to specify string to insert
between paragraphs.
* tl/notes-separator:
notes: introduce "--no-separator" option
notes.c: introduce "--[no-]stripspace" option
notes.c: append separator instead of insert by pos
notes.c: introduce '--separator=<paragraph-break>' option
t3321: add test cases about the notes stripspace behavior
notes.c: use designated initializers for clarity
notes.c: cleanup 'strbuf_grow' call in 'append_edit'
Partially revert a sanity check that the rest of the config code
was not ready, to avoid triggering it in a corner case.
* gc/config-partial-submodule-kvi-fix:
config: don't BUG when both kvi and source are set
Move functions that are not about pure string manipulation out of
strbuf.[ch]
* cw/strbuf-cleanup:
strbuf: remove global variable
path: move related function to path
object-name: move related functions to object-name
credential-store: move related functions to credential-store file
abspath: move related functions to abspath
strbuf: clarify dependency
strbuf: clarify API boundary
In some shells, such as bash and zsh, it's possible to use a command
substitution to provide the output of a command as a file argument to
another process, like so:
diff -u <(printf "a\nb\n") <(printf "a\nc\n")
However, this syntax does not produce useful results with "git diff
--no-index". On macOS, the arguments to the command are named pipes
under /dev/fd, and git diff doesn't know how to handle a named pipe. On
Linux, the arguments are symlinks to pipes, so git diff "helpfully"
diffs these symlinks, comparing their targets like "pipe:[1234]" and
"pipe:[5678]".
To address this "diff --no-index" is changed so that if a path given on
the commandline is a named pipe or a symbolic link that resolves to a
named pipe then we read the data to diff from that pipe. This is
implemented by generalizing the code that already exists to handle
reading from stdin when the user passes the path "-".
If the user tries to compare a named pipe to a directory then we die as
we do when trying to compare stdin to a directory.
As process substitution is not support by POSIX this change is tested by
using a pipe and a symbolic link to a pipe.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --no-index" supports reading from stdin with the path "-".
There is no test coverage for this so add a regression test before
changing the code in the next commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If there is an error when reading from stdin then "diff --no-index"
prints an error message but continues to try and diff a file named "-"
resulting in an error message that looks like
error: error while reading from stdin: Invalid argument
fatal: stat '-': No such file or directory
assuming that no file named "-" exists. If such a file exists it prints
the first error message and generates the diff from that file which is
not what the user wanted. Instead just die() straight away if we cannot
read from stdin.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the user runs
git diff --no-index file directory
we follow the behavior of POSIX diff and rewrite the arguments as
git diff --no-index file directory/file
Doing that when "file" is "-" (which means "read from stdin") does not
make sense so we should error out if the user asks us to compare "-" to
a directory. This matches the behavior of GNU diff and diff on *BSD.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the first test in this script, 'creates a report with content in the
right places', we generate a report and pipe it into our helper
`check_all_headers_populated()`. The idea of the helper is to find all
lines that look like headers ("[Some Header Here]") and to check that
the next line is non-empty. This is supposed to catch erroneous outputs
such as the following:
[A Header]
something
more here
[Another Header]
[Too Early Header]
contents
However, we provide the lines of the bug report as filenames to grep,
meaning we mostly end up spewing errors:
grep: : No such file or directory
grep: [System Info]: No such file or directory
grep: git version:: No such file or directory
grep: git version 2.41.0.2.gfb7d80edca: No such file or directory
This doesn't disturb the test, which tugs along and reports success, not
really having verified the contents of the report at all.
Note that after 788a776069 ("bugreport: collect list of populated
hooks", 2020-05-07), the bug report, which is created in our hook-less
test repo, contains an empty section with the enabled hooks. Thus, even
the intention of our helper is a bit misguided: there is nothing
inherently wrong with having an empty section in the bug report.
Let's instead split this test into three: first verify that we generate
a report at all, then check that the introductory blurb looks the way it
should, then verify that the "[System Info]" seems to contain the right
things. (The "[Enabled Hooks]" section is tested later in the script.)
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This table was originally introduced to solely be used with kwset
machinery (0f871cf56e), so it would make sense for it to belong in
kwset.[ch] rather than ctype.c and git-compat-util.h. It is only used in
diffcore-pickaxe.c, which already includes kwset.h so no other headers
have to be modified.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Splitting these macros from git-compat-util.h cleans up the file and
allows future third-party sources to not use these overrides if they do
not wish to.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the functions in wrapper.c are widely used across the codebase,
include it by default in git-compat-util.h. A future patch will remove
now unnecessary inclusions of wrapper.h from other files.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While functions like starts_with() probably should not belong in the
boundaries of the strbuf library, this commit focuses on first splitting
out headers from git-compat-util.h.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The imap_cmd_cb struct has several fields which are totally unused.
Presumably they did useful things in the upstream isync code from which
this is derived, but they don't in our more limited program. This is
particularly confusing for the "done" callback, which (as of the
previous patch) no longer matches the signature of the adjacent "cont"
callback.
Since we're unlikely to share code with isync going forward, we should
feel free to simplify the code here. Note that "done" is examined but
never set, so we can also drop a little bit of code outside of the
struct definition.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's a generic callback mechanism for handling plus-continuation of
IMAP commands. It takes the imap_cmd struct itself as an argument. That
seems reasonable, and in a larger imap-using program it might be used.
But in imap-send, we have only one such callback (auth_cram_md5) and it
doesn't use this value, triggering -Wunused-parameter warnings.
We could just mark the parameter as UNUSED. But since this is the only
such function, and because we are not likely to share code with the
upstream isync anymore, we can just simplify the interface to remove
this parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our caller passes in an imap_server_conf struct, but we ignore it
totally, and instead read the config directly from the global "server"
variable. This works OK, since our sole caller will pass in that same
global variable. But the intent seems to have been to use the passed-in
variable, as otherwise it has no purpose (and many other functions use
the same pattern).
Let's use the passed-in value, which also silences a -Wunused-parameter
warning.
It would be nice if "server" was not a global here, as we could avoid
making similar mistakes. But changing that would be a larger refactor,
as it must be accessed as a global in a few spots (e.g., filling it in
with the config callback).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tutorial in Documentation/MyFirstObjectWalk.txt
contains the functions trace_printf(), oid_to_hex(),
and pp_commit_easy(), and struct oidset, which are used
without any hint of where they are defined. When the provided
code is compiled, the compiler returns an error, stating that
the functions and the struct are used before declaration. Therefore,include
necessary header files (the ones which have no mentions in the tutorial).
Signed-off-by: Vinayak Dev <vinayakdev.sci@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add more "git var" for toolsmiths to learn various locations Git is
configured with either via the configuration or hardcoded defaults.
* bc/more-git-var:
var: add config file locations
var: add attributes files locations
attr: expose and rename accessor functions
var: adjust memory allocation for strings
var: format variable structure with C99 initializers
var: add support for listing the shell
t: add a function to check executable bit
var: mark unused parameters in git_var callbacks
The set-up code for the get_revision() API now allows feeding
options like --all and --not in the --stdin mode.
* ps/revision-stdin-with-options:
revision: handle pseudo-opts in `--stdin` mode
revision: small readability improvement for reading from stdin
revision: reorder `read_revisions_from_stdin()`
When the external merge driver is killed by a signal, its output
should not be trusted as a resolution with conflicts that is
proposed by the driver, but the code did.
* jc/abort-ll-merge-with-a-signal:
t6406: skip "external merge driver getting killed by a signal" test on Windows
ll-merge: killing the external merge driver aborts the merge
Header files cleanup.
* en/header-split-cache-h-part-3: (28 commits)
fsmonitor-ll.h: split this header out of fsmonitor.h
hash-ll, hashmap: move oidhash() to hash-ll
object-store-ll.h: split this header out of object-store.h
khash: name the structs that khash declares
merge-ll: rename from ll-merge
git-compat-util.h: remove unneccessary include of wildmatch.h
builtin.h: remove unneccessary includes
list-objects-filter-options.h: remove unneccessary include
diff.h: remove unnecessary include of oidset.h
repository: remove unnecessary include of path.h
log-tree: replace include of revision.h with simple forward declaration
cache.h: remove this no-longer-used header
read-cache*.h: move declarations for read-cache.c functions from cache.h
repository.h: move declaration of the_index from cache.h
merge.h: move declarations for merge.c from cache.h
diff.h: move declaration for global in diff.c from cache.h
preload-index.h: move declarations for preload-index.c from elsewhere
sparse-index.h: move declarations for sparse-index.c from cache.h
name-hash.h: move declarations for name-hash.c from cache.h
run-command.h: move declarations for run-command.c from cache.h
...
We create .pack and then .idx, we consider only packfiles that have
.idx usable (those with only .pack are not ready yet), so we should
remove .idx before removing .pack for consistency.
* ds/remove-idx-before-pack:
packfile: delete .idx files before .pack files
When reporting a problem, `git fsck` emits a message such as:
missing blob 1234abcd (:file)
However, this can be ambiguous when the problem is detected in the index
of a worktree other than the one in which `git fsck` was invoked. To
address this shortcoming, 592ec63b38 (fsck: mention file path for index
errors, 2023-02-24) enhanced the output to mention the path of the index
when the problem is detected in some other worktree:
missing blob 1234abcd (.git/worktrees/wt/index:file)
Unfortunately, the variable in fsck_index() which controls whether the
index path should be shown is misleadingly named "is_main_index" which
can be misunderstood as referring to the main worktree (i.e. the one
housing the .git/ repository) rather than to the current worktree (i.e.
the one in which `git fsck` was invoked). Avoid such potential confusion
by choosing a name more reflective of its actual purpose.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pretty format %(describe:abbrev=<number>) tells describe to use
at least <number> digits of the oid to generate the human-readable
format of the commit-ish.
There are three things to test here:
- Check that we can describe a commit that is not tagged (that is,
for example our HEAD is at least one commit ahead of some reachable
commit which is tagged) with at least <number> digits of the oid
being used for describing it.
- Check that when using such a commit-ish, we always use at least
<number> digits of the oid to describe it.
- Check that we can describe a tag. This just gives the name of the
tag irrespective of abbrev (abbrev doesn't make sense here).
Do this, instead of the current test which only tests the last case.
Helped-by: Junio C Hamano <gitster@pobox.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 03267e8656 (commit: discard partial cache before (re-)reading it,
2022-11-08), a memory leak was plugged by discarding any partial index
before re-reading it.
The problem with this memory leak fix is that it was based on an
incomplete understanding of the logic introduced in 7168624c35 (Do not
generate full commit log message if it is not going to be used,
2007-11-28).
That logic was introduced to add a shortcut when committing without
editing the commit message interactively. A part of that logic was to
ensure that the index was read into memory:
if (!active_nr && read_cache() < 0)
die(...)
Translation to English: If the index has not yet been read, read it, and
if that fails, error out.
That logic was incorrect, though: It used `!active_nr` as an indicator
that the index was not yet read. Usually this is not a problem because
in the vast majority of instances, the index contains at least one
entry.
And it was natural to do it this way because at the time that condition
was introduced, the `index_state` structure had no explicit flag to
indicate that it was initialized: This flag was only introduced in
913e0e99b6 (unpack_trees(): protect the handcrafted in-core index from
read_cache(), 2008-08-23), but that commit did not adjust the code path
where no index file was found and a new, pristine index was initialized.
Now, when the index does not contain any entry (which is quite
common in Git's test suite because it starts quite a many repositories
from scratch), subsequent calls to `do_read_index()` will mistake the
index not to be initialized, and read it again unnecessarily.
This is a problem because after initializing the empty index e.g. the
`cache_tree` in that index could have been initialized before a
subsequent call to `do_read_index()` wants to ensure an initialized
index. And if that subsequent call mistakes the index not to have been
initialized, it would lead to leaked memory.
The correct fix for that memory leak is to adjust the condition so that
it does not mistake `active_nr == 0` to mean that the index has not yet
been read.
Using the `initialized` flag instead, we avoid that mistake, and as a
bonus we can fix a bug at the same time that was introduced by the
memory leak fix: When deleting all tracked files and then asking `git
commit -a -m ...` to commit the result, Git would internally update the
index, then discard and re-read the index undoing the update, and fail
to commit anything.
This fixes https://github.com/git-for-windows/git/issues/4462
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are about to fix an ancient bug where `do_read_index()` pretended
that the index was not initialized when there are no index entries.
Before the `index_state` structure gained the `initialized` flag in
913e0e99b6 (unpack_trees(): protect the handcrafted in-core index from
read_cache(), 2008-08-23), that was the best we could do (even if it was
incorrect: it is totally possible to read a Git index file that contains
no index entries).
This pattern was repeated also in 998330ac2e (read-cache: look for
shared index files next to the index, too, 2021-08-26), which we fix
here by _not_ mistaking an empty base index for a missing
`sharedindex.*` file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 913e0e99b6 (unpack_trees(): protect the handcrafted in-core index
from read_cache(), 2008-08-23) a flag was introduced into the
`index_state` structure to indicate whether it had been initialized (or
more correctly: read and parsed).
There was one code path that was not handled, though: when the index
file does not yet exist (but the `must_exist` parameter is set to 0 to
indicate that that's okay). In this instance, Git wants to go forward
with a new, pristine Git index, almost as if the file had existed and
contained no index entries or extensions.
Since Git wants to handle this situation the same as if an "empty" Git
index file existed, let's set the `initialized` flag also in that case.
This is necessary to prepare for fixing the bug where the condition
`cache_nr == 0` is incorrectly used as an indicator that the index was
already read, and the condition `initialized != 0` needs to be used
instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The summary under the NAME section for git hash-object can mislead
readers to conclude that the command can only be used to create blobs,
whereas the description makes it clear that it can be used to create
objects, not just blobs. Let's clarify the one-line summary.
Further, the description for the option -t does not list out other types
that can be used when creating objects. Let's make this explicit by
listing out the different object types.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
..so that the callback can use a "struct config_source" parameter
instead of "config_reader.source". "struct config_source" is internal to
config.c, so we are adding a pointer to a struct defined in config.c
into a public function signature defined in config.h, but this is okay
because this function has only ever been (and probably ever will be)
used internally by config.c.
As a result, the_reader isn't used anywhere, so "struct config_reader"
is obsolete (it was only intended to be used with the_reader). Remove
them.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Include directives are evaluated using the path of the config file. To
reduce the dependence on "config_reader.source", add a new
"key_value_info.path" member and use that instead of
"config_source.path". This allows us to remove a "struct config_reader
*" field from "struct config_include_data", which will subsequently
allow us to remove "struct config_reader" entirely.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the last usage of "struct config_reader" from configsets by
copying the "kvi" arg instead of recomputing "kvi" from
config_reader.source. Since we no longer need to pass both "struct
config_reader" and "struct config_set" in a single "void *cb", remove
"struct configset_add_data" too.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.
In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.
Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.
The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure
not to change the message.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a code path starting from trace2_def_param_fl() that eventually
calls current_config_scope(), and thus it needs to have "kvi" plumbed
through it. Additional plumbing is also needed to get "kvi" to
trace2_def_param_fl(), which gets called by two code paths:
- Through tr2_cfg_cb(), which is a config callback, so it trivially
receives "kvi" via the "struct config_context ctx" parameter.
- Through tr2_list_env_vars_fl(), which is a high level function that
lists environment variables for tracing. This has been secretly
behaving like git_config_from_parameters() (in that it parses config
from environment variables/the CLI), but does not set config source
information.
Teach tr2_list_env_vars_fl() to be well-behaved by using
kvi_from_param(), which is used elsewhere for CLI/environment
variable-based config.
As a result, current_config_scope() has no more callers, so remove it.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass config_context when parsing CLI config. To provide the .kvi member,
refactor out kvi_from_param() from the logic that caches CLI config in
configsets. Now that config_context and config_context.kvi is always
present when config machinery calls config callbacks, plumb "kvi" so
that we can remove all calls of current_config_scope() except for
trace2/*.c (which will be handled in a later commit), and remove all
other current_config_*() (the functions themselves and their calls).
Note that this results in .kvi containing a different, more complete
set of information than the mocked up "struct config_source" in
git_config_from_parameters().
Plumbing "kvi" reveals a few places where we've been doing the wrong
thing:
* git_config_parse_parameter() hasn't been setting config source
information, so plumb "kvi" there too.
* Several sites in builtin/config.c have been calling current_config_*()
functions outside of config callbacks (indirectly, via the
format_config() helper), which means they're reading state that isn't
set correctly:
* "git config --get-urlmatch --show-scope" iterates config to collect
values, but then attempts to display the scope after config
iteration, causing the "unknown" scope to be shown instead of the
config file's scope. It's clear that this wasn't intended: we knew
that "--get-urlmatch" couldn't show config source metadata, which is
why "--show-origin" was marked incompatible with "--get-urlmatch"
when it was introduced [1]. It was most likely a mistake that we
allowed "--show-scope" to sneak through.
Fix this by copying the "kvi" value in the collection phase so that
it can be read back later. This means that we can now support "git
config --get-urlmatch --show-origin", but that is left unchanged
for now.
* "git config --default" doesn't have config source metadata when
displaying the default value, so "--show-scope" also results in
"unknown", and "--show-origin" results in a BUG(). Fix this by
treating the default value as if it came from the command line (e.g.
like we do with "git -c" or "git config --file"), using
kvi_from_param().
[1] https://lore.kernel.org/git/20160205112001.GA13397@sigill.intra.peff.net/
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass config_context to config_callbacks when parsing config files. To
provide the .kvi member, refactor out the configset logic that caches
"struct config_source" and "enum config_scope" as a "struct
key_value_info". Make the "enum config_scope" available to the config
file machinery by plumbing an additional arg through
git_config_from_file_with_options().
We do not exercise ctx yet because the remaining current_config_*()
callers may be used with config_with_options(), which may read config
from parameters, but parameters don't pass ctx yet.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass config_context to config callbacks in configset_iter(), trivially
setting the .kvi member to the cached key_value_info. Then, in config
callbacks that are only used with configsets, use the .kvi member to
replace calls to current_config_*(), and delete current_config_line()
because it has no remaining callers.
This leaves builtin/config.c and config.c as the only remaining users of
current_config_*().
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).
In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.
Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:
- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed
Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.
The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:
- trace2/tr2_cfg.c:tr2_cfg_set_fl()
This is indirectly called by git_config_set() so that the trace2
machinery can notice the new config values and update its settings
using the tr2 config parsing function, i.e. tr2_cfg_cb().
- builtin/checkout.c:checkout_main()
This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
This might be worth refactoring away in the future, since
git_xmerge_config() can call git_default_config(), which can do much
more than just parsing.
Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are actually used as config callbacks, so use the typedef-ed type
and make future refactors easier.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_color_default_config() is a shorthand for calling two other config
callbacks. There are no other non-static functions that do this and it
will complicate our refactoring of config_fn_t so inline it instead.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The status report for an in-progress cherry-pick does not show the
current commit if the cherry-pick happens as part of a series of
multiple commits:
$ git cherry-pick <commit1> <commit2>
< one of the cherry-picks fails to merge clean >
Cherry-pick currently in progress.
(run "git cherry-pick --continue" to continue)
(use "git cherry-pick --skip" to skip this patch)
(use "git cherry-pick --abort" to cancel the cherry-pick operation)
$ git status
On branch <branch>
Your branch is ahead of '<upstream>' by 1 commit.
(use "git push" to publish your local commits)
Cherry-pick currently in progress.
(run "git cherry-pick --continue" to continue)
(use "git cherry-pick --skip" to skip this patch)
(use "git cherry-pick --abort" to cancel the cherry-pick operation)
The show_cherry_pick_in_progress() function prints "Cherry-pick
currently in progress". That function does have a more verbose print
based on whether the cherry_pick_head_oid is null or not. If it is not
null, then a more helpful message including which commit is actually
being picked is displayed.
The introduction of the "Cherry-pick currently in progress" message
comes from 4a72486de9 ("fix cherry-pick/revert status after commit",
2019-04-17). This commit modified wt_status_get_state() in order to
detect that a cherry-pick was in progress even if the user has used `git
commit` in the middle of the sequence.
The check used to detect this is the call to sequencer_get_last_command.
If the sequencer indicates that the lass command was a REPLAY_PICK, then
the state->cherry_pick_in_progress is set to 1 and the
cherry_pick_head_oid is initialized to the null_oid. Similar behavior is
done for the case of REPLAY_REVERT.
It happens that this call of sequencer_get_last_command will always
report the action even if the user hasn't interrupted anything. Thus,
during a range of cherry-picks or reverts, the cherry_pick_head_oid and
revert_head_oid will always be overwritten and initialized to the null
oid.
This results in status always displaying the terse message which does
not include commit information.
Fix this by adding an additional check so that we do not re-initialize
the cherry_pick_head_oid or revert_head_oid if we have already set the
cherry_pick_in_progress or revert_in_progress bits. This ensures that
git status will display the more helpful information when its available.
Add a test case covering this behavior.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much like with attributes files, sometimes programs would like to know
the location of configuration files at the global or system levels.
However, it isn't always clear where these may live, especially for the
system file, which may have been hard-coded at compile time or computed
dynamically based on the runtime prefix.
Since other parties cannot intuitively know how Git was compiled and
where it looks for these files, help them by providing variables that
can be queried. Because we have multiple paths for global config
values, print them in order from highest to lowest priority, and be sure
to split on newlines so that "git var -l" produces two entries for the
global value.
However, be careful not to split all values on newlines, since our
editor values could well contain such characters, and we don't want to
split them in such a case.
Note in the documentation that some values may contain multiple paths
and that callers should be prepared for that fact. This helps people
write code that will continue to work in the event we allow multiple
items elsewhere in the future.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, there are some programs which would like to read and parse
the gitattributes files at the global or system levels. However, it's
not always obvious where these files live, especially for the system
file, which may have been hard-coded at compile time or computed
dynamically based on the runtime prefix.
It's not reasonable to expect all callers of Git to intuitively know
where the Git distributor or user has configured these locations to
be, so add some entries to allow us to determine their location. Honor
the GIT_ATTR_NOSYSTEM environment variable if one is specified. Expose
the accessor functions in a way that we can reuse them from within the
var code.
In order to make our paths consistent on Windows and also use the same
form as paths use in "git rev-parse", let's normalize the path before we
return it. This results in Windows-style paths that use slashes, which
is convenient for making our tests function in a consistent way across
platforms. Note that this requires that some of our values be freed, so
let's add a flag about whether the value needs to be freed and use it
accordingly.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now, the functions which determine the current system and global
gitattributes files are not exposed. We'd like to use them in a future
commit, but they're not ideally named. Rename them to something more
suitable as a public interface, expose them, and document them.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now, all of our values are constants whose allocation is managed
elsewhere. However, in the future, we'll have some variables whose
memory we will need to free. To keep things consistent, let's make each
of our functions allocate its own memory and make the caller responsible
for freeing it.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now, we have only two items in our variable struct. However, in
the future, we're going to add two more items. To help keep our diffs
nice and tidy and make this structure easier to read, switch to use
C99-style initializers for our data.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On most Unix systems, finding a suitable shell is easy: one simply uses
"sh" with an appropriate PATH value. However, in many Windows
environments, the shell is shipped alongside Git, and it may or may not
be in PATH, even if Git is.
In such an environment, it can be very helpful to query Git for the
shell it's using, since other tools may want to use the same shell as
well. To help them out, let's add a variable, GIT_SHELL_PATH, that
points to the location of the shell.
On Unix, we know our shell must be executable to be functional, so
assume that the distributor has correctly configured their environment,
and use that as a basic test. On Git for Windows, we know that our
shell will be one of a few fixed values, all of which end in "sh" (such
as "bash"). This seems like it might be a nice test on Unix as well,
since it is customary for all shells to end in "sh", but there probably
exist such systems that don't have such a configuration, so be careful
here not to break them.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In line with our other helper functions for paths, let's add a function
to check whether a path is executable, and if not, print a suitable
error message. Document this function, and note that it must only be
used under the POSIXPERM prerequisite, since it doesn't otherwise work
on Windows.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We abstract the set of variables into a table, with a "read" callback to
provide the value of each. Each callback takes a "flag" argument, but
most callbacks don't make use of it.
This flag is a bit odd. It may be set to IDENT_STRICT, which make sense
for ident-based callbacks, but is just confusing for things like
GIT_EDITOR.
At first glance, it seems like this is just a hack to let us directly
stick the generic git_committer_info() and git_author_info() functions
into our table. And we'd be better off to wrap them with local functions
which pass IDENT_STRICT, and have our callbacks take no option at all.
But that doesn't quite work. We pass IDENT_STRICT when the caller asks
for a specific variable, but otherwise do not (so that "git var -l" does
not bail if the committer ident cannot be formed).
So we really do need to pass in the flag to each invocation, even if the
individual callback doesn't care about it. Let's mark the unused ones so
that -Wunused-parameter does not complain. And while we're here, let's
rename them so that it's clear that the flag values we get will be from
the IDENT_* set. That may prevent confusion for future readers of the
code.
Another option would be to define our own local "strict" flag for the
callbacks, and then have wrappers that translate that to IDENT_STRICT
where it matters. But that would be more boilerplate for little gain
(most functions would still ignore the "strict" flag anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When iterating through config, we read config source metadata from
global values - either a "struct config_source + enum config_scope"
or a "struct key_value_info", using the current_config* functions. Prior
to the series starting from 0c60285147 (config.c: create config_reader
and the_reader, 2023-03-28), we weren't very picky about which values we
should read in which situation; we did note that both groups of values
generally shouldn't be set together, but if both were set,
current_config* preferentially reads key_value_info. When that series
added more structure, we enforced that either the former (when parsing a
config source) can be set, or the latter (when iterating a config set),
but *never* both at the same time. See 9828453ff0 (config.c: remove
current_config_kvi, 2023-03-28) and 5cdf18e7cd (config.c: remove
current_parsing_scope, 2023-03-28).
That was a good simplifying constraint that helped us reason about the
global state, but it turns out that there is at least one situation
where we need both to be set at the same time: in a blobless partial
clone where .gitmodules is missing. "git fetch" in such a repo will
start a config parse over .gitmodules (setting the config_source), and
Git will attempt to lazy-fetch it from the promisor remote. However,
when we try to read the promisor configuration, we start iterating a
config set (setting the key_value_info), and we BUG() out because that's
not allowed any more.
Teaching config_reader to gracefully handle this is somewhat
complicated, but fortunately, there are proposed changes to the config.c
machinery to get rid of this global state, and make the BUG() obsolete
[1]. We should rely on that as the eventual solution, and avoid doing
yet another refactor in the meantime.
Therefore, fix the bug by removing the BUG() check. We're reverting to
an older, less safe state, but that's generally okay since
key_value_info is always preferentially read, so we'd always read the
correct values when we iterate a config set in the middle of a config
parse (like we are here). The reverse would be wrong, but extremely
unlikely to happen since very few callers parse config without going
through a config set.
[1] https://lore.kernel.org/git/pull.1497.v3.git.git.1687290231.gitgitgadget@gmail.com
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a comment on top of add_diff_options, where common diff options are
listed, mentioning __git_diff_common_options in the completion script,
in the hope that contributors update it when they add new diff flags.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--remerge-diff only makes sense for 'git log' and 'git show', so add it
to __git_log_show_options which is referenced in the completion for
these two commands.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The flags --[no-]diff-merges only make sense for 'git log' and 'git
show', so add a new variable __git_log_show_options for options only
relevant to these two commands, and add them there. Also add
__git_diff_merges_opts and list the accepted values for --diff-merges,
and use it in _git_log and _git_show.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The options --pickaxe-all and --pickaxe-regex are listed in
__git_diff_difftool_options and repeated in _git_log. Move them to
__git_diff_common_options instead, which makes them available
automatically in the completion of other commands referencing this
variable.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add --ws-error-highlight= to the list in __git_diff_common_options, and
add the accepted values in a new list __git_ws_error_highlight_opts.
Use __git_ws_error_highlight_opts in _git_diff, _git_log and _git_show
to offer the accepted values.
As noted in fd0bc17557 (completion: add diff --color-moved[-ws],
2020-02-21), there is no easy way to offer completion for several
comma-separated values, so this is limited to completing a single
value.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add --no-relative to __git_diff_common_options in the completion script,
and move --relative from __git_diff_difftool_options to
__git_diff_common_options since it applies to more than just diff and
difftool.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The options --ita-invisible-in-index and --ita-visible-in-index are
listed in diff-options.txt and so are included in the documentation of
commands which include this file (diff, diff-*, log, show, format-patch)
but they only make sense for diffs relating to the index. As such, add
them to '__git_diff_difftool_options' instead of
'__git_diff_common_options' since it makes more sense to add them there.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add descriptive comments for '__git_diff_common_options' and
'__git_diff_difftool_options', so that it is clearer when looking at
these variables to know in which command's completion they are used.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid breakage of "git pack-objects --cruft" due to inconsistency
between the way the code enumerates packfiles in the repository.
* tb/collect-pack-filenames-fix:
builtin/repack.c: only collect fully-formed packs
When "git commit --trailer=..." invokes the interpret-trailers
machinery, it knows what it feeds to interpret-trailers is a full
log message without any patch, but failed to express that by
passing the "--no-divider" option, which has been corrected.
* jk/commit-use-no-divider-with-interpret-trailers:
commit: pass --no-divider to interpret-trailers
Commit f1c0e3946e (apply: reject patches larger than ~1 GiB, 2022-10-25)
added a limit on the size of patch that apply will process to avoid
integer overflows. The implementation re-used the existing error message
for when we are unable to read the patch. This is unfortunate because (a) it
does not signal to the user that the patch is being rejected because it
is too large and (b) it uses error_errno() without setting errno.
This patch adds a specific error message for the case when a patch is
too large. It also updates the existing message to make it clearer that
it is the patch that cannot be read rather than any other file and marks
both messages for translation. The "git apply" prefix is also dropped to
match most of the rest of the error messages in apply.c (there are still
a few error messages that prefixed with "git apply" and are not marked
for translation after this patch). The test added in f1c0e3946e is
updated accordingly.
Reported-by: Premek Vysoky <Premek.Vysoky@microsoft.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 4dc16e2cb0 (gc: introduce `gc.recentObjectsHook`, 2023-06-07), we
added tests to ensure that prune-able (i.e. unreachable and with mtime
older than the cutoff) objects which are marked as recent via the new
`gc.recentObjectsHook` configuration are unpacked as loose with
`--unpack-unreachable`.
In that test, we also ensure that objects which are reachable from other
unreachable objects which were *not* pruned are kept as well, regardless
of their mtimes. For this, we use an annotated tag pointing at a blob
($obj2) which would otherwise be pruned.
But after pruning, that object is kept around for two reasons. One, the
tag object's mtime wasn't adjusted to be beyond the 1-hour cutoff, so it
would be kept as due to its recency regardless. The other reason is
because the tag itself is reachable.
Use mktag to write the tag object directly without pointing a reference
at it, and adjust the mtime of the tag object to be older than the
cutoff to ensure that our `gc.recentObjectsHook` configuration is
working as intended.
Noticed-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The run_command() on the platform does not seem to report death by
signal as the caller expects. For now, skip the test on Windows.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even when diff.ignoreSubmodules tells us to ignore submodule
changes, "git commit" with an index that already records changes to
submodules should include the submodule changes in the resulting
commit, but it did not.
* js/defeat-ignore-submodules-config-with-explicit-addition:
diff-lib: honor override_submodule_config flag bit
Leakfixes
* rj/leakfixes:
tests: mark as passing with SANITIZE=leak
config: fix a leak in git_config_copy_or_rename_section_in_file
branch: fix a leak in cmd_branch
branch: fix a leak in setup_tracking
rev-parse: fix a leak with --abbrev-ref
branch: fix a leak in setup_tracking
branch: fix a leak in check_tracking_branch
branch: fix a leak in inherit_tracking
branch: fix a leak in dwim_and_setup_tracking
remote: fix a leak in query_matches_negative_refspec
config: fix a leak in git_config_copy_or_rename_section_in_file
Gracefully deal with a stale MIDX file that lists a packfile that
no longer exists.
* tb/open-midx-bitmap-fallback:
pack-bitmap.c: gracefully degrade on failure to load MIDX'd pack
"git pack-objects" learned to invoke a new hook program that
enumerates extra objects to be used as anchoring points to keep
otherwise unreachable objects in cruft packs.
* tb/gc-recent-object-hook:
gc: introduce `gc.recentObjectsHook`
reachable.c: extract `obj_is_recent()`
Simplify error message when run-command fails to start a command.
* rs/run-command-exec-error-on-noent:
run-command: report exec error even on ENOENT
t1800: loosen matching of error message for bad shebang
When an external merge driver dies with a signal, we should not
expect that the result left on the filesystem is in any useful
state. However, because the current code uses the return value from
run_command() and declares any positive value as a sign that the
driver successfully left conflicts in the result, and because the
return value from run_command() for a subprocess that died upon a
signal is positive, we end up treating whatever garbage left on the
filesystem as the result the merge driver wanted to leave us.
run_command() returns larger than 128 (WTERMSIG(status) + 128, to be
exact) when it notices that the subprocess died with a signal, so
detect such a case and return LL_MERGE_ERROR from ll_ext_merge().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Suggest to refrain from using hex literals that are non-portable
when writing printf(1) format strings.
* jt/doc-use-octal-with-printf:
CodingGuidelines: use octal escapes, not hex
The reimplemented "git add -i" did not honor color.ui configuration.
* ds/add-i-color-configuration-fix:
add: test use of brackets when color is disabled
add: check color.ui for interactive add
"git cat-file --batch" and friends learned "-Z" that uses NUL
delimiter for both input and output.
* ps/cat-file-null-output:
cat-file: add option '-Z' that delimits input and output with NUL
cat-file: simplify reading from standard input
strbuf: provide CRLF-aware helper to read until a specified delimiter
t1006: modernize test style to use `test_cmp`
t1006: don't strip timestamps from expected results
Introduce a mechanism to disable replace refs globally and per
repository.
* ds/disable-replace-refs:
repository: create read_replace_refs setting
replace-objects: create wrapper around setting
repository: create disable_replace_refs()
The object traversal using reachability bitmap done by
"pack-object" has been tweaked to take advantage of the fact that
using "boundary" commits as representative of all the uninteresting
ones can save quite a lot of object enumeration.
* tb/pack-bitmap-traversal-with-boundary:
pack-bitmap.c: use commit boundary during bitmap traversal
pack-bitmap.c: extract `fill_in_bitmap()`
object: add object_array initializer helper function
'git worktree add' learned how to create a worktree based on an
orphaned branch with `--orphan`.
* ja/worktree-orphan:
worktree add: emit warn when there is a bad HEAD
worktree add: extend DWIM to infer --orphan
worktree add: introduce "try --orphan" hint
worktree add: add --orphan flag
t2400: add tests to verify --quiet
t2400: refactor "worktree add" opt exclusion tests
t2400: cleanup created worktree in test
worktree add: include -B in usage docs
This creates a new fsmonitor-ll.h with most of the functions from
fsmonitor.h, though it leaves three inline functions where they were.
Two-thirds of the files that previously included fsmonitor.h did not
need those three inline functions or the six extra includes those inline
functions required, so this allows them to only include the lower level
header.
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
oidhash() was used by both hashmap and khash, which makes sense.
However, the location of this function in hashmap.[ch] meant that
khash.h had to depend upon hashmap.h, making people unfamiliar with
khash think that it was built upon hashmap. (Or at least, I personally
was confused for a while about this in the past.)
Move this function to hash-ll, so that khash.h can stop depending upon
hashmap.h.
This has another benefit as well: it allows us to remove hashmap.h's
dependency on hash-ll.h. While some callers of hashmap.h were making
use of oidhash, most were not, so this change provides another way to
reduce the number of includes.
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The vast majority of files including object-store.h did not need dir.h
nor khash.h. Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.
After this patch:
$ git grep -h include..object-store | sort | uniq -c
2 #include "object-store.h"
129 #include "object-store-ll.h"
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
khash.h lets you instantiate custom hash types that map between two
types. These are defined as a struct, as you might expect, and khash
typedef's that to kh_foo_t. But it declares the struct anonymously,
which doesn't give a name to the struct type itself; there is no
"struct kh_foo". This has two small downsides:
- when using khash, we declare "kh_foo_t *the_foo". This is
unlike our usual naming style, which is "struct kh_foo *the_foo".
- you can't forward-declare a typedef of an unnamed struct type in
C. So we might do something like this in a header file:
struct kh_foo;
struct bar {
struct kh_foo *the_foo;
};
to avoid having to include the header that defines the real
kh_foo. But that doesn't work with the typedef'd name. Without the
"struct" keyword, the compiler doesn't know we mean that kh_foo is
a type.
So let's always give khash structs the name that matches our
conventions ("struct kh_foo" to match "kh_foo_t"). We'll keep doing
the typedef to retain compatibility with existing callers.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A long term (but rather minor) pet-peeve of mine was the name
ll-merge.[ch]. I thought it made it harder to realize what stuff was
related to merging when I was working on the merge machinery and trying
to improve it.
Further, back in d1cbe1e6d8 ("hash-ll.h: split out of hash.h to remove
dependency on repository.h", 2023-04-22), we have split the portions of
hash.h that do not depend upon repository.h into a "hash-ll.h" (due to
the recommendation to use "ll" for "low-level" in its name[1], but which
I used as a suffix precisely because of my distaste for "ll-merge").
When we discussed adding additional "*-ll.h" files, a request was made
that we use "ll" consistently as either a prefix or a suffix. Since it
is already in use as both a prefix and a suffix, the only way to do so
is to rename some files.
Besides my distaste for the ll-merge.[ch] name, let me also note that
the files
ll-fsmonitor.h, ll-hash.h, ll-merge.h, ll-object-store.h, ll-read-cache.h
would have essentially nothing to do with each other and make no sense
to group. But giving them the common "ll-" prefix would group them. Using
"-ll" as a suffix thus seems just much more logical to me. Rename
ll-merge.[ch] to merge-ll.[ch] to achieve this consistency, and to
ensure we get a more logical grouping of files.
[1] https://lore.kernel.org/git/kl6lsfcu1g8w.fsf@chooglen-macbookpro.roam.corp.google.com/
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The include of wildmatch.h in git-compat-util.h was added in cebcab189a
(Makefile: add USE_WILDMATCH to use wildmatch as fnmatch, 2013-01-01) as
a way to be able to compile-time force any calls to fnmatch() to instead
invoke wildmatch(). The defines and inline function were removed in
70a8fc999d (stop using fnmatch (either native or compat), 2014-02-15),
and this include in git-compat-util.h has been unnecessary ever since.
Remove the include from git-compat-util.h, but add it to the .c files
that had omitted the direct #include they needed.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This also made it clear that a few .c files under builtin/ were
depending upon some headers but had forgotten to #include them. Add the
missing direct includes while at it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This also made it clear that several .c files depended upon various
things that oidset included, but had omitted the direct #include for
those headers. Add those now.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this header showed up in some places besides just #include
statements, update/clean-up/remove those other places as well.
Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got
away with violating the rule that all files must start with an include
of git-compat-util.h (or a short-list of alternate headers that happen
to include it first). This change exposed the violation and caused it
to stop building correctly; fix it by having it include
git-compat-util.h first, as per policy.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For the functions defined in read-cache.c, move their declarations from
cache.h to a new header, read-cache-ll.h. Also move some related inline
functions from cache.h to read-cache.h. The purpose of the
read-cache-ll.h/read-cache.h split is that about 70% of the sites don't
need the inline functions and the extra headers they include.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
the_index is a global variable defined in repository.c; as such, its
declaration feels better suited living in repository.h rather than
cache.h. Move it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have a preload-index.c file; move the declarations for the
functions in that file into a new preload-index.h. These were
previously split between cache.h and repository.h.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Note in particular that this reverses the decision made in 118a2e8bde
("cache: move ensure_full_index() to cache.h", 2021-04-01).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions do not depend upon struct cache_entry or struct
index_state in any way, and it seems more logical to break them out into
this file, especially since statinfo.h already has the struct stat_data
declaration.
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function add_files_to_cache(), plus associated helper functions,
were defined in builtin/add.c, but also shared with builtin/checkout.c
and builtin/commit.c. Move these shared functions to read-cache.c.
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function add_files_to_cache() is used by all three of builtin/{add,
checkout, commit}.c. That suggests this is common library code, and
should be moved somewhere else, like read-cache.c. However, the
function and its helpers made use of two global variables that made
straight code movement difficult:
* the_index
* include_sparse
The latter was perhaps more problematic since it was only accessible in
builtin/add.c but was still affecting builtin/checkout.c and
builtin/commit.c without this fact being very clear from the code. I'm
not sure if the other two callers would want to add a `--sparse` flag
similar to add.c to get non-default behavior, but exposing this
dependence will help if we ever decide we do want to add such a flag.
Modify add_files_to_cache() and its helpers to accept the necessary
arguments instead of relying on globals.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function overlay_tree_on_index(), plus associated helper functions,
were defined in builtin/ls-files.c, but also shared with
builtin/commit.c. Move these shared functions to read-cache.c.
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functions init_db() and initialize_repository_version() were shared
by builtin/init-db.c and builtin/clone.c, and declared in cache.h.
Move these functions, plus their several helpers only used by these
functions, to setup.[ch].
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much like the parent commit, this commit was prompted by a desire to
move the functions which builtin/init-db.c and builtin/clone.c share out
of the former file and into setup.c. A secondary issue that made it
difficult was the init_shared_repository global variable; replace it
with a simple parameter that is passed to the relevant functions.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit was prompted by a desire to move the functions which
builtin/init-db.c and builtin/clone.c share out of the former file and
into setup.c. One issue that made it difficult was the
init_is_bare_repository global variable.
init_is_bare_repository's sole use in life it to cache a value in
init_db(), and then be used in create_default_files(). This is a bit
odd since init_db() directly calls create_default_files(), and is the
only caller of that function. Convert the global to a simple function
parameter instead.
(Of course, this doesn't fix the fact that this value is then ignored by
create_default_files(), as noted in a big TODO comment in that function,
but it at least includes no behavioral change other than getting rid of
a very questionable global variable.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comments in create_default_files() talks about reading config from
the config file in the specified `--templates` directory, which leads to
the question of whether core.bare could be set in such a config file and
thus whether the code is doing the right thing. It turns out, that it
doesn't; it unconditionally ignores core.bare in the config file in any
--templates directory. It is not clear to me that fixing it can be done
within this function; it seems to occur too late:
* create_default_files() is called by init_db()
* init_db() is called by both builtin/{clone.c,init-db.c}
* both callers of init_db() call set_git_work_tree() before init_db()
and in order to actual affect whether a repository is bear, we'd need to
somewhere reset these values, not just the is_bare_repository_cfg
setting.
I do not want to open this can of worms at this time; I'm trying to
clean up some headers, for which I need to move some functions, for
which I need to clean up some globals, and that's far enough down the
rabbit hole. So, simply document the issue with a careful TODO comment
and a few testcases.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit introduces a new option "--[no-]stripspace" to git notes
append, git notes edit, and git notes add. This option allows users to
control whether the note message need to stripped out.
For the consideration of backward compatibility, let's look at the
behavior about "stripspace" in "git notes" command:
1. "Edit Message" case: using the default editor to edit the note
message.
In "edit" case, the edited message will always be stripped out, the
implementation which can be found in the "prepare_note_data()". In
addition, the "-c" option supports to reuse an existing blob as a
note message, then open the editor to make a further edition on it,
the edited message will be stripped.
This commit doesn't change the default behavior of "edit" case by
using an enum "notes_stripspace", only when "--no-stripspace" option
is specified, the note message will not be stripped out. If you do
not specify the option or you specify "--stripspace", clearly, the
note message will be stripped out.
2. "Assign Message" case: using the "-m"/"-F"/"-C" option to specify the
note message.
In "assign" case, when specify message by "-m" or "-F", the message
will be stripped out by default, but when specify message by "-C",
the message will be copied verbatim, in other word, the message will
not be stripped out. One more thing need to note is "the order of
the options matter", that is, if you specify "-C" before "-m" or
"-F", the reused message by "-C" will be stripped out together,
because everytime concat "-m" or "-F" message, the concated message
will be stripped together. Oppositely, if you specify "-m" or "-F"
before "-C", the reused message by "-C" will not be stripped out.
This commit doesn't change the default behavior of "assign" case by
extending the "stripspace" field in "struct note_msg", so we can
distinguish the different behavior of "-m"/"-F" and "-C" options
when we need to parse and concat the message.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename "insert_separator" to "append_separator" and also remove the
"postion" argument, this serves two purpose:
The first is that when specifying more than one "-m" ( like "-F", etc)
to "git notes add" or "git notes append", the order of them matters,
which means we need to append the each separator and message in turn,
so we don't have to make the caller specify the position, the "append"
operation is enough and clear.
The second is that when we execute the "git notes append" subcommand,
we need to combine the "prev_note" and "current_note" to get the
final result. Before, we inserted a newline character at the beginning
of "current_note". Now, we will append a newline to the end of
"prev_note" instead, this will give the consisitent results.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When adding new notes or appending to an existing notes, we will
insert a blank line between the paragraphs, like:
$ git notes add -m foo -m bar
$ git notes show HEAD
foo
bar
The default behavour sometimes is not enough, the user may want
to use a custom delimiter between paragraphs, like when
specifying '-m', '-F', '-C', '-c' options. So this commit
introduce a new '--separator' option for 'git notes add' and
'git notes append', for example when executing:
$ git notes add -m foo -m bar --separator="-"
$ git notes show HEAD
foo
-
bar
a newline is added to the value given to --separator if it
does not end with one already. So when executing:
$ git notes add -m foo -m bar --separator="-"
and
$ export LF="
"
$ git notes add -m foo -m bar --separator="-$LF"
Both the two exections produce the same result.
The reason we use a "strbuf" array to concat but not "string_list", is
that the binary file content may contain '\0' in the middle, this will
cause the corrupt result if using a string to save.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct note_data d = { 0, 0, NULL, STRBUF_INIT };" style could be
replaced with designated initializer for clarity.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's cleanup the unnecessary 'strbuf_grow' call in 'append_edit'. This
"strbuf_grow(&d.buf, size + 1);" is prepared for insert a blank line if
needed, but actually when inserting, "strbuf_insertstr(&d.buf, 0,
"\n");" will do the "grow" for us.
348f199b (builtin-notes: Refactor handling of -F option to allow
combining -m and -F, 2010-02-13) added these to mimic the code
introduced by 2347fae5 (builtin-notes: Add "append" subcommand for
appending to note objects, 2010-02-13) that reads in previous note
before the message. And the resulting code with explicit sizing is
carried to this day.
In the context of reading an existing note in, exact sizing may have
made sense, but because the resulting note needs cleansing with
stripspace() when appending with this option, such an exact sizing
does not buy us all that much in practice.
It may help avoiding overallocation due to ALLOC_GROW() slop, but
nobody can feed so many long messages for it to matter from the
command line.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git [-c log.follow=true] log [--follow] ':(glob)f**'" used to barf.
* jk/log-follow-with-non-literal-pathspec:
diff: detect pathspec magic not supported by --follow
diff: factor out --follow pathspec check
pathspec: factor out magic-to-name function
The value of config.worktree is per-repository, but has been kept
in a singleton global variable per process. This has been OK as
most Git operations interacted with a single repository at a time,
but not right for operations like recursive "grep" that want to
access multiple repositories from a single process without forking.
The global variable has been eliminated and made into a member in
the per-repository data structure.
* vd/worktree-config-is-per-repository:
repository: move 'repository_format_worktree_config' to repo scope
config: pass 'repo' directly to 'config_with_options()'
config: use gitdir to get worktree config
"git submodule" code trusted the data coming from the config (and
the in-tree .gitmodules file) too much without validating, leading
to NULL dereference if the user mucks with a repository (e.g.
submodule.<name>.url is removed). This has been corrected.
* tb/submodule-null-deref-fix:
builtin/submodule--helper.c: handle missing submodule URLs
Test style updates.
* jc/test-modernization-2:
t9400-git-cvsserver-server: modernize test format
t9200-git-cvsexportcommit: modernize test format
t9104-git-svn-follow-parent: modernize test format
t9100-git-svn-basic: modernize test format
t7700-repack: modernize test format
t7600-merge: modernize test format
t7508-status: modernize test format
t7201-co: modernize test format
t7111-reset-table: modernize test format
t7110-reset-merge: modernize test format
* jc/test-modernization:
t7101-reset-empty-subdirs: modernize test format
t6050-replace: modernize test format
t5306-pack-nobase: modernize test format
t5303-pack-corruption-resilience: modernize test format
t5301-sliding-window: modernize test format
t5300-pack-object: modernize test format
t4206-log-follow-harder-copies: modernize test format
t4202-log: modernize test format
t4004-diff-rename-symlink: modernize test format
t4003-diff-rename-1: modernize test format
t4002-diff-basic: modernize test format
t3903-stash: modernize test format
t3700-add: modernize test format
t3500-cherry: modernize test format
t1006-cat-file: modernize test format
t1002-read-tree-m-u-2way: modernize test format
t1001-read-tree-m-2way: modernize test format
t3210-pack-refs: modernize test format
t0030-stripspace: modernize test format
t0000-basic: modernize test format
Document more pseudo-refs and teach the command line completion
machinery to complete AUTO_MERGE.
* pb/complete-and-document-auto-merge-and-friends:
completion: complete AUTO_MERGE
Documentation: document AUTO_MERGE
git-merge.txt: modernize word choice in "True merge" section
completion: complete REVERT_HEAD and BISECT_HEAD
revisions.txt: document more special refs
revisions.txt: use description list for special refs
Clang's sanitizer implementation seems to work better than GCC's.
* jk/ci-use-clang-for-sanitizer-jobs:
ci: drop linux-clang job
ci: run ASan/UBSan in a single job
ci: use clang for ASan/UBSan checks
Code clean-up.
* ps/fetch-cleanups:
fetch: use `fetch_config` to store "submodule.fetchJobs" value
fetch: use `fetch_config` to store "fetch.parallel" value
fetch: use `fetch_config` to store "fetch.recurseSubmodules" value
fetch: use `fetch_config` to store "fetch.showForcedUpdates" value
fetch: use `fetch_config` to store "fetch.pruneTags" value
fetch: use `fetch_config` to store "fetch.prune" value
fetch: pass through `fetch_config` directly
fetch: drop unneeded NULL-check for `remote_ref`
fetch: drop unused DISPLAY_FORMAT_UNKNOWN enum value
When installing a packfile, we place the .pack file before the .idx
file. The intention is that Git scans for .idx files in the pack
directory and then loads the .pack files from that list.
However, when we delete packfiles, we do not do this in the reverse
order as we should. The unlink_pack_path() method deletes the .pack
followed by the .idx.
This creates a window where the process could be interrupted between
the .pack deletion and the .idx deletion, leaving the repository in a
state that looks strange, but isn't actually too problematic if we
assume the pack was safe to delete. The .idx without a .pack will cause
some overhead, but will not interrupt other Git processes.
This ordering was introduced into the 'git repack' builtin by
a1bbc6c017 (repack: rewrite the shell script in C, 2013-09-15), though
we must be careful to track history through the code move in 8434e85d5f
(repack: refactor pack deletion for future use, 2019-06-10) to see that.
This became more important after 73320e49ad (builtin/repack.c: only
collect fully-formed packs, 2023-06-07) changed how 'git repack' scanned
for packfiles for use in the cruft pack process. It previously looked
for .pack files, but that was problematic due to the order that packs
are installed: repacks between the creation of a .pack and the creation
of its .idx would result in hard failures.
There is an independent proposal about what to do in the case of a .idx
without a .pack during this 'git repack' scenario, but this change is
focused on deleting .pack files more safely.
Modify the order to delete the .idx before the .pack. The rest of the
modifiers on the .pack should still come after the .pack so we know all
of the presumed properties of the packfile as long as it exists in the
filesystem, in case we wish to reinstate it by re-indexing the .pack
file.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that strbuf_expand_literal_cb() is no longer used as a callback,
drop its "_cb" name suffix and unused context parameter.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid the overhead of passing context to a callback function of
strbuf_expand() by using strbuf_expand_step() in a loop instead. It
requires explicit handling of %% and unrecognized placeholders, but is
simpler, more direct and avoids void pointers.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid the overhead of setting up a dictionary and passing it via
strbuf_expand() to strbuf_expand_dict_cb() by using strbuf_expand_step()
in a loop instead. It requires explicit handling of %% and unrecognized
placeholders, but is more direct and simpler overall, and expands only
on demand.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extract the part of strbuf_expand that finds the next placeholder into a
new function. It allows to build parsers without callback functions and
the overhead imposed by them.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Deduplicate the code for setting the options "separator" and
"key_value_separator" by moving it into a new helper function,
expand_separator().
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When redacting auth tokens in trace output from curl, we look for http/2
headers of the form "h2h3 [header: value]". This comes from b637a41ebe
(http: redact curl h2h3 headers in info, 2022-11-11).
But the "h2h3" prefix changed to just "h2" in curl's fc2f1e547 (http2:
support HTTP/2 to forward proxies, non-tunneling, 2023-04-14). That's in
released version curl 8.1.0; linking against that version means we'll
fail to correctly redact the trace. Our t5559.17 notices and fails.
We can fix this by matching either prefix, which should handle both old
and new versions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tests listed below, since previous commits, no longer trigger any
leak.
+ t1507-rev-parse-upstream.sh
+ t1508-at-combinations.sh
+ t1514-rev-parse-push.sh
+ t2027-checkout-track.sh
+ t3200-branch.sh
+ t3204-branch-name-interpretation.sh
+ t5404-tracking-branches.sh
+ t5517-push-mirror.sh
+ t5525-fetch-tagopt.sh
+ t6040-tracking-info.sh
+ t7508-status.sh
Let's mark them with "TEST_PASSES_SANITIZE_LEAK=true" to notice and fix
promptly any new leak that may be introduced and triggered by them in
the future.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A branch can have its configuration spread over several configuration
sections. This situation was already foreseen in 52d59cc645 (branch:
add a --copy (-c) option to go with --move (-m), 2017-06-18) when
"branch -c" was introduced.
Unfortunately, a leak was also introduced:
$ git branch foo
$ cat >> .git/config <<EOF
[branch "foo"]
some-key-a = a value
[branch "foo"]
some-key-b = b value
[branch "foo"]
some-key-c = c value
EOF
$ git branch -c foo bar
Direct leak of 130 byte(s) in 2 object(s) allocated from:
... in xrealloc wrapper.c
... in strbuf_grow strbuf.c
... in strbuf_vaddf strbuf.c
... in strbuf_addf strbuf.c
... in store_create_section config.c
... in git_config_copy_or_rename_section_in_file config.c
... in git_config_copy_section_in_file config.c
... in git_config_copy_section config.c
... in copy_or_rename_branch builtin/branch.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
Let's fix it.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 98e7ab6d42 (for-each-ref: delay parsing of --sort=<atom> options,
2021-10-20) a new string_list was introduced to accumulate any
"branch.sort" setting.
That string_list is cleared in ref_sorting_options(), which is only
called when processing the "--list" sub-command. Therefore, with other
sub-command, while having any sort option set, a leak is produced, e.g.:
$ git config branch.sort invalid_sort_option
$ git branch --edit-description
Direct leak of 384 byte(s) in 1 object(s) allocated from:
... in xrealloc wrapper.c
... in string_list_append_nodup string-list.c
... in string_list_append string-list.c
... in git_branch_config builtin/branch.c
... in configset_iter config.c
... in repo_config config.c
... in git_config config.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
Indirect leak of 20 byte(s) in 1 object(s) allocated from:
... in xstrdup wrapper.c
... in string_list_append string-list.c
... in git_branch_config builtin/branch.c
... in configset_iter config.c
... in repo_config config.c
... in git_config config.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
We don't have a common clean-up section in cmd_branch(). To avoid
refactoring, keep the fix simple, and while we find a better solution
which hopefuly will avoid entirely that string_list, when no sort
options are needed; let's squelch the leak sanitizer using UNLEAK().
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In bdaf1dfae7 (branch: new autosetupmerge option "simple" for matching
branches, 2022-04-29) a new exit for setup_tracking() missed the
clean-up, producing a leak.
$ git config branch.autoSetupMerge simple
$ git remote add local .
$ git update-ref refs/remotes/local/foo HEAD
$ git branch bar local/foo
Direct leak of 384 byte(s) in 1 object(s) allocated from:
... in xrealloc wrapper.c
... in string_list_append_nodup string-list.c
... in find_tracked_branch branch.c
... in for_each_remote remote.c
... in setup_tracking branch.c
... in create_branch branch.c
... in cmd_branch builtinbranch.c
... in run_builtin git.c
Indirect leak of 24 byte(s) in 1 object(s) allocated from:
... in xrealloc wrapper.c
... in strbuf_grow strbuf.c
... in strbuf_add strbuf.c
... in match_name_with_pattern remote.c
... in query_refspecs remote.c
... in remote_find_tracking remote.c
... in find_tracked_branch branch.c
... in for_each_remote remote.c
... in setup_tracking branch.c
... in create_branch branch.c
... in cmd_branch builtinbranch.c
... in run_builtin git.c
The return introduced in bdaf1dfae7 was to avoid setting up the
tracking, but even in that case it is still necessary to do the
clean-up. Let's do it.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To handle "--abbrev-ref" we use shorten_unambiguous_ref(). This
function takes a refname and returns a shortened refname, which is a
newly allocated string that needs to be freed.
Unfortunately, the refname variable is reused to receive the shortened
one. Therefore, we lose the original refname, which needs to be freed
as well, producing a leak.
This leak can be reviewed with:
$ for a in {1..10}; do git branch foo_${a}; done
$ git rev-parse --abbrev-ref refs/heads/foo_{1..10}
Direct leak of 171 byte(s) in 10 object(s) allocated from:
... in xstrdup wrapper.c
... in expand_ref refs.c
... in repo_dwim_ref refs.c
... in show_rev builtin/rev-parse.c
... in cmd_rev_parse builtin/rev-parse.c
... in run_builtin git.c
We have this leak since a45d34691e (rev-parse: --abbrev-ref option to
shorten ref name, 2009-04-13) when "--abbrev-ref" was introduced.
Let's fix it.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git-commit sees any "--trailer" options, it passes the
COMMIT_EDITMSG file through git-interpret-trailers. But it does so
without passing --no-divider, which means that interpret-trailers will
look for a "---" divider to signal the end of the commit message.
That behavior doesn't make any sense in this context; we know we have a
complete and solitary commit message, not something we have to further
parse. And as a result, we'll do the wrong thing if the commit message
contains a "---" marker (which otherwise is not syntactically
significant), inserting any new trailers at the wrong spot.
We can fix this by passing --no-divider. This is the exact situation for
which it was added in 1688c9a489 (interpret-trailers: allow suppressing
"---" divider, 2018-08-22). As noted in the message for that commit, it
just adds the mechanism, and further patches were needed to trigger it
from various callers. We did that back then in a few spots, like
ffce7f590f (sequencer: ignore "---" divider when parsing trailers,
2018-08-22), but obviously missed this one.
Reported-by: <eric.frederich@siemens.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
discover_git_directory() started modifying the_repository in ebaf3bcf1a
(repository: move global r_f_p_c to repo struct, 2021-06-17), when, in
the repository setup process, we started copying members from the
"struct repository_format" we're inspecting to the appropriate "struct
repository". However, discover_git_directory() isn't actually used in
the setup process (its only caller in the Git binary is
read_early_config()), so it shouldn't be doing this setup at all!
As explained by 16ac8b8db6 (setup: introduce the
discover_git_directory() function, 2017-03-13) and the comment on its
declaration, discover_git_directory() is intended to be an entrypoint
into setup.c machinery that allows the Git directory to be discovered
without side effects, e.g. so that read_early_config() can read
".git/config" before the_repository has been set up.
Fortunately, we didn't start to rely on this unintended behavior between
then and now, so we let's just remove it. It isn't harming anyone, but
it's confusing.
Signed-off-by: Glen Choo <chooglen@google.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`credential reject` sends the erase action to each helper, but the
exact behaviour of erase isn't specified in documentation or tests.
Some helpers (such as credential-store and credential-libsecret) delete
all matching credentials, others (such as credential-cache) delete at
most one matching credential.
Test that helpers erase all matching credentials. This behaviour is
easiest to reason about. Users expect that `echo
"url=https://example.com" | git credential reject` or `echo
"url=https://example.com\nusername=tim" | git credential reject` erase
all matching credentials.
Fix credential-cache.
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test that credential helpers do not erase a password distinct from the
input. Such calls can happen when multiple credential helpers are
configured.
Fixes for credential-cache and credential-store.
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While both git-rev-list(1) and git-log(1) support `--stdin`, it only
accepts commits and files. Most notably, it is impossible to pass any of
the pseudo-opts like `--all`, `--glob=` or others via stdin.
This makes it hard to use this function in certain scripted scenarios,
like when one wants to support queries against specific revisions, but
also against reference patterns. While this is theoretically possible by
using arguments, this may run into issues once we hit platform limits
with sufficiently large queries. And because `--stdin` cannot handle
pseudo-opts, the only alternative would be to use a mixture of arguments
and standard input, which is cumbersome.
Implement support for handling pseudo-opts in both commands to support
this usecase better. One notable restriction here is that `--stdin` only
supports "stuck" arguments in the form of `--glob=foo`. This is because
"unstuck" arguments would also require us to read the next line, which
would add quite some complexity to the code. This restriction should be
fine for scripted usage though.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code that reads lines from standard input manually compares whether
the read line matches "--", which is a bit awkward to read. Furthermore,
we're about to extend the code to also support reading pseudo-options
via standard input, requiring more elaborate handling of lines with a
leading dash.
Refactor the code by hoisting out the check for "--" outside of the
block that checks for a leading dash.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reorder `read_revisions_from_stdin()` so that we can start using
`handle_revision_pseudo_opt()` without a forward declaration in a
subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ls-tree --format expands %x followed by two hexadecimal digits to the
character indicated by that hexadecimal number, e.g.:
$ git ls-tree --format=%x41 HEAD | head -1
A
It rejects % followed by a hexadecimal digit, e.g.:
$ git ls-tree --format=%41 HEAD | head -1
fatal: bad ls-tree format: element '41' does not start with '('
This functionality is provided by strbuf_expand_literal_cb(), which has
not been changed since it was factored out by fd2015b323 (strbuf:
separate callback for strbuf_expand:ing literals, 2019-01-28).
Adjust the documentation accordingly.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Be more up-front about what trailers are in practice with examples, to
give the reader a visual cue while they go on to read the rest of the
description.
Also add an example for multiline values.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'key' option is used frequently in the examples at the bottom but
there is no mention of it in the description.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This puts the deprecation notice up front, instead of leaving it to the
next paragraph.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already use angle brackets elsewhere, so this makes things more
consistent.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The phrase "many rules" gets essentially repeated again with "many other
rules", so remove this repetition.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, "message" could mean the input, output, commit message, or
"internal body text inside the commit message" (in the EXAMPLES
section). Avoid overloading this term by using the appropriate meanings
explicitly.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command can take inputs that are either just a commit message, or
an email-like output such as git-format-patch which includes a commit
message, "---" divider, and patch part. The existing explanation blends
these two inputs together in the first sentence
This command reads some patches or commit messages
which then necessitates using the "commit message part" phrasing (as
opposed to just "commit message") because the input is ambiguous per the
above definition.
This change separates the two input types and explains them separately,
and so there is no longer a need to use the "commit message part"
phrase.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This matches the order already used in the NAME section.
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the shell-scripting section of CodingGuidelines to suggest octal
escape sequences (e.g. "\302\242") over hexadecimal (e.g. "\xc2\xa2")
since the latter can be a source of portability problems.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `diff.ignoreSubmodules = all` is set and submodule commits are
manually staged (e.g. via `git-update-index`), `git-commit` should
record the commit with updated submodules.
`index_differs_from` is called from `prepare_to_commit` with flags set to
`override_submodule_config = 1`. `index_differs_from` then merges the
default diff flags and passed flags.
When `diff.ignoreSubmodules` is set to "all", `flags` ends up having
both `override_submodule_config` and `ignore_submodules` set to 1. This
results in `git-commit` ignoring staged commits.
This patch restores original `flags.ignore_submodule` if
`flags.override_submodule_config` is set.
Signed-off-by: Josip Sokcevic <sokcevic@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some atoms that can be used in "--format=<format>" for "git ls-tree"
were not supported by "git ls-files", even though they were relevant
in the context of the latter.
* zh/ls-files-format-atoms:
ls-files: align format atoms with ls-tree
"git pack-refs" learns "--include" and "--exclude" to tweak the ref
hierarchy to be packed using pattern matching.
* jc/pack-ref-exclude-include:
pack-refs: teach pack-refs --include option
pack-refs: teach --exclude option to exclude refs from being packed
docs: clarify git-pack-refs --all will pack all refs
Doc update.
* sa/doc-ls-remote:
ls-remote doc: document the output format
ls-remote doc: explain what each example does
ls-remote doc: show peeled tags in examples
ls-remote doc: remove redundant --tags example
show-branch doc: say <ref>, not <reference>
show-ref doc: update for internal consistency
The "-s" (silent, squelch) option of the "diff" family of commands
did not interact with other options that specify the output format
well. This has been cleaned up so that it will clear all the
formatting options given before.
* jc/diff-s-with-other-options:
diff: fix interaction between the "-s" option and other options
"git tag" learned to leave the "$GIT_DIR/TAG_EDITMSG" file when the
command failed, so that the user can salvage what they typed.
* kh/keep-tag-editmsg-upon-failure:
tag: keep the message file in case ref transaction fails
t/t7004-tag: add regression test for successful tag creation
doc: tag: document `TAG_EDITMSG`
The commit d3115660b4 (branch: add flags and config to inherit tracking,
2021-12-20) replaced in "struct tracking", the member "char *src" by a
new "struct string_list *srcs".
This caused a modification in find_tracked_branch(). The string
returned by remote_find_tracking(), previously assigned to "src", is now
added to the string_list "srcs".
That string_list is initialized with STRING_LIST_INIT_DUP, which means
that what is added is not the given string, but a duplicate. Therefore,
the string returned by remote_find_tracking() is leaked.
The leak can be reviewed with:
$ git branch foo
$ git remote add local .
$ git fetch local
$ git branch --track bar local/foo
Direct leak of 24 byte(s) in 1 object(s) allocated from:
... in xrealloc wrapper.c
... in strbuf_grow strbuf.c
... in strbuf_add strbuf.c
... in match_name_with_pattern remote.c
... in query_refspecs remote.c
... in remote_find_tracking remote.c
... in find_tracked_branch branch.c
... in for_each_remote remote.c
... in setup_tracking branch.c
... in create_branch branch.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
Let's fix the leak, using the "_nodup" API to add to the string_list.
This way, the string itself will be added instead of being strdup()'d.
And when the string_list is cleared, the string will be freed.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's fix a leak we have in check_tracking_branch() since the function
was introduced in 41c21f22d0 (branch.c: Validate tracking branches with
refspecs instead of refs/remotes/*, 2013-04-21).
The leak can be reviewed with:
$ git remote add local .
$ git update-ref refs/remotes/local/foo HEAD
$ git branch --track bar local/foo
Direct leak of 24 byte(s) in 1 object(s) allocated from:
... in xrealloc wrapper.c
... in strbuf_grow strbuf.c
... in strbuf_add strbuf.c
... in match_name_with_pattern remote.c
... in query_refspecs remote.c
... in remote_find_tracking remote.c
... in check_tracking_branch branch.c
... in for_each_remote remote.c
... in validate_remote_tracking_branch branch.c
... in dwim_branch_start branch.c
... in create_branch branch.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d3115660b4 (branch: add flags and config to inherit tracking,
2021-12-20) a new option was introduced to allow creating a new branch,
inheriting the tracking of another branch.
The new code, strdup()'d the remote_name of the existing branch, but
unfortunately it was not freed, producing a leak.
$ git remote add local .
$ git update-ref refs/remotes/local/foo HEAD
$ git branch --track bar local/foo
branch 'bar' set up to track 'local/foo'.
$ git branch --track=inherit baz bar
branch 'baz' set up to track 'local/foo'.
Direct leak of 6 byte(s) in 1 object(s) allocated from:
... in xstrdup wrapper.c
... in inherit_tracking branch.c
... in setup_tracking branch.c
... in create_branch branch.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
Actually, the string we're strdup()'ing is from the struct branch
returned by get_branch(). Which, in turn, retrieves the string from the
global "struct repository". This makes perfectly valid to use the
string throughout the entire execution of create_branch(). There is no
need to duplicate it.
Let's fix the leak, removing the strdup().
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In e89f151db1 (branch: move --set-upstream-to behavior to
dwim_and_setup_tracking(), 2022-01-28) the string returned by
dwim_branch_start() was not freed, resulting in a memory leak.
It can be reviewed with:
$ git remote add local .
$ git update-ref refs/remotes/local/foo HEAD
$ git branch --set-upstream-to local/foo foo
Direct leak of 23 byte(s) in 1 object(s) allocated from:
... in xstrdup wrapper.c
... in expand_ref refs.c
... in repo_dwim_ref refs.c
... in dwim_branch_start branch.c
... in dwim_and_setup_tracking branch.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
Let's free it now.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c0192df630 (refspec: add support for negative refspecs, 2020-09-30)
query_matches_negative_refspec() was introduced.
The function was implemented as a two-loop process, where the former
loop accumulates and the latter evaluates. To accumulate, a string_list
is used.
Within the first loop, there are three cases where a string is added to
the string_list. Two of them add strings that do not need to be
freed. But in the third case, the string added is returned by
match_name_with_pattern(), which needs to be freed.
The string_list is initialized with STRING_LIST_INIT_NODUP, i.e. when
cleared, the strings added are not freed. Therefore, the string
returned by match_name_with_pattern() is not freed, so we have a leak.
$ git remote add local .
$ git update-ref refs/remotes/local/foo HEAD
$ git branch --track bar local/foo
Direct leak of 24 byte(s) in 1 object(s) allocated from:
... in xrealloc wrapper.c
... in strbuf_grow strbuf.c
... in strbuf_add strbuf.c
... in match_name_with_pattern remote.c
... in query_matches_negative_refspec remote.c
... in query_refspecs remote.c
... in remote_find_tracking remote.c
... in find_tracked_branch branch.c
... in for_each_remote remote.c
... in setup_tracking branch.c
... in create_branch branch.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
Direct leak of 24 byte(s) in 1 object(s) allocated from:
... in xrealloc wrapper.c
... in strbuf_grow strbuf.c
... in strbuf_add strbuf.c
... in match_name_with_pattern remote.c
... in query_matches_negative_refspec remote.c
... in query_refspecs remote.c
... in remote_find_tracking remote.c
... in check_tracking_branch branch.c
... in for_each_remote remote.c
... in validate_remote_tracking_branch branch.c
... in dwim_branch_start branch.c
... in create_branch branch.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
An interesting point to note is that while string_list_append() is used
in the first two cases described, string_list_append_nodup() is used in
the third. This seems to indicate an intention to delegate the
responsibility for freeing the string, to the string_list. As if the
string_list had been initialized with STRING_LIST_INIT_DUP, i.e. the
strings are strdup()'d when added (except if the "_nodup" API is used)
and freed when cleared.
Switching to STRING_LIST_INIT_DUP fixes the leak and probably is what we
wanted to do originally. Let's do it.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 52d59cc645 (branch: add a --copy (-c) option to go with --move (-m),
2017-06-18) a new strbuf variable was introduced, but not released.
Thus, when copying a branch that has any configuration, we have a
leak.
$ git branch foo
$ git config branch.foo.some-key some_value
$ git branch -c foo bar
Direct leak of 65 byte(s) in 1 object(s) allocated from:
... in xrealloc wrapper.c
... in strbuf_grow strbuf.c
... in strbuf_vaddf strbuf.c
... in strbuf_addf strbuf.c
... in store_create_section config.c
... in git_config_copy_or_rename_section_in_file config.c
... in git_config_copy_section_in_file config.c
... in git_config_copy_section config.c
... in copy_or_rename_branch builtin/branch.c
... in cmd_branch builtin/branch.c
... in run_builtin git.c
Let's fix that leak.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When opening a MIDX bitmap, we the pack-bitmap machinery eagerly calls
`prepare_midx_pack()` on each of the packs contained in the MIDX. This
is done in order to populate the array of `struct packed_git *`s held by
the MIDX, which we need later on in `load_reverse_index()`, since it
calls `load_pack_revindex()` on each of the MIDX'd packs, and requires
that the caller provide a pointer to a `struct packed_git`.
When opening one of these packs fails, the pack-bitmap code will `die()`
indicating that it can't open one of the packs in the MIDX. This
indicates that the MIDX is somehow broken with respect to the current
state of the repository. When this is the case, we indeed cannot make
use of the MIDX bitmap to speed up reachability traversals.
However, it does not mean that we can't perform reachability traversals
at all. In other failure modes, that same function calls `warning()` and
then returns -1, indicating to its caller (`open_bitmap()`) that we
should either look for a pack bitmap if one is available, or perform
normal object traversal without using bitmaps at all.
There is no reason why this case should cause us to die. If we instead
continued (by jumping to `cleanup` as this patch does) and avoid using
bitmaps altogether, we may again try and query the MIDX, which will also
fail. But when trying to call `fill_midx_entry()` fails, it also returns
a signal of its failure, and prompts the caller to try and locate the
object elsewhere.
In other words, the normal object traversal machinery works fine in the
presence of a corrupt MIDX, so there is no reason that the MIDX bitmap
machinery should abort in that case when we could easily continue.
Note that we *could* in theory try again to load a MIDX bitmap after
calling `reprepare_packed_git()`. Even though the `prepare_packed_git()`
code is careful to avoid adding a pack that we already have,
`prepare_midx_pack()` is not. So if we got part of the way through
calling `prepare_midx_pack()` on a stale MIDX, and then tried again on a
fresh MIDX that contains some of the same packs, we would end up with a
loop through the `->next` pointer.
For now, let's do the simplest thing possible and fallback to the
non-bitmap code when we detect a stale MIDX so that the complete fix as
above can be implemented carefully.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch introduces a new multi-valued configuration option,
`gc.recentObjectsHook` as a means to mark certain objects as recent (and
thus exempt from garbage collection), regardless of their age.
When performing a garbage collection operation on a repository with
unreachable objects, Git makes its decision on what to do with those
object(s) based on how recent the objects are or not. Generally speaking,
unreachable-but-recent objects stay in the repository, and older objects
are discarded.
However, we have no convenient way to keep certain precious, unreachable
objects around in the repository, even if they have aged out and would
be pruned. Our options today consist of:
- Point references at the reachability tips of any objects you
consider precious, which may be undesirable or infeasible if there
are many such objects.
- Track them via the reflog, which may be undesirable since the
reflog's lifetime is limited to that of the reference it's tracking
(and callers may want to keep those unreachable objects around for
longer).
- Extend the grace period, which may keep around other objects that
the caller *does* want to discard.
- Manually modify the mtimes of objects you want to keep. If those
objects are already loose, this is easy enough to do (you can just
enumerate and `touch -m` each one).
But if they are packed, you will either end up modifying the mtimes
of *all* objects in that pack, or be forced to write out a loose
copy of that object, both of which may be undesirable. Even worse,
if they are in a cruft pack, that requires modifying its `*.mtimes`
file by hand, since there is no exposed plumbing for this.
- Force the caller to construct the pack of objects they want
to keep themselves, and then mark the pack as kept by adding a
".keep" file. This works, but is burdensome for the caller, and
having extra packs is awkward as you roll forward your cruft pack.
This patch introduces a new option to the above list via the
`gc.recentObjectsHook` configuration, which allows the caller to
specify a program (or set of programs) whose output is treated as a set
of objects to treat as recent, regardless of their true age.
The implementation is straightforward. Git enumerates recent objects via
`add_unseen_recent_objects_to_traversal()`, which enumerates loose and
packed objects, and eventually calls add_recent_object() on any objects
for which `want_recent_object()`'s conditions are met.
This patch modifies the recency condition from simply "is the mtime of
this object more recent than the cutoff?" to "[...] or, is this object
mentioned by at least one `gc.recentObjectsHook`?".
Depending on whether or not we are generating a cruft pack, this allows
the caller to do one of two things:
- If generating a cruft pack, the caller is able to retain additional
objects via the cruft pack, even if they would have otherwise been
pruned due to their age.
- If not generating a cruft pack, the caller is likewise able to
retain additional objects as loose.
A potential alternative here is to introduce a new mode to alter the
contents of the reachable pack instead of the cruft one. One could
imagine a new option to `pack-objects`, say `--extra-reachable-tips`
that does the same thing as above, adding the visited set of objects
along the traversal to the pack.
But this has the unfortunate side-effect of altering the reachability
closure of that pack. If parts of the unreachable object graph mentioned
by one or more of the "extra reachable tips" programs is not closed,
then the resulting pack won't be either. This makes it impossible in the
general case to write out reachability bitmaps for that pack, since
closure is a requirement there.
Instead, keep these unreachable objects in the cruft pack (or set of
unreachable, loose objects) instead, to ensure that we can continue to
have a pack containing just reachable objects, which is always safe to
write a bitmap over.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When enumerating objects in order to add recent ones (i.e. those whose
mtime is strictly newer than the cutoff) as tips of a reachability
traversal, `add_recent_object()` discards objects which do not meet the
recency criteria.
The subsequent commit will make checking whether or not an object is
recent also consult the list of hooks in `pack.recentHook`. Isolate this
check in its own function to keep the additional complexity outside of
`add_recent_object()`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To partition the set of packs based on which ones are "kept" (either
they have a .keep file, or were otherwise marked via the `--keep-pack`
option) and "non-kept" ones (anything else), `git repack` uses its
`collect_pack_filenames()` function.
Ordinarily, we would rely on a convenience function such as
`get_all_packs()` to enumerate and partition the set of packs. But
`collect_pack_filenames()` uses `readdir()` directly to read the
contents of the "$GIT_DIR/objects/pack" directory, and adds each entry
ending in ".pack" to the appropriate list (either kept, or non-kept as
above).
This is subtly racy, since `collect_pack_filenames()` may see a pack
that is not fully staged (i.e., it is missing its ".idx" file).
Ordinarily, this doesn't cause a problem. But it can cause issues when
generating a cruft pack.
This is because `git repack` feeds (among other things) the list of
existing kept packs down to `git pack-objects --cruft` to indicate that
any kept packs will not be removed from the repository (so that the
cruft pack machinery can avoid packing objects that appear in those
packs as cruft).
But `read_cruft_objects()` lists packfiles by calling `get_all_packs()`.
So if a ".pack" file exists (necessary to get that pack to appear to
`collect_pack_filenames()`), but doesn't have a corresponding ".idx"
file (necessary to get that pack to appear via `get_all_packs()`), we'll
complain with:
fatal: could not find pack '.tmp-5841-pack-a6b0150558609c323c496ced21de6f4b66589260.pack'
Fix the above by teaching `collect_pack_filenames()` to only collect
packs with their corresponding `*.idx` files in place, indicating that
those packs have been fully staged.
There are a couple of things worth noting:
- Since each entry in the `extra_keep` list (which contains the
`--keep-pack` names) has a `*.pack` suffix, we'll have to swap the
suffix from ".pack" to ".idx", and compare that instead.
- Since we use the the `fname_kept_list` to figure out which packs to
delete (with `git repack -d`), we would have previously deleted a
`*.pack` with no index (since the existince of a ".pack" file is
necessary and sufficient to include that pack in the list of
existing non-kept packs).
Now we will leave it alone (since that pack won't appear in the
list). This is far more correct behavior, since we don't want
to race with a pack being staged. Deleting a partially staged pack
is unlikely, however, since the window of time between staging a
pack and moving its .idx file into place is miniscule.
Note that this window does *not* include the time it takes to
receive and index the pack, since the incoming data goes into
"$GIT_DIR/objects/tmp_pack_XXXXXX", which does not end in ".pack"
and is thus ignored by collect_pack_filenames().
In the future, this function should probably be rewritten as a callback
to `for_each_file_in_pack_dir()`, but this is the simplest change we
could do in the short-term.
Reported-by: Michael Haggerty <mhagger@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These were found with an automated CLI tool [1]. Only the
"Documentation" subfolder (and not source code files) was considered
because the docs are user-facing.
[1]: https://crates.io/crates/typos-cli
Signed-off-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GPGSSH_VERIFYTIME prequeq makes use of "${GNUPGHOME}" but does not
create it. Require GPGSSH which creates the "${GNUPGHOME}" directory.
Additionally, it makes sense to require GPGSSH in GPGSSH_VERIFYTIME
because the latter builds on the former. If we can't use GPGSSH,
there's little point in checking whether GPGSSH_VERIFYTIME is usable.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a library that only interacts with other primitives, strbuf should
not utilize the comment_line_char global variable within its
functions. Therefore, add an additional parameter for functions that use
comment_line_char and refactor callers to pass it in instead.
strbuf_stripspace() removes the skip_comments boolean and checks if
comment_line_char is a non-NUL character to determine whether to skip
comments or not.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move path-related function from strbuf.[ch] to path.[ch] so that strbuf
is focused on string manipulation routines with minimal dependencies.
repository.h is no longer a necessary dependency after moving this
function out.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move object-name-related functions from strbuf.[ch] to object-name.[ch]
so that strbuf is focused on string manipulation routines with minimal
dependencies.
dir.h relied on the forward declration of the repository struct in
strbuf.h. Since that is removed in this patch, add the forward
declaration to dir.h.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
is_rfc3986_unreserved() and is_rfc3986_reserved_or_unreserved() are only
called from builtin/credential-store.c and they are only relevant to that
file so move those functions and make them static.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move abspath-related functions from strbuf.[ch] to abspath.[ch] so that
strbuf is focused on string manipulation routines with minimal
dependencies.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs.h was once needed but is no longer so as of 6bab74e7fb ("strbuf:
move strbuf_branchname to sha1_name.c", 2010-11-06). strbuf.h was
included thru refs.h, so removing refs.h requires strbuf.h to be added
back.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf, as a generic and widely used structure across the codebase,
should be limited as a library to only interact with primitives. Add
documentation so future functions can appropriately be placed. Older
functions that do not follow this boundary should eventually be moved or
refactored.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'read_replace_refs' global specifies whether or not we should
respect the references of the form 'refs/replace/<oid>' to replace which
object we look up when asking for '<oid>'. This global has caused issues
when it is not initialized properly, such as in b6551feadf (merge-tree:
load default git config, 2023-05-10).
To make this more robust, move its config-based initialization out of
git_default_config and into prepare_repo_settings(). This provides a
repository-scoped version of the 'read_replace_refs' global.
The global still has its purpose: it is disabled process-wide by the
GIT_NO_REPLACE_OBJECTS environment variable or by a call to
disable_replace_refs() in some specific Git commands.
Since we already encapsulated the use of the constant inside
replace_refs_enabled(), we can perform the initialization inside that
method, if necessary. This solves the problem of forgetting to check the
config, as we will check it before returning this value.
Due to this encapsulation, the global can move to be static within
replace-object.c.
There is an interesting behavior change possible here: we now have a
repository-scoped understanding of this config value. Thus, if there was
a command that recurses into submodules and might follow replace refs,
then it would now respect the core.useReplaceRefs config value in each
repository.
'git grep --recurse-submodules' is such a command that recurses into
submodules in-process. We can demonstrate the granularity of this config
value via a test in t7814.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'read_replace_objects' constant is initialized by git_default_config
(if core.useReplaceRefs is disabled) and within setup_git_env (if
GIT_NO_REPLACE_OBJECTS) is set. To ensure that this variable cannot be
set accidentally in other places, wrap it in a replace_refs_enabled()
method.
Since we still assign this global in config.c, we are not able to remove
the global scope of this variable and make it a static within
replace-object.c. This will happen in a later change which will also
prevent the variable from being read before it is initialized.
Centralizing read access to the variable is an important first step.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several builtins depend on being able to disable the replace references
so we actually operate on each object individually. These currently do
so by directly mutating the 'read_replace_refs' global.
A future change will move this global into a different place, so it will
be necessary to change all of these lines. However, we can simplify that
transition by abstracting the purpose of these global assignments with a
method call.
We will need to keep this read_replace_refs global forever, as we want
to make sure that we never use replace refs throughout the life of the
process if this method is called. Future changes may present a
repository-scoped version of the variable to represent that repository's
core.useReplaceRefs config value, but a zero-valued read_replace_refs
will always override such a setting.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with
`-z`, 2022-07-22), we have introduced a new mode to read the input via
NUL-delimited records instead of newline-delimited records. This allows
the user to query for revisions that have newlines in their path
component. While unusual, such queries are perfectly valid and thus it
is clear that we should be able to support them properly.
Unfortunately, the commit only changed the input to be NUL-delimited,
but didn't change the output at the same time. While this is fine for
queries that are processed successfully, it is less so for queries that
aren't. In the case of missing commits for example the result can become
entirely unparsable:
```
$ printf "7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\n1234567890\n\n\commit000" |
git cat-file --batch -z
7ce4f05bae blob 10
1234567890
commit missing
```
This is of course a crafted query that is intentionally gaming the
deficiency, but more benign queries that contain newlines would have
similar problems.
Ideally, we should have also changed the output to be NUL-delimited when
`-z` is specified to avoid this problem. As the input is NUL-delimited,
it is clear that the output in this case cannot ever contain NUL
characters by itself. Furthermore, Git does not allow NUL characters in
revisions anyway, further stressing the point that using NUL-delimited
output is safe. The only exception is of course the object data itself,
but as git-cat-file(1) prints the size of the object data clients should
read until that specified size has been consumed.
But even though `-z` has only been introduced a few releases ago in Git
v2.38.0, changing the output format retroactively to also NUL-delimit
output would be a backwards incompatible change. And while one could
make the argument that the output is inherently broken already, we need
to assume that there are existing users out there that use it just fine
given that revisions containing newlines are quite exotic.
Instead, introduce a new option `-Z` that switches to NUL-delimited
input and output. While this new option could arguably only switch the
output format to be NUL-delimited, the consequence would be that users
have to always specify both `-z` and `-Z` when the input may contain
newlines. On the other hand, if the user knows that there never will be
newlines in the input, they don't have to use either of those options.
There is thus no usecase that would warrant treating input and output
format separately, which is why we instead opt to "do the right thing"
and have `-Z` mean to NUL-terminate both formats.
The old `-z` option is marked as deprecated with a hint that its output
may become unparsable. It is thus hidden both from the synopsis as well
as the command's help output.
Co-authored-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The batch modes of git-cat-file(1) read queries from stantard input that
are either newline- or NUL-delimited. This code was introduced via
db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with `-z`,
2022-07-22), which notes that:
"""
The refactoring here is slightly unfortunate, since we turn loops like:
while (strbuf_getline(&buf, stdin) != EOF)
into:
while (1) {
int ret;
if (opt->nul_terminated)
ret = strbuf_getline_nul(&input, stdin);
else
ret = strbuf_getline(&input, stdin);
if (ret == EOF)
break;
}
"""
The commit proposed introducing a helper function that is easier to use,
which is just what we have done in the preceding commit. Refactor the
code to use this new helper to simplify the loop.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many of our commands support reading input that is separated either via
newlines or via NUL characters. Furthermore, in order to be a better
cross platform citizen, these commands typically know to strip the CRLF
sequence so that we also support reading newline-separated inputs on
e.g. the Windows platform. This results in the following kind of awkward
pattern:
```
struct strbuf input = STRBUF_INIT;
while (1) {
int ret;
if (nul_terminated)
ret = strbuf_getline_nul(&input, stdin);
else
ret = strbuf_getline(&input, stdin);
if (ret)
break;
...
}
```
Introduce a new CRLF-aware helper function that can read up to a user
specified delimiter. If the delimiter is `\n` the function knows to also
strip CRLF, otherwise it will only strip the specified delimiter. This
results in the following, much more readable code pattern:
```
struct strbuf input = STRBUF_INIT;
while (strbuf_getdelim_strip_crlf(&input, stdin, delim) != EOF) {
...
}
```
The new function will be used in a subsequent commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tests for git-cat-file(1) are quite old and haven't ever been
updated since they were introduced. They thus tend to use old idioms
that have since grown outdated. Most importantly, many of the tests use
`test $A = $B` to compare expected and actual output. This has the
downside that it is impossible to tell what exactly is different between
both versions in case the test fails.
Refactor the tests to instead use `test_cmp`. While more verbose, it
both tends to be more readable and will result in a nice diff in case
states don't match.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t1006 we have a bunch of tests that verify the output format of the
git-cat-file(1) command. But while part of the output for some tests
would include commit timestamps, we don't verify those but instead strip
them before comparing expected with actual results. This is done by the
function `maybe_remove_timestamp`, which goes all the way back to the
ancient commit b335d3f121 (Add tests for git cat-file, 2008-04-23).
Our tests had been in a different shape back then. Most importantly we
didn't yet have the infrastructure to create objects with deterministic
timestamps. Nowadays we do though, and thus there is no reason anymore
to strip the timestamps.
Refactor the tests to not strip the timestamp anymore.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cross-compiling with the mingw toolchain on a system with a case
sensitive filesystem, the mixed case (which is technically correct as
per the contents of MS Visual C++) doesn't work (the corresponding mingw
headers are all lowercase for some reason).
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index is read in 'worktree.c' at two points:
1.The 'validate_no_submodules' function, which checks if there are any
submodules present in the worktree.
2.The 'check_clean_worktree' function, which verifies if a worktree is
'clean', i.e., there are no untracked or modified but uncommitted files.
This is done by running the 'git status' command, and an error message
is thrown if the worktree is not clean. Given that 'git status' is
already sparse-aware, the function is also sparse-aware.
Hence we can just set the requires-full-index to false for
"git worktree".
Add tests that verify that 'git worktree' behaves correctly when the
sparse index is enabled and test to ensure the index is not expanded.
The `p2000` tests demonstrate a ~20% execution time reduction for
'git worktree' using a sparse index:
(Note:the p2000 test results didn't reflect the huge speedup because of
the index reading time is minuscule comparing to the filesystem
operations.)
Test before after
-----------------------------------------------------------------------
2000.102: git worktree add....(full-v3) 3.15 2.82 -10.5%
2000.103: git worktree add....(full-v4) 3.14 2.84 -9.6%
2000.104: git worktree add....(sparse-v3) 2.59 2.14 -16.4%
2000.105: git worktree add....(sparse-v4) 2.10 1.57 -25.2%
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If execve(2) fails with ENOENT and we report the error, we use the
format "cannot run %s", followed by the actual error message. For other
errors we use "cannot exec '%s'".
Stop making this subtle distinction and use the second format for all
execve(2) errors. This simplifies the code and makes the prefix more
precise by indicating the failed operation. It also allows us to
slightly simplify t1800.16.
On Windows -- which lacks execve(2) -- we already use a single format in
all cases: "cannot spawn %s".
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t1800.16 checks whether an attempt to run a hook script with a missing
executable in its #! line fails and reports that error. The expected
error message differs between platforms. The test handles two common
variants, but on NonStop OS we get a third one: "fatal: cannot exec
'bad-hooks/test-hook': ...", which causes the test to fail there.
We don't really care about the specific message text all that much here.
Use grep and a single regex with alternations to ascertain that we get
an error message (fatal or otherwise) about the failed invocation of the
hook, but don't bother checking if we get the right variant for the
platform the test is running on or whether quoting is done. This looser
check let's the test pass on NonStop OS.
Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
592fc5b3 (dir.h: move DTYPE defines from cache.h, 2023-04-22) moved
DTYPE macros from cache.h to dir.h, but they are still used by cache.h
to implement ce_to_dtype(); cache.h cannot include dir.h because that
would cause name-hash.c to have two different and conflicting
definitions of `struct dir_entry`. (That should be separately fixed.)
Both dir.h and cache.h include statinfo.h, and this seems a reasonable
place for these definitions.
This change fixes a broken build issue on old SunOS.
Signed-off-by: Alejandro R. Sedeño <asedeno@mit.edu>
Signed-off-by: Alejandro R Sedeño <asedeno@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From 02156b81bbb2cafb19d702c55d45714fcf224048 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <derrickstolee@github.com>
Date: Wed, 7 Jun 2023 09:39:01 -0400
Subject: [PATCH v2 2/2] add: test use of brackets when color is disabled
The interactive add command, 'git add -i', displays a menu of options
using full words. When color is enabled, the first letter of each word
is changed to a highlight color to signal that the first letter could be
used as a command. Without color, brackets ("[]") are used around these
first letters.
This behavior was not previously tested directly in t3701, so add a test
for it now. Since we use 'git add -i >actual <input' without
'force_color', the color system recognizes that colors are not available
on stdout and will be disabled by default.
This test would reproduce correctly with or without the fix in the
previous commit to make sure that color.ui is respected in 'git add'.
Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git add -i' and 'git add -p' were converted to a builtin, they
introduced a color bug: the 'color.ui' config setting is ignored.
The included test demonstrates an example that is similar to the
previous test, which focuses on customizing colors. Here, we are
demonstrating that colors are not being used at all by comparing the raw
output and the color-decoded version of that output.
The fix is simple, to use git_color_default_config() as the fallback for
git_add_config(). A more robust change would instead encapsulate the
git_use_color_default global in methods that would check the config
setting if it has not been initialized yet. Some ideas are being
discussed on this front [1], but nothing has been finalized.
[1] https://lore.kernel.org/git/pull.1539.git.1685716420.gitgitgadget@gmail.com/
This test case naturally bisects down to 0527ccb1b5 (add -i: default to
the built-in implementation, 2021-11-30), but the fix makes it clear
that this would be broken even if we added the config to use the builtin
earlier than this.
Reported-by: Greg Alexander <gitgreg@galexander.org>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Its better to document the struct members directly instead of on a
function that takes a pointer to the struct. This will also make it
easier to update the documentation in the future.
Make adjustments for this new context. Also drop “may contain” since we
don’t need to emphasize that a list could be empty.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`suppress_default_notes` was renamed to `use_default_notes` in
3a03cf6b1d (notes: refactor display notes default handling,
2011-03-29).
The commit message says that “values less than one [indicates] “not
set” ”, but what was meant was probably “less than zero” (the author of
3a03cf6b1d agrees on this point).
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Duplicate the code for outputting the signature and its other
parameters for commits and tags in ref-filter from pretty. In the
future, this will help in getting rid of the current duplicate
implementations of such logic everywhere, when ref-filter can do
everything that pretty is doing.
The new atom "signature" and its friends are equivalent to the existing
pretty formats as follows:
%(signature) = %GG
%(signature:grade) = %G?
%(siganture:signer) = %GS
%(signature:key) = %GK
%(signature:fingerprint) = %GF
%(signature:primarykeyfingerprint) = %GP
%(signature:trustlevel) = %GT
Co-authored-by: Hariom Verma <hariom18599@gmail.com>
Co-authored-by: Jaydeep Das <jaydeepjd.8914@gmail.com>
Co-authored-by: Nsengiyumva Wilberforce <nsengiyumvawilberforce@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GnuPG v2.0.0 released in 2006, which according to its release notes
https://gnupg.org/download/release_notes.html
is the "First stable version of GnuPG integrating OpenPGP and S/MIME".
Use this version or its successors for tests that will fail for
versions less than v2.0.0 because of the difference in the output on
stderr between the versions (v2.* vs v0.* or v2.* vs v1.*). Skip if
the GPG version detected is less than v2.0.0.
Do not, however, remove the existing prereq GPG yet since a lot of tests
still work with the prereq GPG (that is even with versions v0.* or v1.*)
and some systems still use these versions.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Kousik Sanagavarapu <five231003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a leak that has existed since the method was first created
in fcb2c0769d (commit-reach: implement get_reachable_subset,
2018-11-02).
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the linux-asan-ubsan job runs using clang under Linux, there is
not much point in running a separate clang job. Any errors that a normal
clang compile-and-test cycle would find are likely to be a subset of
what the sanitizer job will find. Since this job takes ~14 minutes to
run in CI, this shaves off some of our CPU load (though it does not
affect end-to-end runtime, since it's typically run in parallel and is
not the longest job).
Technically this provides us with slightly less signal for a given run,
since you won't immediately know if a failure in the sanitizer job is
from using clang or from the sanitizers themselves. But it's generally
obvious from the logs, and anyway your next step would be to fix the
probvlem and re-run CI, since we expect all of these jobs to pass
normally.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we started running sanitizers in CI via 1c0962c0c4 (ci: add address
and undefined sanitizer tasks, 2022-10-20), we ran them as two separate
CI jobs, since as that commit notes, the combination "seems to take
forever".
And indeed, it does with gcc. However, since the previous commit
switched to using clang, the situation is different, and we can save
some CPU by using a single job for both. Comparing before/after CI runs,
this saved about 14 minutes (the single combined job took 54m, versus
44m plus 24m for ASan and UBSan jobs, respectively). That's wall-clock
and not CPU, but since our jobs are mostly CPU-bound, the two should be
closely proportional.
This does increase the end-to-end time of a CI run, though, since before
this patch the two jobs could run in parallel, and the sanitizer job is
our longest single job. It also means that we won't get a separate
result for "this passed with UBSan but not with ASan" or vice versa).
But as 1c0962c0c4 noted, that is not a very useful signal in practice.
Below are some more detailed timings of gcc vs clang that I measured by
running the test suite on my local workstation. Each measurement counts
only the time to run the test suite with each compiler (not the compile
time itself). We'll focus on the wall-clock times for simplicity, though
the CPU times follow roughly similar trends.
Here's a run with CC=gcc as a baseline:
real 1m12.931s
user 9m30.566s
sys 8m9.538s
Running with SANITIZE=address increases the time by a factor of ~4.7x:
real 5m40.352s
user 49m37.044s
sys 36m42.950s
Running with SANITIZE=undefined increases the time by a factor of ~1.7x:
real 2m5.956s
user 12m42.847s
sys 19m27.067s
So let's call that 6.4 time units to run them separately (where a unit
is the time it takes to run the test suite with no sanitizers). As a
simplistic model, we might imagine that running them together would take
5.4 units (we save 1 unit because we are no longer running the test
suite twice, but just paying the sanitizer overhead on top of a single
run).
But that's not what happens. Running with SANITIZE=address,undefined
results in a factor of 9.3x:
real 11m9.817s
user 77m31.284s
sys 96m40.454s
So not only did we not get faster when doing them together, we actually
spent 1.5x as much CPU as doing them separately! And while those
wall-clock numbers might not look too terrible, keep in mind that this
is on an unloaded 8-core machine. In the CI environment, wall-clock
times will be much closer to CPU times. So not only are we wasting CPU,
but we risk hitting timeouts.
Now let's try the same thing with clang. Here's our no-sanitizer
baseline run, which is almost identical to the gcc one (which is quite
convenient, because we can keep using the same "time units" to get an
apples-to-apples comparison):
real 1m11.844s
user 9m28.313s
sys 8m8.240s
And now again with SANITIZE=address, we get a 5x factor (so slightly
worse than gcc's 4.7x, though I wouldn't read too much into it; there is
a fair bit of run-to-run noise):
real 6m7.662s
user 49m24.330s
sys 44m13.846s
And with SANITIZE=undefined, we are at 1.5x, slightly outperforming gcc
(though again, that's probably mostly noise):
real 1m50.028s
user 11m0.973s
sys 16m42.731s
So running them separately, our total cost is 6.5x. But if we combine
them in a single run (SANITIZE=address,undefined), we get:
real 6m51.804s
user 52m32.049s
sys 51m46.711s
which is a factor of 5.7x. That's along the lines we'd hoped for!
Running them together saves us almost a whole time unit. And that's not
counting any time spent outside the test suite itself (starting the job,
setting up the environment, compiling) that we're no longer duplicating
by having two jobs.
So clang behaves like we'd hope: the overhead to run the sanitizers is
additive as you add more sanitizers. Whereas gcc's numbers seem very
close to multiplicative, almost as if the sanitizers were enforcing
their overheads on each other (though that is purely a guess on what is
going on; ultimately what matters to us is the amount of time it takes).
And that roughly matches the CI improvement I saw. A "time unit" there
is more like 12 minutes, and the observed time savings was 14 minutes
(with the extra presumably coming from avoiding duplicated setup, etc).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both gcc and clang support the "address" and "undefined" sanitizers.
However, they may produce different results. We've seen at least two
real world cases where gcc missed a UBSan problem but clang found it:
1. Clang's UBSan (using clang 14.0.6) found a string index that was
subtracted to "-1", causing an out-of-bounds read (curiously this
didn't trigger ASan, but that may be because the string was in the
argv memory, not stack or heap). Using gcc (version 12.2.0) didn't
find the same problem.
Original thread:
https://lore.kernel.org/git/20230519005447.GA2955320@coredump.intra.peff.net/
2. Clang's UBSan (using clang 4.0.1) complained about pointer
arithmetic with NULL, but gcc at the time did not. This was in
2017, and modern gcc does seem to find the issue, though.
Original thread:
https://lore.kernel.org/git/32a8b949-638a-1784-7fba-948ae32206fc@web.de/
Since we don't otherwise have a particular preference for one compiler
over the other for this test, let's switch to the one that we think may
be more thorough.
Note that it's entirely possible that the two are simply _different_,
and we are trading off problems that gcc would find that clang wouldn't.
However, my subjective and anecdotal experience has been that clang's
sanitizer support is a bit more mature (e.g., I recall other oddities
around leak-checking where clang performed more sensibly).
Obviously running both and cross-checking the results would give us the
best coverage, but that's very expensive to run (and these are already
some of our most expensive CI jobs). So let's use clang as our best
guess, and we can re-evaluate if we get more data points.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --follow code doesn't handle most forms of pathspec magic. We check
that no unexpected ones have made it to try_to_follow_renames() with a
runtime GUARD_PATHSPEC() check, which gives behavior like this:
$ git log --follow ':(icase)makefile' >/dev/null
BUG: tree-diff.c:596: unsupported magic 10
Aborted
The same is true of ":(glob)", ":(attr)", and so on. It's good that we
notice the problem rather than continuing and producing a wrong answer.
But there are two non-ideal things:
1. The idea of GUARD_PATHSPEC() is to catch programming errors where
low-level code gets unexpected pathspecs. We'd usually try to catch
unsupported pathspecs by passing a magic_mask to parse_pathspec(),
which would give the user a much better message like:
pathspec magic not supported by this command: 'icase'
That doesn't happen here because git-log usually _does_ support
all types of pathspec magic, and so it passes "0" for the mask
(this call actually happens in setup_revisions()). It needs to
distinguish the normal case from the "--follow" one but currently
doesn't.
2. In addition to --follow, we have the log.follow config option. When
that is set, we try to turn on --follow mode only when there is a
single pathspec (since --follow doesn't handle anything else). But
really, that ought to be expanded to "use --follow when the
pathspec supports it". Otherwise, we'd complain any time you use an
exotic pathspec:
$ git config log.follow true
$ git log ':(icase)makefile' >/dev/null
BUG: tree-diff.c:596: unsupported magic 10
Aborted
We should instead just avoid enabling follow mode if it's not
supported by this particular invocation.
This patch expands our diff_check_follow_pathspec() function to cover
pathspec magic, solving both problems.
A few final notes:
- we could also solve (1) by passing the appropriate mask to
parse_pathspec(). But that's not great for two reasons. One is that
the error message is less precise. It says "magic not supported by
this command", but really it is not the command, but rather the
--follow option which is the problem. The second is that it always
calls die(). But for our log.follow code, we want to speculatively
ask "is this pathspec OK?" and just get a boolean result.
- This is obviously the right thing to do for ':(icase)' and most
other magic options. But ':(glob)' is a bit odd here. The --follow
code doesn't support wildcards, but we allow them anyway. From
try_to_follow_renames():
#if 0
/*
* We should reject wildcards as well. Unfortunately we
* haven't got a reliable way to detect that 'foo\*bar' in
* fact has no wildcards. nowildcard_len is merely a hint for
* optimization. Let it slip for now until wildmatch is taught
* about dry-run mode and returns wildcard info.
*/
if (opt->pathspec.has_wildcard)
BUG("wildcards are not supported");
#endif
So something like "git log --follow 'Make*'" is already doing the
wrong thing, since ":(glob)" behavior is already the default (it is
used only to countermand an earlier --noglob-pathspecs).
So we _could_ loosen the guard to allow :(glob), since it just
behaves the same as pathspecs do by default. But it seems like a
backwards step to do so. It already doesn't work (it hits the BUG()
case currently), and given that the user took an explicit step to
say "this pathspec should glob", it is reasonable for us to say "no,
--follow does not support globbing" (or in the case of log.follow,
avoid turning on follow mode). Which is what happens after this
patch.
- The set of allowed pathspec magic is obviously the same as in
GUARD_PATHSPEC(). We could perhaps factor these out to avoid
repetition. The point of having separate masks and GUARD calls is
that we don't necessarily know which parsed pathspecs will be used
where. But in this case, the two are heavily correlated. Still,
there may be some value in keeping them separate; it would make
anyone think twice about adding new magic to the list in
diff_check_follow_pathspec(). They'd need to touch
try_to_follow_renames() as well, which is the code that would
actually need to be updated to handle more exotic pathspecs.
- The documentation for log.follow says that it enables --follow
"...when a single <path> is given". We could possibly expand that to
say "with no unsupported pathspec magic", but that raises the
question of documenting which magic is supported. I think the
existing wording of "single <path>" sufficiently encompasses the
idea (the forbidden magic is stuff that might match multiple
entries), and the spirit remains the same.
Reported-by: Jim Pryor <dubiousjim@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In --follow mode, we require exactly one pathspec. We check this
condition in two places:
- in diff_setup_done(), we complain if --follow is used with an
inapropriate pathspec
- in git-log's revision "tweak" function, we enable log.follow only if
the pathspec allows it
The duplication isn't a big deal right now, since the logic is so
simple. But in preparation for it becoming more complex, let's pull it
into a shared function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we have unsupported magic in a pathspec (because a command or code
path does not support particular items), we list the unsupported ones in
an error message.
Let's factor out the code here that converts the bits back into their
human-readable names, so that it can be used from other callers, which
may want to provide more flexible error messages.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The output may become confusing to recognize if the user
accidentally gave an extra opening space, like:
$ git commit --fixup=" 6d6360b67e99c2fd82d64619c971fdede98ee74b"
fatal: could not lookup commit 6d6360b67e99c2fd82d64619c971fdede98ee74b
and it will be better if we surround the %s specifier with single quotes.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move 'repository_format_worktree_config' out of the global scope and into
the 'repository' struct. This change is similar to how
'repository_format_partial_clone' was moved in ebaf3bcf1a (repository: move
global r_f_p_c to repo struct, 2021-06-17), adding it to the 'repository'
struct and updating 'setup.c' & 'repository.c' functions to assign the value
appropriately.
The primary goal of this change is to be able to load the worktree config of
a submodule depending on whether that submodule - not its superproject - has
'extensions.worktreeConfig' enabled. To ensure 'do_git_config_sequence()'
has access to the newly repo-scoped configuration, add a 'struct repository'
argument to 'do_git_config_sequence()' and pass it the 'repo' value from
'config_with_options()'.
Finally, add/update tests in 't3007-ls-files-recurse-submodules.sh' to
verify 'extensions.worktreeConfig' is read an used independently by
superprojects and submodules.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a 'struct repository' argument to 'config_with_options()' and remove the
'repo' field from 'struct git_config_source'.
A 'struct repository' instance was originally added to the config source in
e3e8bf046e (submodule-config: pass repo upon blob config read, 2021-08-16)
to improve how submodule blob config content was accessed. At the time, this
was the only use for a 'repository' instance, so it was naturally added only
where it was needed: to 'struct git_config_source'. However, in upcoming
patches, 'config_with_options()' will need the repository instance to access
extension information (regardless of whether a 'config_source' exists). To
make the 'struct repository' instance more easily accessible, move it into
the function's arguments.
Update all callers of 'config_with_options()' to pass the appropriate 'repo'
value:
* in 'builtin/config.c', use 'the_repository'
* in 'submodule--config.c', use the 'repo' arg in 'config_from_gitmodules()'
* in 'read_[very_]early_config()' & 'read_protected_config()', set 'repo' to
NULL (repository instances aren't available there)
* in 'populate_remote_urls()', use the repo instance that has been added to
the 'struct config_include_data'
* in 'repo_read_config()', use the given 'repo' arg
Finally, note that this patch eliminates the fallback to 'the_repository'
that previously existed for the 'config_source' repo instance if it was
NULL. The fallback is no longer necessary, as the 'repo' is set explicitly
in all cases where it is needed.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'do_git_config_sequence()' to read the worktree config from
'config.worktree' in 'opts->git_dir' rather than the gitdir of
'the_repository'.
The worktree config is loaded from the path returned by
'git_pathdup("config.worktree")', the 'config.worktree' relative to the
gitdir of 'the_repository'. If loading the config for a submodule, this path
is incorrect, since 'the_repository' is the superproject. 'opts->git_dir' is
the gitdir of the submodule being configured, so the config file in that
location should be read instead.
To ensure the use of 'opts->git_dir' is safe, require that 'opts->git_dir'
is set if-and-only-if 'opts->commondir' is set (rather than "only-if" as it
is now). In all current usage of 'config_options', these values are set
together, so the stricter check does not change any behavior.
Finally, add tests to 't3007-ls-files-recurse-submodules.sh' to verify the
corrected config is loaded. Use 'ls-files' to test this because, unlike some
other '--recurse-submodules' commands, 'ls-files' parses the config of the
submodule in the same process as the superproject (via 'show_submodule()' ->
'repo_read_index()' -> 'prepare_repo_settings()'). As a result,
'the_repository' points to the config of the superproject but the
commondir/gitdir in the config sequence will be that of the submodule,
providing the exact scenario needed to verify this patch.
The first test ('--recurse-submodules parses submodule repo config') checks
that the submodule's *repo* config is read when running 'ls-files' on the
superproject; this confirms already-working behavior, serving as a reference
for how worktree config parsing should behave. The second test
('--recurse-submodules parses submodule worktree config') tests the same
scenario as the previous but instead using the *worktree* config,
demonstrating the corrected behavior. The 'test_config' helper is extended
for this case so that it properly applies the '--worktree' option to the
configure/unconfigure operations it performs.
Note that, although the submodule worktree config is now parsed instead of
the superproject's, 'extensions.worktreeConfig' in the superproject still
controls whether or not the worktree config is enabled at all in the
submodule. This will be fixed in a later patch.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
OpenSSH-9.0 requires a namespace option with `-Y check-novalidate`.
This was added in openssh-portable commit a0b5816f8 (upstream:
ssh-keygen -Y check-novalidate requires namespace or SEGV, 2022-03-18).
The -n option was documented as a required option since check-novalidate
was added in openssh-portable 8aa2aa3cd (upstream: Allow testing
signature syntax and validity without verifying, 2019-09-16).
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The prereq guard added in 14903c8e92 (trace2 tests: guard pthread test
with "PTHREAD", 2022-11-24) lacks the S in PTHREADS, causing it to never
be satisfied. Fix the spelling of the prereq.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In e0a862fdaf (submodule helper: convert relative URL to absolute URL if
needed, 2018-10-16), `prepare_to_clone_next_submodule()` lost the
ability to handle URL-less submodules, due to a change from:
if (repo_get_config_string_const(the_repostiory, sb.buf, &url))
url = sub->url;
to
if (repo_get_config_string_const(the_repostiory, sb.buf, &url)) {
if (starts_with_dot_slash(sub->url) ||
starts_with_dot_dot_slash(sub->url)) {
/* ... */
}
}
, which will segfault when `sub->url` is NULL, since both
`starts_with_dot_slash()` does not guard its arguments as non-NULL.
Guard the checks to both of the above functions by first checking
whether `sub->url` is non-NULL. There is no need to check whether `sub`
itself is NULL, since we already perform this check earlier in
`prepare_to_clone_next_submodule()`.
By adding a NULL-ness check on `sub->url`, we'll fall into the 'else'
branch, setting `url` to `sub->url` (which is NULL). Before attempting
to invoke `git submodule--helper clone`, check whether `url` is NULL,
and die() if it is.
Reported-by: Tribo Dar <3bodar@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git ls-files --format" can be used to format the output of
multiple file entries in the index, while "git ls-tree --format"
can be used to format the contents of a tree object. However,
the current set of %(objecttype), "(objectsize)", and
"%(objectsize:padded)" atoms supported by "git ls-files --format"
is a subset of what is available in "git ls-tree --format".
Users sometimes need to establish a unified view between the index
and tree, which can help with comparison or conversion between the two.
Therefore, this patch adds the missing atoms to "git ls-files --format".
"%(objecttype)" can be used to retrieve the object type corresponding
to a file in the index, "(objectsize)" can be used to retrieve the
object size corresponding to a file in the index, and "%(objectsize:padded)"
is the same as "(objectsize)", except with padded format.
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pseudoref AUTO_MERGE is documented since the previous commit. To
make it easier to use, let __git_refs in the Bash completion code
complete it.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 5291828df8 (merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a
conflict, 2021-03-20), when using the 'ort' merge strategy, the special
ref AUTO_MERGE is written when a merge operation results in conflicts.
This ref points to a tree recording the conflicted state of the working
tree and is very useful during conflict resolution. However, this ref is
not documented.
Add some documentation for AUTO_MERGE in git-diff(1), git-merge(1),
gitrevisions(7) and in the user manual.
In git-diff(1), mention it at the end of the description section, when
we mention that the command also accepts trees instead of commits, and
also add an invocation to the "Various ways to check your working tree"
example.
In git-merge(1), add a step to the list of things that happen "when it
is not obvious how to reconcile the changes", under the "True merge"
section. Also mention AUTO_MERGE in the "How to resolve conflicts"
section, when mentioning 'git diff'.
In gitrevisions(7), add a mention of AUTO_MERGE along with the other
special refs.
In the user manual, add a paragraph describing AUTO_MERGE to the
"Getting conflict-resolution help during a merge" section, and include
an example of a 'git diff AUTO_MERGE' invocation for the example
conflict used in that section. Note that for uniformity we do not use
backticks around AUTO_MERGE here since the rest of the document does not
typeset special refs differently.
Closes: https://github.com/gitgitgadget/git/issues/1471
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "True merge" section of the 'git merge' documentation mentions that
in case of conflicts, the conflicted working tree files contain "the
result of the "merge" program". This probably refers to RCS's 'merge'
program, which is mentioned further down under "How conflicts are
presented".
Since it is not clear at that point of the document which program is
referred to, and since most modern readers probably do not relate to RCS
anyway, let's just write "the merge operation" instead.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pseudorefs REVERT_HEAD and BISECT_HEAD are not suggested
by the __git_refs function. Add them there.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some special refs, namely HEAD, FETCH_HEAD, ORIG_HEAD, MERGE_HEAD and
CHERRY_PICK_HEAD, are mentioned and described in 'gitrevisions', but some
others, namely REBASE_HEAD, REVERT_HEAD, and BISECT_HEAD, are not.
Add a small description of these special refs.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The special refs listed in 'gitrevisions' (under the '<refname>' entry)
are on separate lines in the Asciidoc source, but end up as a single
continuous paragraph in the rendered documentation (see e.g. [1]). In
following commits we will mention additional special refs, so to improve
legibility, use a description list such that every entry appears on its
own line. Since we are already in a description list, use ':::' as the
term delimiter.
In order for the new description list to be aligned with the description
under the '<refname>' entry, instead of being aligned with the last
entry of the "in the following rules" nested list, use the "ancestor
list continuation" syntax [2], i.e., leave an empty line before the
continuation '+'. Do the same for the paragraph following the new
description list ("Note that any...").
While at it, also use a continuation '+' before the "in the following
rules" list, for correctness. The parser seems not to care here, but
it's best to keep the sources correct.
[1] https://git-scm.com/docs/gitrevisions#Documentation/gitrevisions.txt-emltrefnamegtemegemmasterememheadsmasterememrefsheadsmasterem
[2] https://docs.asciidoctor.org/asciidoc/latest/lists/continuation/#ancestor-list-continuation
Suggested-by: Victoria Dye <vdye@github.com>
Based-on-patch-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests still use the old format with four spaces indentation.
Standardize the tests to the new format with tab indentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests in t1006-cat-file.sh used the older four space indent format.
Update these to use tabs.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests are still using the older four space indent format. Update
these to use tabs.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests are still using the older four space indent format. Update
these to use tabs.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests in t3210-pack-refs.sh used the older four space indent format.
Update these to use tabs.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests in t0030-stripspace.sh used the older four space indent
format. Update these to use tabs.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests in t0000-basic.sh used the older four space indent format.
Update these to use tabs.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are showing multiple patches with format-patch, we have to
repeatedly overwrite the rev.message_id field. We take care to avoid
leaking the old value by either freeing it, or adding it to
ref_message_ids, a string list of ids to reference in subsequent
messages.
But unfortunately we do leak the value via that string list. We try
to clear the string list, courtesy of 89f45cf4eb (format-patch: don't
leak "extra_headers" or "ref_message_ids", 2022-04-13). But since it was
initialized as "nodup", the string list doesn't realize it owns the
strings, and it leaks them.
We have two options here:
1. Continue to init with "nodup", but then tweak the value of
ref_message_ids.strdup_strings just before clearing.
2. Init with "dup", but use "append_nodup" when transferring ownership
of strings to the list. Clearing just works.
I picked the second here, as I think it calls attention to the tricky
part (transferring ownership via the nodup call).
There's one other related fix we have to do, though. We also insert the
result of clean_message_id() into the list. This _sometimes_ allocates
and sometimes does not, depending on whether we have to remove cruft
from the end of the string. Let's teach it to consistently return an
allocated string, so that the caller knows it must be freed.
There's no new test here, as the leak can already be seen in t4014.44 (as
well as others in that script). We can't mark all of t4014 as leak-free,
though, as there are other unrelated leaks that it triggers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While well-established, the output format of ls-remote was not actually
documented. This patch adds an OUTPUT section to the documentation
following the format of git-show-ref.txt (which has similar semantics).
Add a basic example immediately after this to solidify the 'normal'
output format.
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While it's good to have several examples to solidify the output pattern
and generally demonstrate how to use the command, most other EXAMPLES
sections (e.g., git-show-branch.txt, git-remote.txt) additionally
describe the problem/situation to which the example is applicable.
Follow this example in the ls-remote documentation.
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without `--refs`, this command will show peeled tags. Make this clearer
in the examples to further mitigate the possibility of surprises in
consuming scripts.
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --tags option is already demonstrated in the later example that
lists version-patterned tags. As it doesn't appear to add anything to
the documentation, it ought to be removed to keep the documentation
easier to read.
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The glossary defines 'ref' as the official name of the thing,
and the output from "git grep -e '<ref' Documentation/" shows
that most everybody uses <ref>, not <reference>. In addition,
the page already says <ref> in its SYNOPSIS section for the
command when it is used in the mode to follow the reflogs.
Strictly speaking, many references of these should be updated to
<commit> after adding an explanation on how these <commit>s are
discovered (i.e. we take <rev>, <glob>, or <ref> and starting from
these commits, follow their ancestry or reflog entries to list
commits), but that would be a lot bigger change I would rather not
to do in this patch, whose primary purpose is to make the existing
documentation more consistent.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Use inline-code syntax for options where appropriate.
- Use code blocks to clarify output format.
- Use 'OID' (for 'object ID') instead of 'SHA-1' as we support
different hashing algorithms these days.
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We may allocate a message-id string via gen_message_id(), but we never
free it, causing a small leak. This can be demonstrated by running t9001
with a leak-checking build. The offending test is the one touched by
3ece9bf0f9 (send-email: clear the $message_id after validation,
2023-05-17), but the leak is much older than that. The test was simply
unlucky enough to trigger the leaking code path for the first time.
We can fix this by freeing the string at the end of the function. We can
also re-mark the test script as leak-free, effectively reverting
20bd08aefb (t9001: mark the script as no longer leak checker clean,
2023-05-17).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index is read in 'cmd_diff_tree' at two points:
1. The first index read was added in fd66bcc31f (diff-tree: read the
index so attribute checks work in bare repositories, 2017-12-06) to deal
with reading '.gitattributes' content. 77efbb366a (attr: be careful
about sparse directories, 2021-09-08) established that, in a sparse
index, we do _not_ try to load a '.gitattributes' file from within a
sparse directory.
2. The second index access point is involved in rename detection,
specifically when reading from stdin.This was initially added in
f0c6b2a2fd ([PATCH] Optimize diff-tree -[CM]--stdin, 2005-05-27), where
'setup' was set to 'DIFF_SETUP_USE_SIZE_CACHE |DIFF_SETUP_USE_CACHE'.
That assignment was later modified to drop the'DIFF_SETUP_USE_CACHE' in
ff7fe37b05 (diff.c: move read_index() code back to the caller,
2018-08-13).However, 'DIFF_SETUP_USE_SIZE_CACHE' seems to be unused as
of 6e0b8ed6d3 (diff.c: do not use a separate "size cache"., 2007-05-07)
and nothing about 'detect_rename' otherwise indicates index usage.
Hence we can just set the requires-full-index to false for "diff-tree".
Add tests that verify that 'git diff-tree' behaves correctly when the
sparse index is enabled and test to ensure the index is not expanded.
The `p2000` tests demonstrate a ~98% execution time reduction for
'git diff-tree' using a sparse index:
Test before after
-----------------------------------------------------------------------
2000.94: git diff-tree HEAD (full-v3) 0.05 0.04 -20.0%
2000.95: git diff-tree HEAD (full-v4) 0.06 0.05 -16.7%
2000.96: git diff-tree HEAD (sparse-v3) 0.59 0.01 -98.3%
2000.97: git diff-tree HEAD (sparse-v4) 0.61 0.01 -98.4%
2000.98: git diff-tree HEAD -- f2/f4/a (full-v3) 0.05 0.05 +0.0%
2000.99: git diff-tree HEAD -- f2/f4/a (full-v4) 0.05 0.04 -20.0%
2000.100: git diff-tree HEAD -- f2/f4/a (sparse-v3) 0.58 0.01 -98.3%
2000.101: git diff-tree HEAD -- f2/f4/a (sparse-v4) 0.55 0.01 -98.2%
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a warning to `worktree add` when the command tries to reference
HEAD, there exist valid local branches, and the HEAD points to a
non-existent reference.
Current Behavior:
% git -C foo worktree list
/path/to/repo/foo dadc8e6dac [main]
/path/to/repo/foo_wt 0000000000 [badref]
% git -C foo worktree add ../wt1
Preparing worktree (new branch 'wt1')
HEAD is now at dadc8e6dac dummy commit
% git -C foo_wt worktree add ../wt2
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
%
New Behavior:
% git -C foo worktree list
/path/to/repo/foo dadc8e6dac [main]
/path/to/repo/foo_wt 0000000000 [badref]
% git -C foo worktree add ../wt1
Preparing worktree (new branch 'wt1')
HEAD is now at dadc8e6dac dummy commit
% git -C foo_wt worktree add ../wt2
warning: HEAD points to an invalid (or orphaned) reference.
HEAD path: '/path/to/repo/foo/.git/worktrees/foo_wt/HEAD'
HEAD contents: 'ref: refs/heads/badref'
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
%
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend DWIM to try to infer `--orphan` when in an empty repository. i.e.
a repository with an invalid/unborn HEAD, no local branches, and if
`--guess-remote` is used then no remote branches.
This behavior is equivalent to `git switch -c` or `git checkout -b` in
an empty repository.
Also warn the user (overriden with `-f`/`--force`) when they likely
intend to checkout a remote branch to the worktree but have not yet
fetched from the remote. i.e. when using `--guess-remote` and there is a
remote but no local or remote refs.
Current Behavior:
% git --no-pager branch --list --remotes
% git remote
origin
% git workree add ../main
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
% git workree add --guess-remote ../main
hint: If you meant to create a worktree containing a new orphan branch
[...]
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
% git fetch --quiet
% git --no-pager branch --list --remotes
origin/HEAD -> origin/main
origin/main
% git workree add --guess-remote ../main
Preparing worktree (new branch 'main')
branch 'main' set up to track 'origin/main'.
HEAD is now at dadc8e6dac commit message
%
New Behavior:
% git --no-pager branch --list --remotes
% git remote
origin
% git workree add ../main
No possible source branch, inferring '--orphan'
Preparing worktree (new branch 'main')
% git worktree remove ../main
% git workree add --guess-remote ../main
fatal: No local or remote refs exist despite at least one remote
present, stopping; use 'add -f' to overide or fetch a remote first
% git workree add --guess-remote -f ../main
No possible source branch, inferring '--orphan'
Preparing worktree (new branch 'main')
% git worktree remove ../main
% git fetch --quiet
% git --no-pager branch --list --remotes
origin/HEAD -> origin/main
origin/main
% git workree add --guess-remote ../main
Preparing worktree (new branch 'main')
branch 'main' set up to track 'origin/main'.
HEAD is now at dadc8e6dac commit message
%
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new advice/hint in `git worktree add` for when the user
tries to create a new worktree from a reference that doesn't exist.
Current Behavior:
% git init foo
Initialized empty Git repository in /path/to/foo/
% touch file
% git -C foo commit -q -a -m "test commit"
% git -C foo switch --orphan norefbranch
% git -C foo worktree add newbranch/
Preparing worktree (new branch 'newbranch')
fatal: invalid reference: HEAD
%
New Behavior:
% git init --bare foo
Initialized empty Git repository in /path/to/foo/
% touch file
% git -C foo commit -q -a -m "test commit"
% git -C foo switch --orphan norefbranch
% git -C foo worktree add newbranch/
Preparing worktree (new branch 'newbranch')
hint: If you meant to create a worktree containing a new orphan branch
hint: (branch with no commits) for this repository, you can do so
hint: using the --orphan option:
hint:
hint: git worktree add --orphan newbranch/
hint:
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
% git -C foo worktree add -b newbranch2 new_wt/
Preparing worktree (new branch 'newbranch')
hint: If you meant to create a worktree containing a new orphan branch
hint: (branch with no commits) for this repository, you can do so
hint: using the --orphan option:
hint:
hint: git worktree add --orphan -b newbranch2 new_wt/
hint:
hint: Disable this message with "git config advice.worktreeAddOrphan false"
fatal: invalid reference: HEAD
%
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add support for creating an orphan branch when adding a new worktree.
The functionality of this flag is equivalent to git switch's --orphan
option.
Current Behavior:
% git -C foo.git --no-pager branch -l
+ main
% git -C foo.git worktree add main/
Preparing worktree (new branch 'main')
HEAD is now at 6c93a75 a commit
%
% git init bar.git
Initialized empty Git repository in /path/to/bar.git/
% git -C bar.git --no-pager branch -l
% git -C bar.git worktree add main/
Preparing worktree (new branch 'main')
fatal: not a valid object name: 'HEAD'
%
New Behavior:
% git -C foo.git --no-pager branch -l
+ main
% git -C foo.git worktree add main/
Preparing worktree (new branch 'main')
HEAD is now at 6c93a75 a commit
%
% git init --bare bar.git
Initialized empty Git repository in /path/to/bar.git/
% git -C bar.git --no-pager branch -l
% git -C bar.git worktree add main/
Preparing worktree (new branch 'main')
fatal: invalid reference: HEAD
% git -C bar.git worktree add --orphan -b main/
Preparing worktree (new branch 'main')
% git -C bar.git worktree add --orphan -b newbranch worktreedir/
Preparing worktree (new branch 'newbranch')
%
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests to verify that the command performs operations the same with
`--quiet` as without it. Additionally verifies that all non-fatal output
is suppressed.
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pull duplicate test code into a function so that additional opt
combinations can be tested succinctly.
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document `-B` next to where `-b` is already documented to bring the
usage docs in line with other commands such as git checkout.
Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the parsed "submodule.fetchJobs" config value into the
`fetch_config` structure. This reduces our reliance on global variables
and further unifies the way we parse the configuration in git-fetch(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the parsed "fetch.parallel" config value into the `fetch_config`
structure. This reduces our reliance on global variables and further
unifies the way we parse the configuration in git-fetch(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the parsed "fetch.recurseSubmodules" config value into the
`fetch_config` structure. This reduces our reliance on global variables
and further unifies the way we parse the configuration in git-fetch(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the parsed "fetch.showForcedUpdaets" config value into the
`fetch_config` structure. This reduces our reliance on global variables
and further unifies the way we parse the configuration in git-fetch(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the parsed "fetch.pruneTags" config value into the `fetch_config`
structure. This reduces our reliance on global variables and further
unifies the way we parse the configuration in git-fetch(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the parsed "fetch.prune" config value into the `fetch_config`
structure. This reduces our reliance on global variables and further
unifies the way we parse the configuration in git-fetch(1).
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `fetch_config` structure currently only has a single member, which
is the `display_format`. We're about extend it to contain all parsed
config values and will thus need it available in most of the code.
Prepare for this change by passing through the `fetch_config` directly
instead of only passing its single member.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Drop the `NULL` check for `remote_ref` in `update_local_ref()`. The
function only has a single caller, and that caller always passes in a
non-`NULL` value.
This fixes a false positive in Coverity.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With 50957937f9 (fetch: introduce `display_format` enum, 2023-05-10), a
new enumeration was introduced to determine the display format that is
to be used by git-fetch(1). The `DISPLAY_FORMAT_UNKNOWN` value isn't
ever used though, and neither do we rely on the explicit `0` value for
initialization anywhere.
Remove the enum value.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ref transaction can fail after the user has written their tag
message. In particular, if there exists a tag `foo/bar` and `git tag -a
foo` is said then the command will only fail once it tries to write
`refs/tags/foo`, which is after the file has been unlinked.
Hold on to the message file for a little longer so that it is not
unlinked before the fatal error occurs.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The standard tag message file is unlinked if the tag is created.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document `TAG_EDITMSG` which we have told the user about on unsuccessful
command invocations since commit 3927bbe9a4 (tag: delete TAG_EDITMSG
only on successful tag, 2008-12-06).
Introduce this documentation since we are going to add tests for the
lifetime of this file in the case of command failure and success.
Use the documentation for `COMMIT_EDITMSG` from `git-commit.txt` as a
template since these two files share the same purpose.[1]
† 1: from commit 3927bbe9a4:
“ This matches the behavior of COMMIT_EDITMSG, which stays around
in case of error.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow users to be more selective over which refs to pack by adding an
--include option to git-pack-refs.
The existing options allow some measure of selectivity. By default
git-pack-refs packs all tags. --all can be used to include all refs,
and the previous commit added the ability to exclude certain refs with
--exclude.
While these knobs give the user some selection over which refs to pack,
it could be useful to give more control. For instance, a repository may
have a set of branches that are rarely updated and would benefit from
being packed. --include would allow the user to easily include a set of
branches to be packed while leaving everything else unpacked.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At GitLab, we have a system that creates ephemeral internal refs that
don't live long before getting deleted. Having an option to exclude
certain refs from a packed-refs file allows these internal references to
be deleted much more efficiently.
Add an --exclude option to the pack-refs builtin, and use the ref
exclusions API to exclude certain refs from being packed into the final
packed-refs file
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--all packs not just branch tips but anything under refs/ with the
exception of hidden refs and broken refs. Clarify this in the
documentation.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reachability bitmap coverage exists in a repository, Git will use a
different (and hopefully faster) traversal to compute revision walks.
Consider a set of positive and negative tips (which we'll refer to with
their standard bitmap parlance by "wants", and "haves"). In order to
figure out what objects exist between the tips, the existing traversal
in `prepare_bitmap_walk()` does something like:
1. Consider if we can even compute the set of objects with bitmaps,
and fall back to the usual traversal if we cannot. For example,
pathspec limiting traversals can't be computed using bitmaps (since
they don't know which objects are at which paths). The same is true
of certain kinds of non-trivial object filters.
2. If we can compute the traversal with bitmaps, partition the
(dereferenced) tips into two object lists, "haves", and "wants",
based on whether or not the objects have the UNINTERESTING flag,
respectively.
3. Fall back to the ordinary object traversal if either (a) there are
more than zero haves, none of which are in the bitmapped pack or
MIDX, or (b) there are no wants.
4. Construct a reachability bitmap for the "haves" side by walking
from the revision tips down to any existing bitmaps, OR-ing in any
bitmaps as they are found.
5. Then do the same for the "wants" side, stopping at any objects that
appear in the "haves" bitmap.
6. Filter the results if any object filter (that can be easily
computed with bitmaps alone) was given, and then return back to the
caller.
When there is good bitmap coverage relative to the traversal tips, this
walk is often significantly faster than an ordinary object traversal
because it can visit far fewer objects.
But in certain cases, it can be significantly *slower* than the usual
object traversal. Why? Because we need to compute complete bitmaps on
either side of the walk. If either one (or both) of the sides require
walking many (or all!) objects before they get to an existing bitmap,
the extra bitmap machinery is mostly or all overhead.
One of the benefits, however, is that even if the walk is slower, bitmap
traversals are guaranteed to provide an *exact* answer. Unlike the
traditional object traversal algorithm, which can over-count the results
by not opening trees for older commits, the bitmap walk builds an exact
reachability bitmap for either side, meaning the results are never
over-counted.
But producing non-exact results is OK for our traversal here (both in
the bitmap case and not), as long as the results are over-counted, not
under.
Relaxing the bitmap traversal to allow it to produce over-counted
results gives us the opportunity to make some significant improvements.
Instead of the above, the new algorithm only has to walk from the
*boundary* down to the nearest bitmap, instead of from each of the
UNINTERESTING tips.
The boundary-based approach still has degenerate cases, but we'll show
in a moment that it is often a significant improvement.
The new algorithm works as follows:
1. Build a (partial) bitmap of the haves side by first OR-ing any
bitmap(s) that already exist for UNINTERESTING commits between the
haves and the boundary.
2. For each commit along the boundary, add it as a fill-in traversal
tip (where the traversal terminates once an existing bitmap is
found), and perform fill-in traversal.
3. Build up a complete bitmap of the wants side as usual, stopping any
time we intersect the (partial) haves side.
4. Return the results.
And is more-or-less equivalent to using the *old* algorithm with this
invocation:
$ git rev-list --objects --use-bitmap-index $WANTS --not \
$(git rev-list --objects --boundary $WANTS --not $HAVES |
perl -lne 'print $1 if /^-(.*)/')
The new result performs significantly better in many cases, particularly
when the distance from the boundary commit(s) to an existing bitmap is
shorter than the distance from (all of) the have tips to the nearest
bitmapped commit.
Note that when using the old bitmap traversal algorithm, the results can
be *slower* than without bitmaps! Under the new algorithm, the result is
computed faster with bitmaps than without (at the cost of over-counting
the true number of objects in a similar fashion as the non-bitmap
traversal):
# (Computing the number of tagged objects not on any branches
# without bitmaps).
$ time git rev-list --count --objects --tags --not --branches
20
real 0m1.388s
user 0m1.092s
sys 0m0.296s
# (Computing the same query using the old bitmap traversal).
$ time git rev-list --count --objects --tags --not --branches --use-bitmap-index
19
real 0m22.709s
user 0m21.628s
sys 0m1.076s
# (this commit)
$ time git.compile rev-list --count --objects --tags --not --branches --use-bitmap-index
19
real 0m1.518s
user 0m1.234s
sys 0m0.284s
The new algorithm is still slower than not using bitmaps at all, but it
is nearly a 15-fold improvement over the existing traversal.
In a more realistic setting (using my local copy of git.git), I can
observe a similar (if more modest) speed-up:
$ argv="--count --objects --branches --not --tags"
hyperfine \
-n 'no bitmaps' "git.compile rev-list $argv" \
-n 'existing traversal' "git.compile rev-list --use-bitmap-index $argv" \
-n 'boundary traversal' "git.compile -c pack.useBitmapBoundaryTraversal=true rev-list --use-bitmap-index $argv"
Benchmark 1: no bitmaps
Time (mean ± σ): 124.6 ms ± 2.1 ms [User: 103.7 ms, System: 20.8 ms]
Range (min … max): 122.6 ms … 133.1 ms 22 runs
Benchmark 2: existing traversal
Time (mean ± σ): 368.6 ms ± 3.0 ms [User: 325.3 ms, System: 43.1 ms]
Range (min … max): 365.1 ms … 374.8 ms 10 runs
Benchmark 3: boundary traversal
Time (mean ± σ): 167.6 ms ± 0.9 ms [User: 139.5 ms, System: 27.9 ms]
Range (min … max): 166.1 ms … 169.2 ms 17 runs
Summary
'no bitmaps' ran
1.34 ± 0.02 times faster than 'boundary traversal'
2.96 ± 0.05 times faster than 'existing traversal'
Here, the new algorithm is also still slower than not using bitmaps, but
represents a more than 2-fold improvement over the existing traversal in
a more modest example.
Since this algorithm was originally written (nearly a year and a half
ago, at the time of writing), the bitmap lookup table shipped, making
the new algorithm's result more competitive. A few other future
directions for improving bitmap traversal times beyond not using bitmaps
at all:
- Decrease the cost to decompress and OR together many bitmaps
together (particularly when enumerating the uninteresting side of
the walk). Here we could explore more efficient bitmap storage
techniques, like Roaring+Run and/or use SIMD instructions to speed
up ORing them together.
- Store pseudo-merge bitmaps, which could allow us to OR together
fewer "summary" bitmaps (which would also help with the above).
Helped-by: Jeff King <peff@peff.net>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To prepare for the boundary-based bitmap walk to perform a fill-in
traversal using the boundary of either side as the tips, extract routine
used to perform fill-in traversal by `find_objects()` so that it can be
used in both places.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The object_array API has an OBJECT_ARRAY_INIT macro, but lacks a
function to initialize an object_array at a given location in memory.
Introduce `object_array_init()` to implement such a function.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sergey Organov noticed and reported "--patch --no-patch --raw"
behaves differently from just "--raw". It turns out that there are
a few interesting bugs in the implementation and documentation.
* First, the documentation for "--no-patch" was unclear that it
could be read to mean "--no-patch" countermands an earlier
"--patch" but not other things. The intention of "--no-patch"
ever since it was introduced at d09cd15d (diff: allow --no-patch
as synonym for -s, 2013-07-16) was to serve as a synonym for
"-s", so "--raw --patch --no-patch" should have produced no
output, but it can be (mis)read to allow showing only "--raw"
output.
* Then the interaction between "-s" and other format options were
poorly implemented. Modern versions of Git uses one bit each to
represent formatting options like "--patch", "--stat" in a single
output_format word, but for historical reasons, "-s" also is
represented as another bit in the same word. This allows two
interesting bugs to happen, and we have both X-<.
(1) After setting a format bit, then setting NO_OUTPUT with "-s",
the code to process another "--<format>" option drops the
NO_OUTPUT bit to allow output to be shown again. However,
the code to handle "-s" only set NO_OUTPUT without unsetting
format bits set earlier, so the earlier format bit got
revealed upon seeing the second "--<format>" option. This is
the problem Sergey observed.
(2) After setting NO_OUTPUT with "-s", code to process
"--<format>" option can forget to unset NO_OUTPUT, leaving
the command still silent.
It is tempting to change the meaning of "--no-patch" to mean
"disable only the patch format output" and reimplement "-s" as "not
showing anything", but it would be an end-user visible change in
behavior. Let's fix the interactions of these bits to first make
"-s" work as intended.
The fix is conceptually very simple.
* Whenever we set DIFF_FORMAT_FOO because we saw the "--foo"
option (e.g. DIFF_FORMAT_RAW is set when the "--raw" option is
given), we make sure we drop DIFF_FORMAT_NO_OUTPUT. We forgot to
do so in some of the options and caused (2) above.
* When processing "-s" option, we should not just set
DIFF_FORMAT_NO_OUTPUT bit, but clear other DIFF_FORMAT_* bits.
We didn't do so and retained format bits set by options
previously seen, causing (1) above.
It is even more tempting to lose NO_OUTPUT bit and instead take
output_format word being 0 as its replacement, but that would break
the mechanism "git show" uses to default to "--patch" output, where
the distinction between telling the command to be silent with "-s"
and having no output format specified on the command line matters,
and an explicit output format given on the command line should not
be "combined" with the default "--patch" format.
So, while we cannot lose the NO_OUTPUT bit, as a follow-up work, we
may want to replace it with OPTION_GIVEN bit, and
* make "--patch", "--raw", etc. set DIFF_FORMAT_$format bit and
DIFF_FORMAT_OPTION_GIVEN bit on for each format. "--no-raw",
etc. will set off DIFF_FORMAT_$format bit but still record the
fact that we saw an option from the command line by setting
DIFF_FORMAT_OPTION_GIVEN bit.
* make "-s" (and its synonym "--no-patch") clear all other bits
and set only the DIFF_FORMAT_OPTION_GIVEN bit on.
which I suspect would make the code much cleaner without breaking
any end-user expectations.
Once that is in place, transitioning "--no-patch" to mean the
counterpart of "--patch", just like "--no-raw" only defeats an
earlier "--raw", would be quite simple at the code level. The
social cost of migrating the end-user expectations might be too
great for it to be worth, but at least the "GIVEN" bit clean-up
alone may be worth it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These practices largely reflect what we are already doing on the mailing
list, which should help new Coccinelle authors and reviewers get up to
speed.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Drop "examples" since we actually use the patches.
- Drop sentences that could be headings instead
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-27 16:49:15 -07:00
606 changed files with 22864 additions and 13232 deletions
warning("cache client gave us a partial credential");
else{
remove_credential(&c);
remove_credential(&c,0);
cache_credential(&c,timeout);
}
}
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.