In 2.28-rc0, we corrected a bug that some repository extensions are
honored by mistake even in a version 0 repositories (these
configuration variables in extensions.* namespace were supposed to
have special meaning in repositories whose version numbers are 1 or
higher), but this was a bit too big a change.
* jn/v0-with-extensions-fix:
repository: allow repository format upgrade with extensions
Revert "check_repository_format_gently(): refuse extensions for old repositories"
Now that we officially permit repository extensions in repository
format v0, permit upgrading a repository with extensions from v0 to v1
as well.
For example, this means a repository where the user has set
"extensions.preciousObjects" can use "git fetch --filter=blob:none
origin" to upgrade the repository to use v1 and the partial clone
extension.
To avoid mistakes, continue to forbid repository format upgrades in v0
repositories with an unrecognized extension. This way, a v0 user
using a misspelled extension field gets a chance to correct the
mistake before updating to the less forgiving v1 format.
While we're here, make the error message for failure to upgrade the
repository format a bit shorter, and present it as an error, not a
warning.
Reported-by: Huan Huan Chen <huanhuanchen@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 14c7fa269e.
The core.repositoryFormatVersion field was introduced in ab9cb76f66
(Repository format version check., 2005-11-25), providing a welcome
bit of forward compatibility, thanks to some welcome analysis by
Martin Atukunda. The semantics are simple: a repository with
core.repositoryFormatVersion set to 0 should be comprehensible by all
Git implementations in active use; and Git implementations should
error out early instead of trying to act on Git repositories with
higher core.repositoryFormatVersion values representing new formats
that they do not understand.
A new repository format did not need to be defined until 00a09d57eb
(introduce "extensions" form of core.repositoryformatversion,
2015-06-23). This provided a finer-grained extension mechanism for
Git repositories. In a repository with core.repositoryFormatVersion
set to 1, Git implementations can act on "extensions.*" settings that
modify how a repository is interpreted. In repository format version
1, unrecognized extensions settings cause Git to error out.
What happens if a user sets an extension setting but forgets to
increase the repository format version to 1? The extension settings
were still recognized in that case; worse, unrecognized extensions
settings do *not* cause Git to error out. So combining repository
format version 0 with extensions settings produces in some sense the
worst of both worlds.
To improve that situation, since 14c7fa269e
(check_repository_format_gently(): refuse extensions for old
repositories, 2020-06-05) Git instead ignores extensions in v0 mode.
This way, v0 repositories get the historical (pre-2015) behavior and
maintain compatibility with Git implementations that do not know about
the v1 format. Unfortunately, users had been using this sort of
configuration and this behavior change came to many as a surprise:
- users of "git config --worktree" that had followed its advice
to enable extensions.worktreeConfig (without also increasing the
repository format version) would find their worktree configuration
no longer taking effect
- tools such as copybara[*] that had set extensions.partialClone in
existing repositories (without also increasing the repository format
version) would find that setting no longer taking effect
The behavior introduced in 14c7fa269e might be a good behavior if we
were traveling back in time to 2015, but we're far too late. For some
reason I thought that it was what had been originally implemented and
that it had regressed. Apologies for not doing my research when
14c7fa269e was under development.
Let's return to the behavior we've had since 2015: always act on
extensions.* settings, regardless of repository format version. While
we're here, include some tests to describe the effect on the "upgrade
repository version" code path.
[*] ca76c0b1e1
Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix to the code to produce progress bar, which is new in the
upcoming release.
* tb/commit-graph-no-check-oids:
commit-graph: fix "Collecting commits from input" progress line
The code to produce progress output from "git commit-graph --write"
had a few breakages, which have been fixed.
* sg/commit-graph-progress-fix:
commit-graph: fix "Writing out commit graph" progress counter
commit-graph: fix progress of reachable commits
When an aliased command, whose output is piped to a pager by git,
gets killed by a signal, the pager got into a funny state, which
has been corrected (again).
* ta/wait-on-aliased-commands-upon-signal:
Wait for child on signal death for aliases to externals
Wait for child on signal death for aliases to builtins
To display a progress line while reading commits from standard input
and looking them up, 5b6653e523 (builtin/commit-graph.c: dereference
tags in builtin, 2020-05-13) should have added a pair of
start_delayed_progress() and stop_progress() calls around the loop
reading stdin. Alas, the stop_progress() call ended up at the wrong
place, after write_commit_graph(), which does all the commit-graph
computation and writing, and has several progress lines of its own.
Consequently, that new
Collecting commits from input: 1234
progress line is overwritten by the first progress line shown by
write_commit_graph(), and its final "done" line is shown last, after
everything is finished:
$ { sleep 3 ; git rev-list -3 HEAD ; sleep 1 ; } | ~/src/git/git commit-graph write --stdin-commits
Expanding reachable commits in commit graph: 873402, done.
Writing out commit graph in 4 passes: 100% (3493608/3493608), done.
Collecting commits from input: 3, done.
Furthermore, that stop_progress() call was added after the 'cleanup'
label, where that loop reading stdin jumps in case of an error. In
case of invalid input this then results in the "done" line shown after
the error message:
$ { sleep 3 ; git rev-list -3 HEAD ; echo junk ; } | ~/src/git/git commit-graph write --stdin-commits
error: unexpected non-hex object ID: junk
Collecting commits from input: 3, done.
Move that stop_progress() call to the right place.
While at it, drop the unnecessary 'if (progress)' condition protecting
the stop_progress() call, because that function is prepared to handle
a NULL progress struct.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description of `git diff` goes through several different invocations
(numbering added by me):
1. git diff [<options>] [--] [<path>...]
2. git diff [<options>] --no-index [--] <path> <path>
3. git diff [<options>] --cached [<commit>] [--] [<path>...]
4. git diff [<options>] <commit> [--] [<path>...]
5. git diff [<options>] <commit> <commit> [--] [<path>...]
6. git diff [<options>] <commit>..<commit> [--] [<path>...]
7. git diff [<options>] <commit> <commit>... <commit> [--] [<path>...]
8. git diff [<options>] <commit>...<commit> [--] [<path>...]
It then goes on to say that "all of the <commit> in the above
description, except in the last two forms that use '..' notations, can
be any <tree>". The "last two" actually refers to 6 and 8. This got out
of sync in commit b7e10b2ca2 ("Documentation: usage for diff combined
commits", 2020-06-12) which added item 7 to the mix.
As a further complication, after b7e10b2ca2 we also have some potential
confusion around "the '..' notation". The "..[.]" in items 6 and 8 are
part of the rev notation, whereas the "..." in item 7 is manpage
language for "one or more".
Move item 6 down, i.e., to between 7 and 8, to restore the ordering.
Because 6 refers to 5 ("synonymous to the previous form") we need to
tweak the language a bit.
An added bonus of this commit is that we're trying to steer users away
from `git diff <commit>..<commit>` and moving it further down probably
doesn't hurt.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b7e10b2ca2 ("Documentation: usage for diff combined commits",
2020-06-12) modified the synopsis by adding an optional "[<commit>...]"
to
'git diff' [<options>] <commit> <commit> [--] [<path>...]
to effectively add
'git diff' [<options>] <commit> <commit>... <commit> [--] [<path>...]
as another valid invocation. Which makes sense.
Further down, in the description, it left the existing entry for
'git diff' [<options>] <commit> <commit> [--] [<path>...]
intact and added a new entry on
'git diff' [<options>] <commit> [<commit>...] <commit> [--] [<path>...]
where it says that "[t]his form is to view the results of a merge
commit" and details how "the first listed commit must be the merge
itself". But one possible instantiation of this form is `git diff
<commit> <commit>` for which the added text doesn't really apply.
Remove the brackets so that we lose this overlap between the two
descriptions. We can still use the more compact representation in the
synopsis.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git checkout" failed to catch an error from fstat() after updating
a path in the working tree.
* mt/entry-fstat-fallback-fix:
entry: check for fstat() errors after checkout
"fetch.writeCommitGraph" was enabled when "feature.experimental" is
asked for, but it was found to be a bit too risky even for bold
folks in its current shape. The configuration has been ejected, at
least for now, from the "experimental" feature set.
* jn/eject-fetch-write-commit-graph-out-of-experimental:
experimental: default to fetch.writeCommitGraph=false
When "fetch.writeCommitGraph" configuration is set in a shallow
repository and a fetch moves the shallow boundary, we wrote out
broken commit-graph files that do not match the reality, which has
been corrected.
* tb/fix-persistent-shallow:
commit.c: don't persist substituted parents when unshallowing
Recent update to "git diff" meant as a code clean-up introduced a
bug in its error handling code, which has been corrected.
* ct/diff-with-merge-base-clarification:
diff: check for merge bases before assigning sym->base
"git log -Lx,y:path --before=date" lost track of where the range
should be because it didn't take the changes made by the youngest
commits that are omitted from the output into account.
* rs/line-log-until:
revision: disable min_age optimization with line-log
"git send-email --in-reply-to=<msg>" did not use the In-Reply-To:
header with the value given from the command line, and let it be
overridden by the value on In-Reply-To: header in the messages
being sent out (if exists).
* ra/send-email-in-reply-to-from-command-line-wins:
send-email: restore --in-reply-to superseding behavior
The command line completion support (in contrib/) used to be
prepared to work with "set -u" but recent changes got a bit more
sloppy. This has been corrected.
* vs/completion-with-set-u:
completion: nounset mode fixes
We don't give a "::" for the list separator, but just a single ":". This
ends up rendering literally, "--apply: Use applying strategies ...". As
a follow-on error, the list continuation, "+", also ends up rendering
literally (because we don't have a list).
This was introduced in 52eb738d6b ("rebase: add an --am option",
2020-02-15) and survived the rename in 10cdb9f38a ("rebase: rename the
two primary rebase backends", 2020-02-15).
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
76ffbca71a (commit-graph: write Bloom filters to commit graph file,
2020-04-06) added two delayed progress lines to writing the Bloom
filter index and data chunk. This is wrong, because a single common
progress is used while writing all chunks, which is not updated while
writing these two new chunks, resulting in incomplete-looking "done"
lines:
Expanding reachable commits in commit graph: 888679, done.
Computing commit changed paths Bloom filters: 100% (888678/888678), done.
Writing out commit graph in 6 passes: 66% (3554712/5332068), done.
Use the common 'struct progress' instance while writing the Bloom
filter chunks as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To display a progress line while iterating over all refs,
d335ce8f24 (commit-graph.c: show progress of finding reachable
commits, 2020-05-13) should have added a pair of
start_delayed_progress() and stop_progress() calls around a
for_each_ref() invocation. Alas, the stop_progress() call ended up at
the wrong place, after write_commit_graph(), which does all the
commit-graph computation and writing, and has several progress lines
of its own. Consequently, that new
Collecting referenced commits: 123
progress line is overwritten by the first progress line shown by
write_commit_graph(), and its final "done" line is shown last, after
everything is finished:
Expanding reachable commits in commit graph: 344786, done.
Computing commit changed paths Bloom filters: 100% (344786/344786), done.
Collecting referenced commits: 154, done.
Move that stop_progress() call to the right place.
While at it, drop the unnecessary 'if (data.progress)' condition
protecting the stop_progress() call, because that function is prepared
to handle a NULL progress struct.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 11179eb311 ("entry.c: check if file exists after checkout",
2017-10-05) we started checking the result of the lstat() call done
after writing a file, to avoid writing garbage to the corresponding
cache entry. However, the code skips calling lstat() if it's possible
to use fstat() when it still has the file descriptor open. And when
calling fstat() we don't do the same error checking. To fix that, let
the callers of fstat_output() know when fstat() fails. In this case,
write_entry() will try to use lstat() and properly report an error if
that fails as well.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fetch.writeCommitGraph feature makes fetches write out a commit
graph file for the newly downloaded pack on fetch. This improves the
performance of various commands that would perform a revision walk and
eventually ought to be the default for everyone. To prepare for that
future, it's enabled by default for users that set
feature.experimental=true to experience such future defaults.
Alas, for --unshallow fetches from a shallow clone it runs into a
snag: by the time Git has fetched the new objects and is writing a
commit graph, it has performed a revision walk and r->parsed_objects
contains information about the shallow boundary from *before* the
fetch. The commit graph writing code is careful to avoid writing a
commit graph file in shallow repositories, but the new state is not
shallow, and the result is that from that point on, commands like "git
log" make use of a newly written commit graph file representing a
fictional history with the old shallow boundary.
We could fix this by making the commit graph writing code more careful
to avoid writing a commit graph that could have used any grafts or
shallow state, but it is possible that there are other pieces of
mutated state that fetch's commit graph writing code may be relying
on. So disable it in the feature.experimental configuration.
Google developers have been running in this configuration (by setting
fetch.writeCommitGraph=false in the system config) to work around this
bug since it was discovered in April. Once the fix lands, we'll
enable fetch.writeCommitGraph=true again to give it some early testing
before rolling out to a wider audience.
In other words:
- this patch only affects behavior with feature.experimental=true
- it makes feature.experimental match the configuration Google has
been using for the last few months, meaning it would leave users in
a better tested state than without it
- this should improve testing for other features guarded by
feature.experimental, by making feature.experimental safer to use
Reported-by: Jay Conrod <jayconrod@google.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
2020-04-22), Git knows how to reset stat-validity checks for the
$GIT_DIR/shallow file, allowing it to change between a shallow and
non-shallow state in the same process (e.g., in the case of 'git fetch
--unshallow').
However, when $GIT_DIR/shallow changes, Git does not alter or remove any
grafts (nor substituted parents) in memory.
This comes up in a "git fetch --unshallow" with fetch.writeCommitGraph
set to true. Ordinarily in a shallow repository (and before 37b9dcabfc,
even in this case), commit_graph_compatible() would return false,
indicating that the repository should not be used to write a
commit-graphs (since commit-graph files cannot represent a shallow
history). But since 37b9dcabfc, in an --unshallow operation that check
succeeds.
Thus even though the repository isn't shallow any longer (that is, we
have all of the objects), the in-core representation of those objects
still has munged parents at the shallow boundaries. When the
commit-graph write proceeds, we use the incorrect parentage, producing
wrong results.
There are two ways for a user to work around this: either (1) set
'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph after
unshallowing.
One way to fix this would be to reset the parsed object pool entirely
(flushing the cache and thus preventing subsequent reads from modifying
their parents) after unshallowing. That would produce a problem when
callers have a now-stale reference to the old pool, and so this patch
implements a different approach. Instead, attach a new bit to the pool,
'substituted_parent', which indicates if the repository *ever* stored a
commit which had its parents modified (i.e., the shallow boundary
prior to unshallowing).
This bit needs to be sticky because all reads subsequent to modifying a
commit's parents are unreliable when unshallowing. Modify the check in
'commit_graph_compatible' to take this bit into account, and correctly
avoid generating commit-graphs in this case, thus solving the bug.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Reported-by: Jay Conrod <jayconrod@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In symdiff_prepare(), we iterate over the set of parsed objects to pick
out any symmetric differences, including the left, right, and base
elements. We assign the results into pointers in a "struct symdiff", and
then complain if we didn't find a base, like so:
sym->left = rev->pending.objects[lpos].name;
sym->right = rev->pending.objects[rpos].name;
sym->base = rev->pending.objects[basepos].name;
if (basecount == 0)
die(_("%s...%s: no merge base"), sym->left, sym->right);
But the least lines are backwards. If basecount is 0, then basepos will
be -1, and we will access memory outside of the pending array. This
isn't usually that big a deal, since we don't do anything besides a
single pointer-sized read before exiting anyway, but it does violate the
C standard, and of course memory-checking tools like ASan complain.
Let's put the basecount check first. Note that we haveto split it from
the other assignments, since the die() relies on sym->left and
sym->right having been assigned (this isn't strictly necessary, but is
easier to read than dereferencing the pending array again).
Reported-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are running an alias to an external command, we want to wait for
that process to exit even after receiving ^C which normally kills the
git process. This is useful when the process is ignoring SIGINT (which
e.g. pagers often do), and then we don't want it to be killed.
Having an alias which invokes a pager is probably not common, but it can
be useful e.g. if you have an alias to a git command which uses a
subshell as one of the arguments (in which case you have to use an
external command, not an alias to a builtin).
This patch is similar to the previous commit, but the previous commit
fixed this only for aliases to builtins, while this commit does the same
for aliases to external commands. In addition to waiting after clean
like the previous commit, this also enables cleaning the child (that was
already enabled for aliases to builtins before the previous commit),
because wait_after_clean relies on it. Lastly, while the previous commit
fixed a regression, I don't think this has ever worked properly.
Signed-off-by: Trygve Aaberge <trygveaa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you hit ^C all the processes in the tree receives it. When a git
command uses a pager, git ignores this and waits until the pager quits.
However, when using an alias there is an additional process in the tree
which didn't ignore the signal. That caused it to exit which in turn
caused the pager to exit. This fixes that for aliases to builtins.
This was originally fixed in 46df6906 (execv_dashed_external: wait
for child on signal death, 2017-01-06), but was broken by ee4512ed
(trace2: create new combined trace facility, 2019-02-22) and then
b9140840 (git: avoid calling aliased builtins via their dashed form,
2019-07-29).
Signed-off-by: Trygve Aaberge <trygveaa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The effort to avoid using test_must_fail on non-git command continues.
* dl/test-must-fail-fixes-5:
lib-submodule-update: pass 'test_must_fail' as an argument
lib-submodule-update: prepend "git" to $command
lib-submodule-update: consolidate --recurse-submodules
lib-submodule-update: add space after function name
"git fast-export --anonymize" learned to take customized mapping to
allow its users to tweak its output more usable for debugging.
* jk/fast-export-anonym-alt:
fast-export: use local array to store anonymized oid
fast-export: anonymize "master" refname
fast-export: allow seeding the anonymized mapping
fast-export: add a "data" callback parameter to anonymize_str()
fast-export: move global "idents" anonymize hashmap into function
fast-export: use a flex array to store anonymized entries
fast-export: stop storing lengths in anonymized hashmaps
fast-export: tighten anonymize_mem() interface to handle only strings
fast-export: store anonymized oids as hex strings
fast-export: use xmemdupz() for anonymizing oids
t9351: derive anonymized tree checks from original repo
"git difftool" has trouble dealing with paths added to the index
with the intent-to-add bit.
* js/diff-files-i-t-a-fix-for-difftool:
difftool -d: ensure that intent-to-add files are handled correctly
diff-files --raw: show correct post-image of intent-to-add files
The name of the primary branch in existing repositories, and the
default name used for the first branch in newly created
repositories, is made configurable, so that we can eventually wean
ourselves off of the hardcoded 'master'.
* js/default-branch-name:
contrib: subtree: adjust test to change in fmt-merge-msg
testsvn: respect `init.defaultBranch`
remote: use the configured default branch name when appropriate
clone: use configured default branch name when appropriate
init: allow setting the default for the initial branch name via the config
init: allow specifying the initial branch name for the new repository
docs: add missing diamond brackets
submodule: fall back to remote's HEAD for missing remote.<name>.branch
send-pack/transport-helper: avoid mentioning a particular branch
fmt-merge-msg: stop treating `master` specially
The code to push changes over "dumb" HTTP had a bad interaction
with the commit reachability code due to incorrect allocation of
object flag bits, which has been corrected.
* bc/http-push-flagsfix:
http-push: ensure unforced pushes fail when data would be lost
The documentation and some tests have been adjusted for the recent
renaming of "pu" branch to "seen".
* js/pu-to-seen:
tests: reference `seen` wherever `pu` was referenced
docs: adjust the technical overview for the rename `pu` -> `seen`
docs: adjust for the recent rename of `pu` to `seen`
Add "git prune" to the completion (in contrib/), which could be
typed by end-users from the command line.
* jl/complete-git-prune:
bash-completion: add git-prune into bash completion
API cleanup for get_worktrees()
* es/get-worktrees-unsort:
worktree: drop get_worktrees() unused 'flags' argument
worktree: drop get_worktrees() special-purpose sorting option
CVS/SVN interface have been prepared for SHA-256 transition
* bc/sha-256-cvs-svn-updates:
git-cvsexportcommit: port to SHA-256
git-cvsimport: port to SHA-256
git-cvsserver: port to SHA-256
git-svn: set the OID length based on hash algorithm
perl: make SVN code hash independent
perl: make Git::IndexInfo work with SHA-256
perl: create and switch variables for hash constants
t/lib-git-svn: make hash size independent
t9101: make hash independent
t9104: make hash size independent
t9100: make test work with SHA-256
t9108: make test hash independent
t9168: make test hash independent
t9109: make test hash independent
A few fields in "struct commit" that do not have to always be
present have been moved to commit slabs.
* ak/commit-graph-to-slab:
commit-graph: minimize commit_graph_data_slab access
commit: move members graph_pos, generation to a slab
commit-graph: introduce commit_graph_data_slab
object: drop parsed_object_pool->commit_count
"git status" learned to report the status of sparse checkout.
* en/sparse-status:
git-prompt: include sparsity state as well
git-prompt: document how in-progress operations affect the prompt
wt-status: show sparse checkout status as well
SHA-256 migration work continues.
* bc/sha-256-part-2: (44 commits)
remote-testgit: adapt for object-format
bundle: detect hash algorithm when reading refs
t5300: pass --object-format to git index-pack
t5704: send object-format capability with SHA-256
t5703: use object-format serve option
t5702: offer an object-format capability in the test
t/helper: initialize the repository for test-sha1-array
remote-curl: avoid truncating refs with ls-remote
t1050: pass algorithm to index-pack when outside repo
builtin/index-pack: add option to specify hash algorithm
remote-curl: detect algorithm for dumb HTTP by size
builtin/ls-remote: initialize repository based on fetch
t5500: make hash independent
serve: advertise object-format capability for protocol v2
connect: parse v2 refs with correct hash algorithm
connect: pass full packet reader when parsing v2 refs
Documentation/technical: document object-format for protocol v2
t1302: expect repo format version 1 for SHA-256
builtin/show-index: provide options to determine hash algo
t5302: modernize test formatting
...
If one of the options --before, --min-age or --until is given,
limit_list() filters out younger commits early on. Line-log needs all
those commits to trace the movement of line ranges, though. Skip this
optimization if both are used together.
Reported-by: Мария Долгополова <dolgopolovamariia@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In https://github.com/git-for-windows/git/issues/2677, a `git difftool
-d` problem was reported. The underlying cause was a bug in `git
diff-files --raw` that we just fixed: it reported intent-to-add files
with the empty _tree_ as the post-image OID, when we need to show
an all-zero (or, "null") OID instead, to indicate to the caller that
they have to look at the worktree file.
The symptom of that problem shown by `git difftool` was this:
error: unable to read sha1 file of <path> (<empty-tree-OID>)
error: could not write '<filename>'
Make sure that the reported `difftool` problem stays fixed.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documented behavior of `git diff-files --raw` is to display
[...] 0{40} if creation, unmerged or "look at work tree".
on the right hand (i.e. postimage) side. This happens for files that
have unstaged modifications, and for files that are unmodified but
stat-dirty.
For intent-to-add files, we used to show the empty blob's hash instead.
In c26022ea8f (diff: convert diff_addremove to struct object_id,
2017-05-30), we made that worse by inadvertently changing that to the
hash of the empty tree.
Let's make the behavior consistent with files that have unstaged
modifications (which applies to intent-to-add files, too) by showing
all-zero values also for intent-to-add files.
Accordingly, this patch adjusts the expectations set by the regression
test introduced in feea6946a5 (diff-files: treat "i-t-a" files as
"not-in-index", 2020-06-20).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git send-email --in-reply-to= fails to override In-Reply-To email headers,
if they're present in the output of format-patch, even when explicitly
told to do so by the option --no-thread, which breaks the contract of the
command line switch option, per its man page.
"
--in-reply-to=<identifier>
Make the first mail (or all the mails with --no-thread) appear as
a reply to the given Message-Id, which avoids breaking threads to
provide a new patch series.
"
This patch fixes the aformentioned issue, by bringing --in-reply-to's old
overriding behavior back.
The test was donated by Carlo Marcelo Arenas Belón.
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When displaying cat-file usage, the fact that a <format> can
be specified is only visible when lookling at the --batch and
--batch-check options which are shown like this:
--batch[=<format>] show info and content of objects fed from the standard input
--batch-check[=<format>]
show info about objects fed from the standard input
It seems more coherent and improves discovery to also show it
on the usage line.
In the documentation the DESCRIPTION tells us that "The output
format can be overridden using the optional <format> argument",
but we can't see the <format> argument in the SYNOPSIS above
the description which is confusing.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Accessing unset variables results an errors when the shell is in
nounset/-u mode. This fixes the cases I've come across while using git
completion in a shell running in that mode for a while. It's hard to
tell if this is the complete set, but at least it improves things.
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're starting to stop treating `master' specially in fmt-merge-msg.
Adjust the test to reflect that change.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff-files" has been taught to say paths that are marked as
intent-to-add are new files, not modified from an empty blob.
* sk/diff-files-show-i-t-a-as-new:
diff-files: treat "i-t-a" files as "not-in-index"
A misdesigned strbuf_write_fd() function has been retired.
* rs/retire-strbuf-write-fd:
strbuf: remove unreferenced strbuf_write_fd method.
bugreport.c: replace strbuf_write_fd with write_in_full
An in-code comment in "git diff" has been updated.
* dl/diff-usage-comment-update:
builtin/diff: fix botched update of usage comment
builtin/diff: update usage comment
Allow runtime upgrade of the repository format version, which needs
to be done carefully.
There is a rather unpleasant backward compatibility worry with the
last step of this series, but it is the right thing to do in the
longer term.
* xl/upgrade-repo-format:
check_repository_format_gently(): refuse extensions for old repositories
sparse-checkout: upgrade repository to version 1 when enabling extension
fetch: allow adding a filter after initial clone
repository: add a helper function to perform repository format upgrade
Some older versions of gcc complain about this line:
builtin/fast-export.c:412:2: error: dereferencing type-punned pointer
will break strict-aliasing rules [-Werror=strict-aliasing]
put_be32(oid.hash + hashsz - 4, counter++);
^
This seems to be a false positive, as there's no type-punning at all
here. oid.hash is an array of unsigned char; when we pass it to a
function it decays to a pointer to unsigned char. We do take a void
pointer in put_be32(), but it's immediately aliased with another pointer
to unsigned char (and clearly the compiler is looking inside the inlined
put_be32(), since the warning doesn't happen with -O0).
This happens on gcc 4.8 and 4.9, but not later versions (I tested gcc 6,
7, 8, and 9).
We can work around it by using a local array instead of an object_id
struct. This is a little more intimate with the details of object_id,
but for whatever reason doesn't seem to trigger the compiler warning.
We can revert this patch once we decide that those gcc versions are too
old to care about for a warning like this (gcc 4.8 is the default
compiler for Ubuntu Trusty, which is out-of-support but not fully
end-of-life'd until April 2022).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running "fast-export --anonymize" will leave "refs/heads/master"
untouched in the output, for two reasons:
- it helped to have some known reference point between the original
and anonymized repository
- since it's historically the default branch name, it doesn't leak any
information
Now that we can ask fast-export to retain particular tokens, we have a
much better tool for the first one (because it works for any ref, not
just master).
For the second, the notion of "default branch name" is likely to become
configurable soon, at which point the name _does_ leak information.
Let's drop this special case in preparation.
Note that we have to adjust the test a bit, since it relied on using the
name "master" in the anonymized repos. We could just use
--anonymize-map=master to keep the same output, but then we wouldn't
know if it works because of our hard-coded master or because of the
explicit map.
So let's flip the test a bit, and confirm that we anonymize "master",
but keep "other" in the output.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After you anonymize a repository, it can be hard to find which commits
correspond between the original and the result, and thus hard to
reproduce commands that triggered bugs in the original.
Let's make it possible to seed the anonymization map. This lets users
either:
- mark names to be retained as-is, if they don't consider them secret
(in which case their original commands would just work)
- map names to new values, which lets them adapt the reproduction
recipe to the new names without revealing the originals
The implementation is fairly straight-forward. We already store each
anonymized token in a hashmap (so that the same token appearing twice is
converted to the same result). We can just introduce a new "seed"
hashmap which is consulted first.
This does make a few more promises to the user about how we'll anonymize
things (e.g., token-splitting pathnames). But it's unlikely that we'd
want to change those rules, even if the actual anonymization of a single
token changes. And it makes things much easier for the user, who can
unblind only a directory name without having to specify each path within
it.
One alternative to this approach would be to anonymize as we see fit,
and then dump the whole refname and pathname mappings to a file. This
does work, but it's a bit awkward to use (you have to manually dig the
items you care about out of the mapping).
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "fetch/clone" protocol has been updated to allow the server to
instruct the clients to grab pre-packaged packfile(s) in addition
to the packed object data coming over the wire.
* jt/cdn-offload:
upload-pack: fix a sparse '0 as NULL pointer' warning
upload-pack: send part of packfile response as uri
fetch-pack: support more than one pack lockfile
upload-pack: refactor reading of pack-objects out
Documentation: add Packfile URIs design doc
Documentation: order protocol v2 sections
http-fetch: support fetching packfiles by URL
http-fetch: refactor into function
http: refactor finish_http_pack_request()
http: use --stdin when indexing dumb HTTP pack
Rewrite of parts of the scripted "git submodule" Porcelain command
continues; this time it is "git submodule set-branch" subcommand's
turn.
* ss/submodule-set-branch-in-c:
submodule: port subcommand 'set-branch' from shell to C
"git merge-base --is-ancestor" is taught to take advantage of the
commit graph.
* ds/merge-base-is-ancestor-optim:
commit-reach: use fast logic in repo_in_merge_base
commit-reach: create repo_is_descendant_of()
Code clean-up around "git branch" with a minor bugfix.
* dl/branch-cleanup:
branch: don't mix --edit-description
t3200: test for specific errors
t3200: rename "expected" to "expect"
Code clean-up in the codepath that serves "git fetch" continues.
* cc/upload-pack-data-3:
upload-pack: refactor common code into do_got_oid()
upload-pack: move oldest_have to upload_pack_data
upload-pack: pass upload_pack_data to got_oid()
upload-pack: pass upload_pack_data to ok_to_give_up()
upload-pack: pass upload_pack_data to send_acks()
upload-pack: pass upload_pack_data to process_haves()
upload-pack: change allow_unadvertised_object_request to an enum
upload-pack: move allow_unadvertised_object_request to upload_pack_data
upload-pack: move extra_edge_obj to upload_pack_data
upload-pack: move shallow_nr to upload_pack_data
upload-pack: pass upload_pack_data to send_unshallow()
upload-pack: pass upload_pack_data to deepen_by_rev_list()
upload-pack: pass upload_pack_data to deepen()
upload-pack: pass upload_pack_data to send_shallow_list()
"git diff" used to take arguments in random and nonsense range
notation, e.g. "git diff A..B C", "git diff A..B C...D", etc.,
which has been cleaned up.
* ct/diff-with-merge-base-clarification:
Documentation: usage for diff combined commits
git diff: improve range handling
t/t3430: avoid undefined git diff behavior
Code clean-up of "git clean" resulted in a fix of recent
performance regression.
* en/clean-cleanups:
clean: optimize and document cases where we recurse into subdirectories
clean: consolidate handling of ignored parameters
dir, clean: avoid disallowed behavior
dir: fix a few confusing comments
As our test suite partially reflects how we work in the Git project, it
is natural that the branch name `pu` was used in a couple places.
Since that branch was renamed to `seen`, let's use the new name
consistently.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch tries to rewrite history a bit: the mail contents that have
been added to Git's source code are actually fixed, we cannot change
them in hindsight.
But as the `pu` branch _was_ renamed, and as the documents were added to
Git's source code not so much as historical record, but to describe the
status quo, let's pretend that we have a time machine and adjust the
provided information accordingly.
Where appropriate, quotes were added for readability.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As of "What's cooking in git.git (Jun 2020, #04; Mon, 22)", there is no
longer any `pu` branch, but a `seen` branch.
While we technically do not even need to update the manual pages, it
makes sense to update them because they clearly talk about branches in
git.git.
Please note that in two instances, this patch not only updates the
branch name, but also the description "(proposed updates)".
Where appropriate, quotes have been added for readability.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_worktrees() retrieves a list of all worktrees associated with a
repository, including the main worktree. The location of the main
worktree is determined by get_main_worktree() which needs to handle
three distinct cases for the main worktree after absolute-path
conversion:
* <bare-repository>/.
* <main-worktree>/.git/. (when $CWD is .git)
* <main-worktree>/.git (when $CWD is any worktree)
They all need to be normalized to just the <path> portion, dropping any
"/." or "/.git" suffix.
It turns out, however, that get_main_worktree() was only handling the
first and last cases, i.e.:
if (!strip_suffix(path, "/.git"))
strip_suffix(path, "/.");
This shortcoming was addressed by 45f274fbb1 (get_main_worktree(): allow
it to be called in the Git directory, 2020-02-23) by changing the logic
to:
strip_suffix(path, "/.");
if (!strip_suffix(path, "/.git"))
strip_suffix(path, "/.");
which makes the final strip_suffix() invocation dead-code.
Fix this oversight by enumerating the three distinct cases explicitly
rather than attempting to strip the suffix(es) incrementally.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default name of the initial branch in new repositories can now be
configured. The `testsvn` remote helper translates the remote Subversion
repository's branch name `trunk` to the hard-coded name `master`.
Clearly, the intention was to make the name align with Git's defaults.
So while we are not talking about a newly-created repository in the
`testsvn` context, it is a newly-created _Git_ repository, si it _still_
makes sense to use the overridden default name for the initial branch
whenever users configured it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When guessing the default branch name of a remote, and there are no refs
to guess from, we want to go with the preference specified by the user
for the fall-back, i.e. the default name to be used for the initial
branch of new repositories (because as far as the user is concerned, a
remote that has no branches yet is a new repository).
At the same time, when talking to an older Git server that does not
report a symref for `HEAD` (but instead reports a commit hash), let's
try to guess the configured default branch name first. If it does not
match the reported commit hash, let's fall back to `master` as before.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning a repository without any branches, Git chooses a default
branch name for the as-yet unborn branch.
As part of the implicit initialization of the local repository, Git just
learned to respect `init.defaultBranch` to choose a different initial
branch name. We now really want that branch name to be used as a
fall-back.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We just introduced the command-line option
`--initial-branch=<branch-name>` to allow initializing a new repository
with a different initial branch than the hard-coded one.
To allow users to override the initial branch name more permanently
(i.e. without having to specify the name manually for each and every
`git init` invocation), let's introduce the `init.defaultBranch` config
setting.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Don Goodman-Wilson <don@goodman-wilson.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a growing number of projects and companies desiring to change
the main branch name of their repositories (see e.g.
https://twitter.com/mislav/status/1270388510684598272 for background on
this).
To change that branch name for new repositories, currently the only way
to do that automatically is by copying all of Git's template directory,
then hard-coding the desired default branch name into the `.git/HEAD`
file, and then configuring `init.templateDir` to point to those copied
template files.
To make this process much less cumbersome, let's introduce a new option:
`--initial-branch=<branch-name>`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were a couple of instances in our manual pages that had an
opening diamond bracket without a corresponding closing one.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `remote.<name>.branch` is not configured, `git submodule update`
currently falls back to using the branch name `master`. A much better
idea, however, is to use the remote `HEAD`: on all Git servers running
reasonably recent Git versions, the symref `HEAD` points to the main
branch.
Note: t7419 demonstrates that there _might_ be use cases out there that
_expect_ `git submodule update --remote` to update submodules to the
remote `master` branch even if the remote `HEAD` points to another
branch. Arguably, this patch makes the behavior more intuitive, but
there is a slight possibility that this might cause regressions in
obscure setups.
Even so, it should be okay to fix this behavior without anything like a
longer transition period:
- The `git submodule update --remote` command is not really common.
- Current Git's behavior when running this command is outright
confusing, unless the remote repository's current branch _is_ `master`
(in which case the proposed behavior matches the old behavior).
- If a user encounters a regression due to the changed behavior, the fix
is actually trivial: setting `submodule.<name>.branch` to `master`
will reinstate the old behavior.
Helped-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When trying to push all matching branches, but none match, we offer a
message suggesting to push the `master` branch.
However, we want to step away from making that branch any more special
than any other branch, so let's reword that message to mention no branch
in particular.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bit fields in struct object have an unfortunate layout. Here's what
pahole reports on x86_64 GNU/Linux:
struct object {
unsigned int parsed:1; /* 0: 0 4 */
unsigned int type:3; /* 0: 1 4 */
/* XXX 28 bits hole, try to pack */
/* Force alignment to the next boundary: */
unsigned int :0;
unsigned int flags:29; /* 4: 0 4 */
/* XXX 3 bits hole, try to pack */
struct object_id oid; /* 8 32 */
/* size: 40, cachelines: 1, members: 4 */
/* sum members: 32 */
/* sum bitfield members: 33 bits, bit holes: 2, sum bit holes: 31 bits */
/* last cacheline: 40 bytes */
};
Notice the 1+3+29=33 bits in bit fields and 28+3=31 bits in holes.
There are holes inside the flags bit field as well -- while some object
flags are used for more than one purpose, 22, 23 and 24 are still free.
Use 23 and 24 instead of 27 and 28 for TOPO_WALK_EXPLORED and
TOPO_WALK_INDEGREE. This allows us to reduce FLAG_BITS by one so that
all bitfields combined fit into a single 32-bit slot:
struct object {
unsigned int parsed:1; /* 0: 0 4 */
unsigned int type:3; /* 0: 1 4 */
unsigned int flags:28; /* 0: 4 4 */
struct object_id oid; /* 4 32 */
/* size: 36, cachelines: 1, members: 4 */
/* last cacheline: 36 bytes */
};
With this tight packing the size of struct object is reduced by 10%.
Other architectures probably benefit as well.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we run a test helper function in test_submodule_switch_common(), we
sometimes specify a whole helper function as the $command. When we do
this, in some test cases, we just mark the whole function with
`test_must_fail`. However, it's possible that the helper function might
fail earlier or later than expected due to an introduced bug. If this
happens, then the test case will still report as passing but it should
really be marked as failing since it didn't actually display the
intended behaviour.
Instead of invoking `test_must_fail $command`, pass the string
"test_must_fail" as the second argument in case where the git command is
expected to fail.
When $command is a helper function, the parent function calling
test_submodule_switch_common() is test_submodule_switch_func(). For all
test_submodule_switch_func() invocations, increase the granularity of
the argument test helper function by prefixing the git invocation which is
meant to fail with the second argument like this:
$2 git checkout "$1"
In the other cases, test_submodule_switch() and
test_submodule_forced_switch(), instead of passing in the git command
directly, wrap it using the git_test_func() and pass the git arguments
using the global variable $gitcmd. Unfortunately, since closures aren't
a thing in shell scripts, the global variable is necessary. Another
unfortunate result is that the "git_test_func" will used as the test
case name when $command is printed but it's worth it for the cleaner
code.
Finally, as an added bonus, `test_must_fail` will now only run on git
commands.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The anonymize_str() function takes a generator callback, but there's no
way to pass extra context to it. Let's add the usual "void *data"
parameter to the generator interface and pass it along.
This is mildly annoying for existing callers, all of which pass NULL,
but is necessary to avoid extra globals in some cases we'll add in a
subsequent patch.
While we're touching each of these callbacks, we can further observe
that none of them use the existing orig/len parameters at all. This
makes sense, since the point is for their output to have no discernable
basis in the original (my original version had some notion that we might
use a one-way function to obfuscate the names, but it was never
implemented). So let's drop those extra parameters. If a caller really
wants to do something with them, it can pass a struct through the new
data parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the other anonymization functions keep their static mappings
inside the function to avoid polluting the global namespace. Let's do
the same for "idents", as nobody needs it outside of
anonymize_ident_line().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we're using a separate keydata struct for hash lookups, we have
more flexibility in how we allocate anonymized_entry structs. Let's push
the "orig" key into a flex member within the struct. That should save us
a few bytes of memory per entry (a pointer plus any malloc overhead),
and may make lookups a little faster (since it's one less pointer to
chase in the comparison function).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the anonymize_str() interface is restricted to NUL-terminated
strings, there's no need for us to keep track of the length of each
entry in the hashmap. This simplifies the code and saves a bit of
memory.
Note that we do still need to compare the stored results to partial
strings passed in by the callers. We can do that by using hashmap's
keydata feature to get the ptr/len pair into the comparison function,
and then using strncmp().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the anonymize_mem() interface _can_ store arbitrary byte
sequences, none of the callers uses this feature (as of the previous
commit). We'd like to keep it that way, as we'll be exposing the
string-like nature of the anonymization routines to the user. So let's
tighten up the interface a bit:
- don't treat "len" as an out-parameter from anonymize_mem(); this
ensures callers treat the pointer result as a NUL-terminated string
- likewise, don't treat "len" as an out-parameter from generator
functions
- swap out "void *" for "char *" as appropriate to signal that we
don't handle arbitrary memory
- rename the function to anonymize_str()
This will also open up some optimization opportunities in a future
patch.
Note that we can't drop the "len" parameter entirely. Some callers do
pass in partial strings (e.g., "foo/bar", len=3) to avoid copying, and
we need to handle those still.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fast-export stores anonymized oids, it does so as binary strings.
And while the anonymous mapping storage is binary-clean (at least as of
the previous commit), this will become awkward when we start exposing
more of it to the user. In particular, if we allow a method for
retaining token "foo", then users may want to specify a hex oid as such
a token.
Let's just switch to storing the hex strings. The difference in memory
usage is negligible (especially considering how infrequently we'd
generally store an oid compared to, say, path components).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our anonymize_mem() function is careful to take a ptr/len pair to allow
storing binary tokens like object ids, as well as partial strings (e.g.,
just "foo" of "foo/bar"). But it duplicates the hash key using
xstrdup()! That means that:
- for a partial string, we'd store all bytes up to the NUL, even
though we'd never look at anything past "len". This didn't produce
wrong behavior, but was wasteful.
- for a binary oid that doesn't contain a zero byte, we'd copy garbage
bytes off the end of the array (though as long as nothing complained
about reading uninitialized bytes, further reads would be limited by
"len", and we'd produce the correct results)
- for a binary oid that does contain a zero byte, we'd copy _fewer_
bytes than intended into the hashmap struct. When we later try to
look up a value, we'd access uninitialized memory and potentially
falsely claim that a particular oid is not present.
The most common reason to store an oid is an anonymized gitlink, but our
test case doesn't have any gitlinks at all. So let's add one whose oid
contains a NUL and is present at two different paths. ASan catches the
memory error, but even without it we can detect the bug because the oid
is not anonymized the same way for both paths.
And of course the fix is to copy the correct number of bytes. We don't
technically need the appended NUL from xmemdupz(), but it doesn't hurt
as an extra protection against anybody treating it like a string (plus a
future patch will push us more in that direction).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our tests of the anonymized repo just hard-code the expected set of
objects in the root and subdirectory trees. This makes them brittle to
the test setup changing (e.g., adding new paths that need tested).
Let's look at the original repo to compute our expected set of objects.
Note that this isn't completely perfect (e.g., we still rely on there
being only one tree in the root), but it does simplify later patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the context of many projects renaming their primary branch names away
from `master`, Git wants to stop treating the `master` branch specially.
Let's start with `git fmt-merge-msg`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, an attempt was made to correct the "N=1, M=0"
case. However, the fix was botched and it introduced two half-correct
sections by mistake. Combine these half-correct sections into one fully
correct section.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
d91d6fbf26 (commit-reach: create repo_is_descendant_of(), 2020-06-17)
adds a repository aware version of is_descendant_of() and a backward
compatibility shim that is barely used.
Update all callers to directly use the new repo_is_descendant_of()
function instead; making the codebase simpler and pushing more
the_repository references higher up the stack.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we push using the DAV-based protocol, the client is the one that
performs the ref updates and therefore makes the checks to see whether
an unforced push should be allowed. We make this check by determining
if either (a) we lack the object file for the old value of the ref or
(b) the new value of the ref is not newer than the old value, and in
either case, reject the push.
However, the ref_newer function, which performs this latter check, has
an odd behavior due to the reuse of certain object flags. Specifically,
it will incorrectly return false in its first invocation and then
correctly return true on a subsequent invocation. This occurs because
the object flags used by http-push.c are the same as those used by
commit-reach.c, which implements ref_newer, and one piece of code
misinterprets the flags set by the other.
Note that this does not occur in all cases. For example, if the example
used in the tests is changed to use one repository instead of two and
rewind the head to add a commit, the test passes and we correctly reject
the push. However, the example provided does trigger this behavior, and
the code has been broken in this way since at least Git 2.0.0.
To solve this problem, let's move the two sets of object flags so that
they don't overlap, since we're clearly using them at the same time.
The new set should not conflict with other usage because other users are
either builtin code (which is not compiled into git http-push) or
upload-pack (which we similarly do not use here).
Reported-by: Michael Ward <mward@smartsoftwareinc.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The effect of sparse checkout settings on submodules is documented.
* en/sparse-with-submodule-doc:
git-sparse-checkout: clarify interactions with submodules
The same worktree directory must be registered only once, but
"git worktree move" allowed this invariant to be violated, which
has been corrected.
* es/worktree-duplicate-paths:
worktree: make "move" refuse to move atop missing registered worktree
worktree: generalize candidate worktree path validation
worktree: prune linked worktree referencing main worktree path
worktree: prune duplicate entries referencing same worktree path
worktree: make high-level pruning re-usable
worktree: give "should be pruned?" function more meaningful name
worktree: factor out repeated string literal
The interface to redact sensitive information in the trace output
has been simplified.
* jt/redact-all-cookies:
http: redact all cookies, teach GIT_TRACE_REDACT=0
Further code clean-up.
* cc/upload-pack-data-2:
upload-pack: move pack_objects_hook to upload_pack_data
upload-pack: move allow_sideband_all to upload_pack_data
upload-pack: move allow_ref_in_want to upload_pack_data
upload-pack: move allow_filter to upload_pack_data
upload-pack: move keepalive to upload_pack_data
upload-pack: pass upload_pack_data to upload_pack_config()
upload-pack: change multi_ack to an enum
upload-pack: move multi_ack to upload_pack_data
upload-pack: move filter_capability_requested to upload_pack_data
upload-pack: move use_sideband to upload_pack_data
upload-pack: move static vars to upload_pack_data
upload-pack: annotate upload_pack_data fields
upload-pack: actually use some upload_pack_data bitfields
Sometimes git would suggest the user to run `git prune` when there are
too many unreachable loose objects. It's more user-friendly if we add
git-prune into bash completion.
Signed-off-by: John Lin <johnlinp@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we apply a binary patch, we must have the full object ID in the
header in order to apply it; without that, any attempt to apply it will
fail. If we set GIT_DIR to empty, git apply does not know about the
hash algorithm we're using, and consequently any attempt to apply a
patch using SHA-256 will fail, since the object ID is the wrong length.
The reason we set the GIT_DIR environment variable is because we don't
want to modify the index; we just want to know whether the patch
applies. Instead, let's just use a temporary file for the index, which
will be cleaned up automatically when the object goes out of scope.
Additionally, read the configuration for the repository and compute the
length of an object ID based on it. Use that when matching object IDs
with a regex or computing the all-zeros object ID.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of calling the function is_sha1, call it is_oid and update it to
match either a SHA-1 or a SHA-256 hex object ID.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code of git-cvsserver currently has several hard-coded 20 and 40
constants that are the length of SHA-1. When parsing the configuration
file, read the extensions.objectformat configuration setting as well as
CVS-related ones and adjust the hash sizes accordingly. Use these
computed values in all the places we match object IDs.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading the configuration or when creating a new repository, load
the extensions.objectFormat value and set the object ID length to 64 if
it's "sha256". Note that we use the hex length in git-svn because most
of our processing is done on hex values, not binary ones.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are several places throughout git-svn that use various hard-coded
constants. For matching object IDs, use the $oid variable. Compute the
record size we use for our revision storage based on the object ID.
When parsing the revision map format, use a wildcard in the pack format
since we know that the data we're parsing is always exactly the record
size. This lets us continue to use a constant for the pack format.
Finally, update several comments to reflect the fact that an object ID
may be of one of multiple sizes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the Git modules, git-svn excepted, don't know anything about the
hash algorithm and mostly work. However, when we're printing an
all-zero object ID in Git::IndexInfo, we need to know the hash length.
Since we don't want to change the API to have that information passed
in, let's query the config to find the hash algorithm and compute the
right value.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-svn has several variables for SHA-1 constants, including short hash
values and full length hash values. Since these are no longer SHA-1
specific, let's start them with "oid" instead of "sha1". Add a
constant, oid_length, which is the length of the hash algorithm in use
in hex. We use the hex version because overwhelmingly that's what's
used by git-svn.
We don't currently set oid_length based on the repository algorithm, but
we will in a future commit.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The record size used in the git svn storage is four bytes plus the
length of the binary hash. Pass the hash length into our Perl
invocation and use it to compute the size of the records.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `diff-files' command and related commands which call the function
`cmd_diff_files()', consider the "intent-to-add" files as a part of the
index when comparing the work-tree against it. This was previously
addressed in commits [1] and [2] by turning the option
`--ita-invisible-in-index' (introduced in [3]) on by default.
For `diff-files' (and `add -p' as a consequence) to show the i-t-a
files as as new, `ita_invisible_in_index' will be enabled by default
here as well.
[1] 0231ae71d3 (diff: turn --ita-invisible-in-index on by default,
2018-05-26)
[2] 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist
in index", 2016-10-24)
[3] b42b451919 (diff: add --ita-[in]visible-in-index, 2016-10-24)
Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_worktrees() accepts a 'flags' argument, however, there are no
existing flags (the lone flag GWT_SORT_LINKED was recently retired) and
no behavior which can be tweaked. Therefore, drop the 'flags' argument.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Of all the clients of get_worktrees(), only "git worktree list" wants
the list sorted in a very specific way; other clients simply don't care
about the order. Rather than imbuing get_worktrees() with special
knowledge about how various clients -- now and in the future -- may want
the list sorted, drop the sorting capability altogether and make it the
client's responsibility to sort the list if needed.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding the object ID for our test .gitignore file, let's
compute it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The size of a record in the database used by git svn is four bytes plus
the length of the binary hash. Instead of hard-coding 24, compute this
value based on the size of the hash in use.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compute the relevant tree objects for SHA-256 and use those when
appropriate instead of using the SHA-1 ones.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of stripping off the first 41 characters of git log output,
let's just strip off the first space-separated component, which will
work for any size hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of stripping off the first 41 characters of git log output,
let's just strip off the first space-separated component, which will
work for any size hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of stripping off the first 41 characters of git log output,
let's just strip off the first space-separated component, which will
work for any size hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-prompt includes the current branch, a bunch of single character
mini-state displayers, and some much longer in-progress state
notifications. The current branch is always shown. The single
character mini-state displayers are all off by default (they are not
self explanatory) but each has an environment variable for turning it
on. The in-progress state notifications provide no configuration
options for turning them off, and can be up to 15 characters long (e.g.
"|REBASE (12/18)" or "|CHERRY-PICKING").
The single character mini-state tends to be used for things like "Do you
have any stashes in refs/stash?" or "Are you ahead or behind of
upstream?". These are things which users can take advantage of but do
not affect most normal git operations. The in-progress states, by
contrast, suggest the user needs to interact differently and may also
prevent some normal operations from succeeding (e.g. git switch may show
an error instead of switching branches).
Sparsity is like the in-progress states in that it suggests a
fundamental different interaction with the repository (many of the files
from the repository are not present in your working copy!). A few
commits ago added sparsity information to wt_longstatus_print_state(),
grouping it with other in-progress state displays. We do similarly here
with the prompt and show the extra state, by default, with an extra
|SPARSE
This state can be present simultaneously with the in-progress states, in
which case it will appear before the other states; for example,
(branchname|SPARSE|REBASE 6/10)
The reason for showing the "|SPARSE" substring before other states is to
emphasize those other states. Sparsity is probably not going to change
much within a repository, while temporary operations will. So we want
the state changes related to temporary operations to be listed last, to
make them appear closer to where the user types and make them more
likely to be noticed.
The fact that sparsity isn't just cached metadata or additional
information is what leads us to show it more similarly to the
in-progress states, but the fact that sparsity is not transient like the
in-progress states might cause some users to want an abbreviated
notification of sparsity state or perhaps even be able to turn it off.
Allow GIT_PS1_COMPRESSSPARSESTATE to be set to request that it be
shortened to a single character ('?'), and GIT_PS1_OMITSPARSESTATE to be
set to request that sparsity state be omitted from the prompt entirely.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using an algorithm other than SHA-1, we need the remote helper to
advertise support for the object-format extension and provide
information back to us so that we can properly parse refs and return
data. Ensure that the test remote helper understands these extensions.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much like with the dumb HTTP transport, there isn't a way to explicitly
specify the hash algorithm when dealing with a bundle, so detect the
algorithm based on the length of the object IDs in the prerequisites and
ref advertisements.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git index-pack by default reads the repository to determine the object
format. However, when outside of a repository, it's necessary to specify
the hash algorithm in use so that the pack can be properly indexed. Add
an --object-format argument when invoking git index-pack outside of a
repository.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we speak protocol v2 in this test, we must pass the object-format
header if the algorithm is not SHA-1. Otherwise, git upload-pack fails
because the hash algorithm doesn't match and not because we've failed to
speak the protocol correctly. Pass the header so that our assertions
test what we're really interested in.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're using an algorithm other than SHA-1, we need to specify the
algorithm in use so we don't get a failure with an "unknown format"
message. Add a wrapper function that specifies this header if required.
Skip specifying this header for SHA-1 to test that it works both with an
without this header.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to make this test work with SHA-256, offer an object-format
capability so that both sides use the same algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-sha1-array uses the_hash_algo under the hood. Since t0064 wants to
use the value that is correct for the hash algorithm that we're testing,
make sure the test helper initializes the repository to set
the_hash_algo correctly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Normally, the remote-curl transport helper is aware of the hash
algorithm we're using because we're in a repo with the appropriate hash
algorithm set. However, when using git ls-remote outside of a
repository, we won't have initialized the hash algorithm properly, so
use hash_to_hex_algop to print the ref corresponding to the algorithm
we've detected.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When outside a repository, git index-pack is unable to guess the hash
algorithm in use for a pack, since packs don't contain any information
on the algorithm in use. Pass an option to index-pack to help it out in
this test.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git index-pack is usually run in a repository, but need not be. Since
packs don't contains information on the algorithm in use, instead
relying on context, add an option to index-pack to tell it which one
we're using in case someone runs it outside of a repository. Since
using --stdin necessarily implies a repository, don't allow specifying
an object format if it's provided to prevent users from passing an
option that won't work. Add documentation for this option.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading the info/refs file for a repository, we have no explicit
way to detect which hash algorithm is in use because the file doesn't
provide one. Detect the hash algorithm in use by the size of the first
object ID.
If we have an empty repository, we don't know what the hash algorithm is
on the remote side, so default to whatever the local side has
configured. Without doing this, we cannot clone an empty repository
since we don't know its hash algorithm. Test this case appropriately,
since we currently have no tests for cloning an empty repository with
the dumb HTTP protocol.
We anonymize the URL like elsewhere in the function in case the user has
decided to include a secret in the URL.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf_write_fd was only used in bugreport.c. Since that file now uses
write_in_full, this method is no longer needed. In addition, strbuf_write_fd
did not guard against exceeding MAX_IO_SIZE for the platform, nor
provided error handling in the event of a failure if only partial data
was written to the file descriptor. Since already write_in_full has this
capability and is in general use, it should be used instead. The change
impacts strbuf.c and strbuf.h.
Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strbuf_write_fd method did not provide checks for buffers larger
than MAX_IO_SIZE. Replacing with write_in_full ensures the entire
buffer will always be written to disk or report an error and die.
Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_pull() builds a commit_list to pass a single potential ancestor to
is_descendant_of(). The latter leaves the list intact. Release the
allocated memory after the call.
Leaking in cmd_*() isn't a big deal, but sets a bad example for other
users of is_descendant_of().
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ref_newer() builds a commit_list to pass a single potential ancestor to
is_descendant_of(). The latter leaves the list intact. Release the
allocated memory after the call.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level reference transactions used to update references are
currently completely opaque to the user. While certainly desirable in
most usecases, there are some which might want to hook into the
transaction to observe all queued reference updates as well as observing
the abortion or commit of a prepared transaction.
One such usecase would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.
The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.
Given the usecase described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, allowing to implement strong consistency for
reference updates via a single mechanism.
In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduce two new performance tests for git-update-refs(1). Run against
an empty repository, it produces the following results:
Test origin/master HEAD
--------------------------------------------------------------------
1400.2: update-ref 2.70(2.10+0.71) 2.71(2.10+0.73) +0.4%
1400.3: update-ref --stdin 0.21(0.09+0.11) 0.21(0.07+0.14) +0.0%
The performance test p1400.2 creates, updates and deletes a branch a
thousand times, thus averaging runtime of git-update-refs over 3000
invocations. p1400.3 instead calls `git-update-refs --stdin` three times
and queues a thousand creations, updates and deletes respectively.
As expected, p1400.3 consistently shows no noticeable impact, as for
each batch of updates there's a single call to access(3P) for the
negative hook lookup. On the other hand, for p1400.2, one can see an
impact caused by this patchset. But doing five runs of the performance
tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead
ranged from -1.5% to +1.1%. These inconsistent performance numbers can
be explained by the overhead of spawning 3000 processes. This shows that
the overhead of assembling the hook path and executing access(3P) once
to check if it's there is mostly outweighed by the operating system's
overhead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git branches have been qualified as topic branches, integration branches,
development branches, feature branches, release branches and so on.
Git has a branch that is the master *for* development, but it is not
the master *of* any "slave branch": Git does not have slave branches,
and has never had, except for a single testcase that claims otherwise. :)
Independent of any future change to the naming of the "master" branch,
removing this sole appearance of the term is a strict improvement: it
avoids divisive language, and talking about "feature branch" clarifies
which developer workflow the test is trying to emulate.
Reported-by: Till Maas <tmaas@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A comment in cmd_diff() states that if one tree-ish and no blobs are
provided, (the "N=1, M=0" case), it will provide a diff between the tree
and the cache. This is incorrect because a diff happens between the
tree-ish and the working tree. Remove the `--cached` in the comment so
that the correct behavior is shown. Add a new section describing the
"N=1, M=0, --cached" behavior.
Next, describe the "N=0, M=0, --cached" case, similar to the above since
it is undocumented.
Finally, fix some spacing issues. Add spaces between each section for
consistency and readability. Also, change tabs within the comment into
spaces.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of the early feedback of folks trying out sparse-checkouts at
$dayjob is that sparse checkouts can sometimes be disorienting; users
can forget that they had a sparse-checkout and then wonder where files
went. Add some output to 'git status' in the form of a simple line that
states:
You are in a sparse checkout with 35% of files present.
where, obviously, the exact figure changes depending on what percentage
of files from the index do not have the SKIP_WORKTREE bit set.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use of negative pathspec, while collecting paths including
untracked ones in the working tree, was broken.
* en/do-match-pathspec-fix:
dir: fix treatment of negated pathspecs
The behaviour of "sparse-checkout" in the state "git clone
--no-checkout" left was changed accidentally in 2.27, which has
been corrected.
* en/sparse-checkout:
sparse-checkout: avoid staging deletions of all files
The reflog entries for "git clone" and "git fetch" did not
anonymize the URL they operated on.
* js/reflog-anonymize-for-clone-and-fetch:
clone/fetch: anonymize URLs in the reflog
Reduce memory usage during "diff --quiet" in a worktree with too
many stat-unmatched paths.
* jk/diff-memuse-optim-with-stat-unmatch:
diff: discard blob data from stat-unmatched pairs
In an earlier patch, multiple struct acccesses to `graph_pos` and
`generation` were auto-converted to multiple method calls.
Since the values are fixed and commit-slab access costly, we would be
better off with storing the values as a local variable and reusing it.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We remove members `graph_pos` and `generation` from the struct commit.
The default assignments in init_commit_node() are no longer valid,
which is fine as the slab helpers return appropriate default values and
the assignments are removed.
We will replace existing use of commit->generation and commit->graph_pos
by commit_graph_data_slab helpers using
`contrib/coccinelle/commit.cocci'.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The struct commit is used in many contexts. However, members
`generation` and `graph_pos` are only used for commit-graph related
operations and otherwise waste memory.
This wastage would have been more pronounced as we transition to
generation number v2, which uses 64-bit generation number instead of
current 32-bits.
As they are often accessed together, let's introduce struct
commit_graph_data and move them to a commit_graph_data slab.
While the overall test suite runs just as fast as master,
(series: 26m48s, master: 27m34s, faster by 2.87%), certain commands
like `git merge-base --is-ancestor` were slowed by 40% as discovered
by Szeder Gábor [1]. After minimizing commit-slab access, the slow down
persists but is closer to 20%.
Derrick Stolee believes the slow down is attributable to the underlying
algorithm rather than the slowness of commit-slab access [2] and we will
follow-up in a later series.
[1]: https://lore.kernel.org/git/20200607195347.GA8232@szeder.dev/
[2]: https://lore.kernel.org/git/13db757a-9412-7f1e-805c-8a028c4ab2b1@gmail.com/
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
14ba97f8 (alloc: allow arbitrary repositories for alloc functions,
2018-05-15) introduced parsed_object_pool->commit_count to keep count of
commits per repository and was used to assign commit->index.
However, commit-slab code requires commit->index values to be unique
and a global count would be correct, rather than a per-repo count.
Let's introduce a static counter variable, `parsed_commits_count` to
keep track of parsed commits so far.
As commit_count has no use anymore, let's also drop it from the struct.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repo_is_descendant_of() method is aware of the existence of the
commit-graph file. It checks for generation_numbers_enabled() before
deciding on using can_all_from_reach() or repo_in_merge_bases()
depending on the situation. The reason here is that can_all_from_reach()
uses a depth-first search that is limited by the minimum generation
number of the target commits, and that algorithm can be very slow when
generation numbers are not present. The alternative uses
paint_down_to_common() which will walk the entire merge-base boundary,
which is typically slower.
This method is used by commands like "git tag --contains" and "git
branch --contains" for very fast results when a commit-graph file
exists. Unfortunately, it is _not_ used in commands like "git merge-base
--is-ancestor" which is doing an even simpler request.
This issue was raised recently [1] with respect to a change to how
generation numbers are stored, but was also reported much earlier [2]
before commit-reach.c existed to simplify these reachability queries.
[1] https://lore.kernel.org/git/20200607195347.GA8232@szeder.dev/
[2] https://lore.kernel.org/git/87608bawoa.fsf@evledraar.gmail.com/
The root cause is that builtin/merge-base.c has a method
handle_is_ancestor() that calls in_merge_bases(), an older version of
repo_in_merge_bases(). It would be better if we have every caller to
in_merge_bases() use the logic in can_all_from_reach() when possible.
This is where things get a little tricky: repo_is_descendant_of() calls
repo_in_merge_bases() in the non-generation numbers enabled case! If we
simply update repo_in_merge_bases() to call repo_is_descendant_of()
instead of repo_in_merge_bases_many(), then we will get a recursive call
loop. Thankfully, this is caught by the test suite in the default mode
(i.e. GIT_TEST_COMMIT_GRAPH=0).
The trick, then, is to make the non-generation number case for
repo_is_descendant_of() call repo_in_merge_bases_many() directly,
skipping the non-_many version. This allows us to take advantage of this
faster code path, when possible.
The easiest way to measure the performance impact is to test the
following command on the Linux kernel repository:
git merge-base --is-ancestor <A> <B>
| A | B | Time Before | Time After |
|------|------|-------------|------------|
| v3.0 | v5.7 | 0.459s | 0.028s |
| v4.0 | v5.7 | 0.267s | 0.021s |
| v5.0 | v5.7 | 0.074s | 0.013s |
Note that each of these samples return success. The old code performed
the same operation when <A> and <B> are swapped. However,
can_all_from_reach() will return immediately if the generation numbers
show that <A> has larger generation number than <B>. Thus, the time for
the swapped case is universally 0.004s in each case.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The next change will make repo_in_merge_bases() depend on the logic in
is_descendant_of(), but we need to make the method independent of
the_repository first.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git branch` accepts `--edit-description` in conjunction with other
arguments. However, `--edit-description` is its own mode, similar to
`--set-upstream-to`, which is also made mutually exclusive with other
modes. Prevent `--edit-description` from being mixed with other modes.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the "--set-upstream-to" and "--unset-upstream" tests, specific error
conditions are being tested. However, there is no way of ensuring that a
test case is failing because of some specific error.
Check stderr of failing commands to ensure that they are failing in the
expected way.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clean up style of test by changing some filenames from "expected" to
"expect", which follows typical test convention.
Also, change a space-indent into a tab-indent.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6b1db43109 ("clean: teach clean -d to preserve ignored paths",
2017-05-23) added the following code block (among others) to git-clean:
if (remove_directories)
dir.flags |= DIR_SHOW_IGNORED_TOO | DIR_KEEP_UNTRACKED_CONTENTS;
The reason for these flags is well documented in the commit message, but
isn't obvious just from looking at the code. Add some explanations to
the code to make it clearer.
Further, it appears git-2.26 did not correctly handle this combination
of flags from git-clean. With both these flags and without
DIR_SHOW_IGNORED_TOO_MODE_MATCHING set, git is supposed to recurse into
all untracked AND ignored directories. git-2.26.0 clearly was not doing
that. I don't know the full reasons for that or whether git < 2.27.0
had additional unknown bugs because of that misbehavior, because I don't
feel it's worth digging into. As per the huge changes and craziness
documented in commit 8d92fb2927 ("dir: replace exponential algorithm
with a linear one", 2020-04-01), the old algorithm was a mess and was
thrown out. What I can say is that git-2.27.0 correctly recurses into
untracked AND ignored directories with that combination.
However, in clean's case we don't need to recurse into ignored
directories; that is just a waste of time. Thus, when git-2.27.0
started correctly handling those flags, we got a performance regression
report. Rather than relying on other bugs in fill_directory()'s former
logic to provide the behavior of skipping ignored directories, make use
of the DIR_SHOW_IGNORED_TOO_MODE_MATCHING value specifically added in
commit eec0f7f2b7 ("status: add option to show ignored files
differently", 2017-10-30) for this purpose.
Reported-by: Brian Malehorn <bmalehorn@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I spent a long time trying to figure out how and whether the code worked
with different values of ignore, ignore_only, and remove_directories.
After lots of time setting up lots of testcases, sifting through lots of
print statements, and walking through the debugger, I finally realized
that one piece of code related to how it was all setup was found in
clean.c rather than dir.c. Make a change that would have made it easier
for me to do the extra testing by putting this handling in one spot.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.h documented quite clearly that DIR_SHOW_IGNORED and
DIR_SHOW_IGNORED_TOO are mutually exclusive, with a big comment to this
effect by the definition of both enum values. However, a command like
git clean -fx $DIR
would set both values for dir.flags. I _think_ it happened to work
because:
* As dir.h points out, DIR_KEEP_UNTRACKED_CONTENTS only takes effect
if DIR_SHOW_IGNORED_TOO is set.
* As coded, I believe DIR_SHOW_IGNORED would just happen to take
precedence over DIR_SHOW_IGNORED_TOO in the code as currently
constructed.
Which is a long way of saying "we just got lucky".
Fix clean.c to avoid setting these mutually exclusive values at the same
time, and add a check to dir.c that will throw a BUG() to prevent anyone
else from making this mistake.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ignoring the sparse-checkout feature momentarily, if one has a submodule and
creates local branches within it with unpushed changes and maybe adds some
untracked files to it, then we would want to avoid accidentally removing such
a submodule. So, for example with git.git, if you run
git checkout v2.13.0
then the sha1collisiondetection/ submodule is NOT removed even though it
did not exist as a submodule until v2.14.0. Similarly, if you only had
v2.13.0 checked out previously and ran
git checkout v2.14.0
the sha1collisiondetection/ submodule would NOT be automatically
initialized despite being part of v2.14.0. In both cases, git requires
submodules to be initialized or deinitialized separately. Further, we
also have special handling for submodules in other commands such as
clean, which requires two --force flags to delete untracked submodules,
and some commands have a --recurse-submodules flag.
sparse-checkout is very similar to checkout, as evidenced by the similar
name -- it adds and removes files from the working copy. However, for
the same avoid-data-loss reasons we do not want to remove a submodule
from the working copy with checkout, we do not want to do it with
sparse-checkout either. So submodules need to be separately initialized
or deinitialized; changing sparse-checkout rules should not
automatically trigger the removal or vivification of submodules.
I believe the previous wording in git-sparse-checkout.txt about
submodules was only about this particular issue. Unfortunately, the
previous wording could be interpreted to imply that submodules should be
considered active regardless of sparsity patterns. Update the wording
to avoid making such an implication. It may be helpful to consider two
example situations where the differences in wording become important:
In the future, we want users to be able to run commands like
git clone --sparse=moduleA --recurse-submodules $REPO_URL
and have sparsity paths automatically set up and have submodules *within
the sparsity paths* be automatically initialized. We do not want all
submodules in any path to be automatically initialized with that
command.
Similarly, we want to be able to do things like
git -c sparse.restrictCmds grep --recurse-submodules $REV $PATTERN
and search through $REV for $PATTERN within the recorded sparsity
patterns. We want it to recurse into submodules within those sparsity
patterns, but do not want to recurse into directories that do not match
the sparsity patterns in search of a possible submodule.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Preliminary clean-ups around refs API, plus file format
specification documentation for the reftable backend.
* hn/refs-cleanup:
reftable: define version 2 of the spec to accomodate SHA256
reftable: clarify how empty tables should be written
reftable: file format documentation
refs: improve documentation for ref iterator
t: use update-ref and show-ref to reading/writing refs
refs.h: clarify reflog iteration order
Since all invocations of test_submodule_forced_switch() are git
commands, automatically prepend "git" before invoking
test_submodule_switch_common().
Similarly, many invocations of test_submodule_switch() are also git
commands so automatically prepend "git" before invoking
test_submodule_switch_common() as well.
Finally, for invocations of test_submodule_switch() that invoke a custom
function, rename the old function to test_submodule_switch_func().
This is necessary because in a future commit, we will be adding some
logic that needs to distinguish between an invocation of a plain git
comamnd and an invocation of a test helper function.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document the usage for producing combined commits with "git diff".
This includes updating the synopsis section.
While here, add the three-dot notation to the synopsis.
Make "git diff -h" print the same usage summary as the manual
page synopsis, minus the "A..B" form, which is now discouraged.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git diff is given a symmetric difference A...B, it chooses
some merge base from the two specified commits (as documented).
This fails, however, if there is *no* merge base: instead, you
see the differences between A and B, which is certainly not what
is expected.
Moreover, if additional revisions are specified on the command
line ("git diff A...B C"), the results get a bit weird:
* If there is a symmetric difference merge base, this is used
as the left side of the diff. The last final ref is used as
the right side.
* If there is no merge base, the symmetric status is completely
lost. We will produce a combined diff instead.
Similar weirdness occurs if you use, e.g., "git diff C A...B D".
Likewise, using multiple two-dot ranges, or tossing extra
revision specifiers into the command line with two-dot ranges,
or mixing two and three dot ranges, all produce nonsense.
To avoid all this, add a routine to catch the range cases and
verify that that the arguments make sense. As a side effect,
produce a warning showing *which* merge base is being used when
there are multiple choices; die if there is no merge base.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As 'upload-pack.c' is now using 'struct upload_pack_data'
thoroughly, let's refactor some common code into a new
do_got_oid() function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'oldest_have' static variable
into this struct.
It is used by both protocol v0 and protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to got_oid(), so that
this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to ok_to_give_up(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_acks(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to process_haves(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's change allow_unadvertised_object_request,
which is now part of 'upload_pack_data', from an 'unsigned int'
to an enum.
This will make it clear which values this variable can take.
While at it let's change this variable name to 'allow_uor' to
make it shorter.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_unadvertised_object_request'
static variable into this struct.
It is used by code common to protocol v0 and protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'extra_edge_obj' static variable
into this struct.
It is used by code common to protocol v0 and protocol v2.
While at it let's properly initialize and clear 'extra_edge_obj'
in the appropriate 'upload_pack_data' initialization and
clearing functions.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'shallow_nr' static variable
into this struct.
It is used by code common to protocol v0 and protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_unshallow(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to deepen_by_rev_list(),
so that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to deepen(), so that
this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_shallow_list(),
so that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach upload-pack to send part of its packfile response as URIs.
An administrator may configure a repository with one or more
"uploadpack.blobpackfileuri" lines, each line containing an OID, a pack
hash, and a URI. A client may configure fetch.uriprotocols to be a
comma-separated list of protocols that it is willing to use to fetch
additional packfiles - this list will be sent to the server. Whenever an
object with one of those OIDs would appear in the packfile transmitted
by upload-pack, the server may exclude that object, and instead send the
URI. The client will then download the packs referred to by those URIs
before performing the connectivity check.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever a fetch results in a packfile being downloaded, a .keep file is
generated, so that the packfile can be preserved (from, say, a running
"git repack") until refs are written referring to the contents of the
packfile.
In a subsequent patch, a successful fetch using protocol v2 may result
in more than one .keep file being generated. Therefore, teach
fetch_pack() and the transport mechanism to support multiple .keep
files.
Implementation notes:
- builtin/fetch-pack.c normally does not generate .keep files, and thus
is unaffected by this or future changes. However, it has an
undocumented "--lock-pack" feature, used by remote-curl.c when
implementing the "fetch" remote helper command. In keeping with the
remote helper protocol, only one "lock" line will ever be written;
the rest will result in warnings to stderr. However, in practice,
warnings will never be written because the remote-curl.c "fetch" is
only used for protocol v0/v1 (which will not generate multiple .keep
files). (Protocol v2 uses the "stateless-connect" command, not the
"fetch" command.)
- connected.c has an optimization in that connectivity checks on a ref
need not be done if the target object is in a pack known to be
self-contained and connected. If there are multiple packfiles, this
optimization can no longer be done.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Subsequent patches will change how the output of pack-objects is
processed, so extract that processing into its own function.
Currently, at most 1 character can be buffered (in the "buffered" local
variable). One of those patches will require a larger buffer, so replace
that "buffered" local variable with a buffer array.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current C Git implementation expects Git servers to follow a
specific order of sections when transmitting protocol v2 responses, but
this is not explicit in the documentation. Make the order explicit.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach http-fetch the ability to download packfiles directly, given a
URL, and to verify them.
The http_pack_request suite has been augmented with a function that
takes a URL directly. With this function, the hash is only used to
determine the name of the temporary file.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_main() in http-fetch.c will grow in a future patch, so refactor the
HTTP walking part into its own function.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
finish_http_pack_request() does multiple tasks, including some
housekeeping on a struct packed_git - (1) closing its index, (2)
removing it from a list, and (3) installing it. These concerns are
independent of fetching a pack through HTTP: they are there only because
(1) the calling code opens the pack's index before deciding to fetch it,
(2) the calling code maintains a list of packfiles that can be fetched,
and (3) the calling code fetches it in order to make use of its objects
in the same process.
In preparation for a subsequent commit, which adds a feature that does
not need any of this housekeeping, remove (1), (2), and (3) from
finish_http_pack_request(). (2) and (3) are now done by a helper
function, and (1) is the responsibility of the caller (in this patch,
done closer to the point where the pack index is opened).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git fetches a pack using dumb HTTP, (among other things) it invokes
index-pack on a ".pack.temp" packfile, specifying the filename as an
argument.
A future commit will require the aforementioned invocation of index-pack
to also generate a "keep" file. To use this, we either have to use
index-pack's naming convention (because --keep requires the pack's
filename to end with ".pack") or to pass the pack through stdin. Of the
two, it is simpler to pass the pack through stdin.
Thus, teach http to pass --stdin to index-pack. As a bonus, the code is
now simpler.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add" takes special care to avoid creating a new worktree
at a location already registered to an existing worktree even if that
worktree is missing (which can happen, for instance, if the worktree
resides on removable media). "git worktree move", however, is not so
careful when validating the destination location and will happily move
the source worktree atop the location of a missing worktree. This leads
to the anomalous situation of multiple worktrees being associated with
the same path, which is expressly forbidden by design. For example:
$ git clone foo.git
$ cd foo
$ git worktree add ../bar
$ git worktree add ../baz
$ rm -rf ../bar
$ git worktree move ../baz ../bar
$ git worktree list
.../foo beefd00f [master]
.../bar beefd00f [bar]
.../bar beefd00f [baz]
$ git worktree remove ../bar
fatal: validation failed, cannot remove working tree:
'.../bar' does not point back to '.git/worktrees/bar'
Fix this shortcoming by enhancing "git worktree move" to perform the
same additional validation of the destination directory as done by "git
worktree add".
While at it, add a test to verify that "git worktree move" won't move a
worktree atop an existing (non-worktree) path -- a restriction which has
always been in place but was never tested.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add" checks that the specified path is a valid location
for a new worktree by ensuring that the path does not already exist and
is not already registered to another worktree (a path can be registered
but missing, for instance, if it resides on removable media). Since "git
worktree add" is not the only command which should perform such
validation ("git worktree move" ought to also), generalize the the
validation function for use by other callers, as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree prune" detects when multiple entries are associated with
the same path and prunes the duplicates, however, it does not detect
when a linked worktree points at the path of the main worktree.
Although "git worktree add" disallows creating a new worktree with the
same path as the main worktree, such a case can arise outside the
control of Git even without the user mucking with .git/worktree/<id>/
administrative files. For instance:
$ git clone foo.git
$ git -C foo worktree add ../bar
$ rm -rf bar
$ mv foo bar
$ git -C bar worktree list
.../bar deadfeeb [master]
.../bar deadfeeb [bar]
Help the user recover from such corruption by extending "git worktree
prune" to also detect when a linked worktree is associated with the path
of the main worktree.
Reported-by: Jonathan Müller <jonathanmueller.dev@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A fundamental restriction of linked working trees is that there must
only ever be a single worktree associated with a particular path, thus
"git worktree add" explicitly disallows creation of a new worktree at
the same location as an existing registered worktree. Nevertheless,
users can still "shoot themselves in the foot" by mucking with
administrative files in .git/worktree/<id>/. Worse, "git worktree move"
is careless[1] and allows a worktree to be moved atop a registered but
missing worktree (which can happen, for instance, if the worktree is on
removable media). For instance:
$ git clone foo.git
$ cd foo
$ git worktree add ../bar
$ git worktree add ../baz
$ rm -rf ../bar
$ git worktree move ../baz ../bar
$ git worktree list
.../foo beefd00f [master]
.../bar beefd00f [bar]
.../bar beefd00f [baz]
Help users recover from this form of corruption by teaching "git
worktree prune" to detect when multiple worktrees are associated with
the same path.
[1]: A subsequent commit will fix "git worktree move" validation to be
more strict.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level logic for removing a worktree is well encapsulated in
delete_git_dir(). However, high-level details related to pruning a
worktree -- such as dealing with verbosity and dry-run mode -- are not
encapsulated. Factor out this high-level logic into its own function so
it can be re-used as new worktree corruption detectors are added.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Readers of the name prune_worktree() are likely to expect the function
to actually prune a worktree, however, it only answers the question
"should this worktree be pruned?". Give it a name more reflective of its
true purpose to avoid such confusion.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The autosquash-and-exec test used "git diff HEAD^!" to mean
"git diff HEAD^ HEAD". Use these directly instead of relying
on the undefined but actual-current behavior of "HEAD^!".
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Version appends a hash ID to the file header, making it slightly larger.
This commit also changes "SHA-1" into "object ID" in many places.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The format allows for some ambiguity, as a lone footer also starts
with a valid file header. However, the current JGit code will barf on
this. This commit codifies this behavior into the standard.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shawn Pearce explains:
Some repositories contain a lot of references (e.g. android at 866k,
rails at 31k). The reftable format provides:
- Near constant time lookup for any single reference, even when the
repository is cold and not in process or kernel cache.
- Near constant time verification if a SHA-1 is referred to by at least
one reference (for allow-tip-sha1-in-want).
- Efficient lookup of an entire namespace, such as `refs/tags/`.
- Support atomic push `O(size_of_update)` operations.
- Combine reflog storage with ref storage.
This file format spec was originally written in July, 2017 by Shawn
Pearce. Some refinements since then were made by Shawn and by Han-Wen
Nienhuys based on experiences implementing and experimenting with the
format. (All of this was in the context of our work at Google and
Google is happy to contribute the result to the Git project.)
Imported from JGit[1]'s current version (c217d33ff,
"Documentation/technical/reftable: improve repo layout", 2020-02-04)
of Documentation/technical/reftable.md and converted to asciidoc by
running
pandoc -t asciidoc -f markdown reftable.md >reftable.txt
using pandoc 2.2.1. The result required the following additional
minor changes:
- removed the [TOC] directive to add a table of contents, since
asciidoc does not support it
- replaced git-scm.com/docs links with linkgit: directives that link
to other pages within Git's documentation
[1] https://eclipse.googlesource.com/jgit/jgit
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite support for GIT_CURL_VERBOSE in terms of GIT_TRACE_CURL.
Looking good.
* jt/curl-verbose-on-trace-curl:
http, imap-send: stop using CURLOPT_VERBOSE
t5551: test that GIT_TRACE_CURL redacts password
Code clean-up.
* cc/upload-pack-data:
upload-pack: use upload_pack_data fields in receive_needs()
upload-pack: pass upload_pack_data to create_pack_file()
upload-pack: remove static variable 'stateless_rpc'
upload-pack: pass upload_pack_data to check_non_tip()
upload-pack: pass upload_pack_data to send_ref()
upload-pack: move symref to upload_pack_data
upload-pack: use upload_pack_data writer in receive_needs()
upload-pack: pass upload_pack_data to receive_needs()
upload-pack: pass upload_pack_data to get_common_commits()
upload-pack: use 'struct upload_pack_data' in upload_pack()
upload-pack: move 'struct upload_pack_data' around
upload-pack: move {want,have}_obj to upload_pack_data
upload-pack: remove unused 'wants' from upload_pack_data
The code to parse "git bisect start" command line was lax in
validating the arguments.
* cb/bisect-helper-parser-fix:
bisect--helper: avoid segfault with bad syntax in `start --term-*`
On-the-wire protocol v2 easily falls into a deadlock between the
remote-curl helper and the fetch-pack process when the server side
prematurely throws an error and disconnects. The communication has
been updated to make it more robust.
* dl/remote-curl-deadlock-fix:
stateless-connect: send response end packet
pkt-line: define PACKET_READ_RESPONSE_END
remote-curl: error on incomplete packet
pkt-line: extern packet_length()
transport: extract common fetch_pack() call
remote-curl: remove label indentation
remote-curl: fix typo
Code simplification and test coverage enhancement.
* bc/filter-process:
t2060: add a test for switch with --orphan and --discard-changes
builtin/checkout: simplify metadata initialization
The command line completion script (in contrib/) tried to complete
"git stash -p" as if it were "git stash push -p", but it was too
aggressive and also affected "git stash show -p", which has been
corrected.
* vs/complete-stash-show-p-fix:
completion: don't override given stash subcommand with -p
The check in "git fsck" to ensure that the tree objects are sorted
still had corner cases it missed unsorted entries.
* rs/fsck-duplicate-names-in-trees:
fsck: detect more in-tree d/f conflicts
t1450: demonstrate undetected in-tree d/f conflict
t1450: increase test coverage of in-tree d/f detection
fsck: fix a typo in a comment
"git bugreport" learns to report what shell is in use.
* es/bugreport-shell:
bugreport: include user interactive shell
help: add shell-path to --build-options
As FreeBSD is not the only platform whose regexp library reports
a REG_ILLSEQ error when fed invalid UTF-8, add logic to detect that
automatically and skip the affected tests.
* cb/t4210-illseq-auto-detect:
t4210: detect REG_ILLSEQ dynamically and skip affected tests
t/helper: teach test-regex to report pattern errors (like REG_ILLSEQ)
"git log -L..." now takes advantage of the "which paths are touched
by this commit?" info stored in the commit-graph system.
* ds/line-log-on-bloom:
line-log: integrate with changed-path Bloom filters
line-log: try to use generation number-based topo-ordering
line-log: more responsive, incremental 'git log -L'
t4211-line-log: add tests for parent oids
line-log: remove unused fields from 'struct line_log_data'
While the MyFirstContribution guide exists and has received some use and
positive reviews, it is still not as discoverable as it could be. Add a
reference to it from the GitHub pull request template, where many
brand-new contributors may look. Also add a reference to it in
SubmittingPatches, which is the central source of guidance for patch
contribution.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For each worktree removed by "git worktree prune", it reports the reason
for the removal. All reasons share the common prefix "Removing
worktrees/%s:". As new removal reasons are added, this prefix needs to
be duplicated, which is error-prone and potentially cumbersome.
Therefore, factor out the common prefix.
Although this change seems to increase the "sentence lego quotient", it
should be reasonably safe, as the reason for removal is a distinct
clause, not strictly related to the prefix. Moreover, the "worktrees" in
"Removing worktrees/%s:" is a path literal which ought not be localized,
so by factoring it out, we can more easily avoid exposing that path
fragment to translators.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 0b4396f068 (git-p4: make python2.7 the oldest supported version,
2019-12-13), git-p4 was updated to only support 2.7 and newer. Since
Python 2.6 is pretty much ancient history, update CodingGuidelines to
show that 2.7 is the oldest version supported.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 48a8c26c62 (Documentation: avoid poor-man's small caps GIT,
2013-01-21), the documentation was amended to spell Git's name as Git
when talking about the system as a whole. However, t/README was skipped
over when the treatment was applied.
Bring t/README into conformance with the CodingGuidelines by casing
"Git" properly.
While we're at it, fix a small typo. Change "the git internal" to "the
Git internals".
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the provided free_commit_graph() to properly free the commit graph
in fuzz-commit-graph. Otherwise, the fuzzer itself leaks memory when the
struct contains pointers to allocated memory.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In trace output (when GIT_TRACE_CURL is true), redact the values of all
HTTP cookies by default. Now that auth headers (since the implementation
of GIT_TRACE_CURL in 74c682d3c6 ("http.c: implement the GIT_TRACE_CURL
environment variable", 2016-05-24)) and cookie values (since this
commit) are redacted by default in these traces, also allow the user to
inhibit these redactions through an environment variable.
Since values of all cookies are now redacted by default,
GIT_REDACT_COOKIES (which previously allowed users to select individual
cookies to redact) now has no effect.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
do_match_pathspec() started life as match_pathspec_depth_1() and for
correctness was only supposed to be called from match_pathspec_depth().
match_pathspec_depth() was later renamed to match_pathspec(), so the
invariant we expect today is that do_match_pathspec() has no direct
callers outside of match_pathspec().
Unfortunately, this intention was lost with the renames of the two
functions, and additional calls to do_match_pathspec() were added in
commits 75a6315f74 ("ls-files: add pathspec matching for submodules",
2016-10-07) and 89a1f4aaf7 ("dir: if our pathspec might match files
under a dir, recurse into it", 2019-09-17). Of course,
do_match_pathspec() had an important advantge over match_pathspec() --
match_pathspec() would hardcode flags to one of two values, and these
new callers needed to pass some other value for flags. Also, although
calling do_match_pathspec() directly was incorrect, there likely wasn't
any difference in the observable end output, because the bug just meant
that fill_diretory() would recurse into unneeded directories. Since
subsequent does-this-path-match checks on individual paths under the
directory would cause those extra paths to be filtered out, the only
difference from using the wrong function was unnecessary computation.
The second of those bad calls to do_match_pathspec() was involved -- via
either direct movement or via copying+editing -- into a number of later
refactors. See commits 777b420347 ("dir: synchronize
treat_leading_path() and read_directory_recursive()", 2019-12-19),
8d92fb2927 ("dir: replace exponential algorithm with a linear one",
2020-04-01), and 95c11ecc73 ("Fix error-prone fill_directory() API; make
it only return matches", 2020-04-01). The last of those introduced the
usage of do_match_pathspec() on an individual file, and thus resulted in
individual paths being returned that shouldn't be.
The problem with calling do_match_pathspec() instead of match_pathspec()
is that any negated patterns such as ':!unwanted_path` will be ignored.
Add a new match_pathspec_with_flags() function to fulfill the needs of
specifying special flags while still correctly checking negated
patterns, add a big comment above do_match_pathspec() to prevent others
from misusing it, and correct current callers of do_match_pathspec() to
instead use either match_pathspec() or match_pathspec_with_flags().
One final note is that DO_MATCH_LEADING_PATHSPEC needs special
consideration when working with DO_MATCH_EXCLUDE. The point of
DO_MATCH_LEADING_PATHSPEC is that if we have a pathspec like
*/Makefile
and we are checking a directory path like
src/module/component
that we want to consider it a match so that we recurse into the
directory because it _might_ have a file named Makefile somewhere below.
However, when we are using an exclusion pattern, i.e. we have a pathspec
like
:(exclude)*/Makefile
we do NOT want to say that a directory path like
src/module/component
is a (negative) match. While there *might* be a file named 'Makefile'
somewhere below that directory, there could also be other files and we
cannot pre-emptively rule all the files under that directory out; we
need to recurse and then check individual files. Adjust the
DO_MATCH_LEADING_PATHSPEC logic to only get activated for positive
pathspecs.
Reported-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, extensions were recognized regardless of repository format
version. If the user sets an undefined "extensions" value on a
repository of version 0 and that value is used by a future git version,
they might get an undesired result.
Because all extensions now also upgrade repository versions, tightening
the check would help avoid this for future extensions.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'extensions' configuration variable gets special meaning in the new
repository version, so when enabling the extension we should upgrade the
repository to version 1.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Retroactively adding a filter can be useful for existing shallow clones as
they allow users to see earlier change histories without downloading all
git objects in a regular --unshallow fetch.
Without this patch, users can make a clone partial by editing the
repository configuration to convert the remote into a promisor, like:
git config core.repositoryFormatVersion 1
git config extensions.partialClone origin
git fetch --unshallow --filter=blob:none origin
Since the hard part of making this work is already in place and such
edits can be error-prone, teach Git to perform the required configuration
change automatically instead.
Note that this change does not modify the existing git behavior which
recognizes setting extensions.partialClone without changing
repositoryFormatVersion.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In version 1 of repository format, "extensions" gained special meaning
and it is safer to avoid upgrading when there are pre-existing
extensions.
Make list-objects-filter to use the helper function instead of setting
repository version directly as a prerequisite of exposing the upgrade
capability.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sparse-checkout's purpose is to update the working tree to have it
reflect a subset of the tracked files. As such, it shouldn't be
switching branches, making commits, downloading or uploading data, or
staging or unstaging changes. Other than updating the worktree, the
only thing sparse-checkout should touch is the SKIP_WORKTREE bit of the
index. In particular, this sets up a nice invariant: running
sparse-checkout will never change the status of any file in `git status`
(reflecting the fact that we only set the SKIP_WORKTREE bit if the file
is safe to delete, i.e. if the file is unmodified).
Traditionally, we did a _really_ bad job with this goal. The
predecessor to sparse-checkout involved manual editing of
.git/info/sparse-checkout and running `git read-tree -mu HEAD`. That
command would stage and unstage changes and overwrite dirty changes in
the working tree.
The initial implementation of the sparse-checkout command was no better;
it simply invoked `git read-tree -mu HEAD` as a subprocess and had the
same caveats, though this issue came up repeatedly in review comments
and workarounds for the problems were put in place before the feature
was merged[1, 2, 3, 4, 5, 6; especially see 4 & 6].
[1] https://lore.kernel.org/git/CABPp-BFT9A5n=_bx5LsjCvbogqwSjiwgr5amcjgbU1iAk4KLJg@mail.gmail.com/
[2] https://lore.kernel.org/git/CABPp-BEmwSwg4tgJg6nVG8a3Hpn_g-=ZjApZF4EiJO+qVgu4uw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFV7TA0qwZCQpHCqx9N+JifyRyuBQ-pZ_oGfe-NOgyh7A@mail.gmail.com/
[4] https://lore.kernel.org/git/CABPp-BHYCCD+Vx5fq35jH82eHc1-P53Lz_aGNpHJNcx9kg2K-A@mail.gmail.com/
[5] https://lore.kernel.org/git/CABPp-BF+JWYZfDqp2Tn4AEKVp4b0YMA=Mbz4Nz62D-gGgiduYQ@mail.gmail.com/
[6] https://lore.kernel.org/git/20191121163706.GV23183@szeder.dev/
However, these workarounds, in addition to disabling the feature in a
number of important cases, also missed one special case. I'll get back
to it later.
In the 2.27.0 cycle, the disabling of the feature was lifted by finally
replacing the internal equivalent of `git read-tree -mu HEAD` with
something that did what we wanted: the new update_sparsity() function in
unpack-trees.c that only ever updates SKIP_WORKTREE bits in the index
and updates the working tree to match. This new function handles all
the cases that were problematic for the old implementation, except that
it breaks the same special case that avoided the workarounds of the old
implementation, but broke it in a different way.
So...that brings us to the special case: a git clone performed with
--no-checkout. As per the meaning of the flag, --no-checkout does not
check out any branch, with the implication that you aren't on one and
need to switch to one after the clone. Implementationally, HEAD is
still set (so in some sense you are partially on a branch), but
* the index is "unborn" (non-existent)
* there are no files in the working tree (other than .git/)
* the next time git switch (or git checkout) is run it will run
unpack_trees with `initial_checkout` flag set to true.
It is not until you run, e.g. `git switch <somebranch>` that the index
will be written and files in the working tree populated.
With this special --no-checkout case, the traditional `read-tree -mu
HEAD` behavior would have done the equivalent of acting like checkout --
switch to the default branch (HEAD), write out an index that matches
HEAD, and update the working tree to match. This special case slipped
through the avoid-making-changes checks in the original sparse-checkout
command and thus continued there.
After update_sparsity() was introduced and used (see commit f56f31af03
("sparse-checkout: use new update_sparsity() function", 2020-03-27)),
the behavior for the --no-checkout case changed: Due to git's
auto-vivification of an empty in-memory index (see do_read_index() and
note that `must_exist` is false), and due to sparse-checkout's
update_working_directory() code to always write out the index after it
was done, we got a new bug. That made it so that sparse-checkout would
switch the repository from a clone with an "unborn" index (i.e. still
needing an initial_checkout), to one that had a recorded index with no
entries. Thus, instead of all the files appearing deleted in `git
status` being known to git as a special artifact of not yet being on a
branch, our recording of an empty index made it suddenly look to git as
though it was definitely on a branch with ALL files staged for deletion!
A subsequent checkout or switch then had to contend with the fact that
it wasn't on an initial_checkout but had a bunch of staged deletions.
Make sure that sparse-checkout changes nothing in the index other than
the SKIP_WORKTREE bit; in particular, when the index is unborn we do not
have any branch checked out so there is no sparsification or
de-sparsification work to do. Simply return from
update_working_directory() early.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 897d68e7af (Makefile: use curl-config --cflags, 2020-03-26), we
taught the build process to use `curl-config --cflags` to make sure that
it can find cURL's headers.
In the MSVC build, this is completely bogus because we're running in a
Git for Windows SDK whose `curl-config` supports the _GCC_ build.
Let's just ignore each and every `-I<path>` option where `<path>` points
to GCC/Clang specific headers.
Reported by Jeff Hostetler in
https://github.com/microsoft/git/issues/275.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even if we strongly discourage putting credentials into the URLs passed
via the command-line, there _is_ support for that, and users _do_ do
that.
Let's scrub them before writing them to the reflog.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'pack_objects_hook' static
variable into this struct.
It is used by code common to protocol v0 and protocol v2.
While at it let's also free() it in upload_pack_data_clear().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_sideband_all' static
variable into this struct.
It is used only by protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_ref_in_want' static
variable into this struct.
It is used only by protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_filter' static variable
into this struct.
It is used by both protocol v0 and protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'keepalive' static variable
into this struct.
It is used by code common to protocol v0 and protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to upload_pack_config(),
so that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
that are set in upload_pack_config() into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's take this opportunity to change the
'multi_ack' variable, which is now part of 'upload_pack_data',
to an enum.
This will make it clear which values this variable can take.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the multi_ack static variable into
this struct.
It is only used by protocol v0 code since protocol v2 assumes
certain baseline capabilities, but rolling it into
upload_pack_data and just letting v2 code ignore it as it does
now is more coherent and cleaner.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the filter_capability_requested
static variable into this struct.
It is only used by protocol v0 code since protocol v2 assumes
certain baseline capabilities, but rolling it into
upload_pack_data and just letting v2 code ignore it as it does
now is more coherent and cleaner.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'use_sideband' static variable
into this struct.
This variable is used by both v0 and v2 protocols.
While at it, let's update the comment near the variable
definition.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'no_done', 'daemon_mode' and
'timeout' variables into this struct.
They are only used by protocol v0 code since protocol v2 assumes
certain baseline capabilities, but rolling them into
upload_pack_data and just letting v2 code ignore them as it does
now is more coherent and cleaner.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's annotate fields from this struct to let
people know which ones are used only for protocol v0 and which
ones only for protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's actually start using some bitfields of
that struct. These bitfields were introduced in 3145ea957d
("upload-pack: introduce fetch server command", 2018-03-15), but
were never used.
We could instead have just removed the following bitfields
from the struct:
unsigned use_thin_pack : 1;
unsigned use_ofs_delta : 1;
unsigned no_progress : 1;
unsigned use_include_tag : 1;
but using them makes it possible to remove a number of static
variables with the same name and purpose from 'upload-pack.c'.
This is a behavior change, as we accidentally used to let values
in those bitfields propagate from one v2 "fetch" command to
another for ssh/git/file connections (but not for http). That's
fixing a bug, but one nobody is likely to see, because it would
imply the client sending different capabilities for each request.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following lines were not covered in a recent line-coverage test
against Git:
builtin/commit-graph.c
5b6653e5 244) progress = start_delayed_progress(
5b6653e5 268) stop_progress(&progress);
These statements are executed when both '--stdin-commits' and
'--progress' are passed. Introduce a trio of tests that exercise various
combinations of these options to ensure that these lines are covered.
More importantly, this is exercising a (somewhat) previously-ignored
feature of '--stdin-commits', which is that it respects '--progress'.
Prior to 5b6653e523 (builtin/commit-graph.c: dereference tags in
builtin, 2020-05-13), dereferencing input from '--stdin-commits' was
done inside of commit-graph.c.
Now that an additional progress meter may be generated from outside of
commit-graph.c, add a corresponding test to make sure that it also
respects '--[no]-progress'.
The other location that generates progress meter output (from d335ce8f24
(commit-graph.c: show progress of finding reachable commits,
2020-05-13)) is already covered by any test that passes '--reachable'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A handful of tests in t5318 use 'test_line_count = 0 ...' to make sure
that some command does not write any output. While correct, it is more
idiomatic to use 'test_must_be_empty' instead. Switch the former
invocations to use the latter instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some repositories in the wild have commits that record nonsense
committer timezone (e.g. rails.git); "git fast-import" learned an
option to pass these nonsense timestamps intact to allow recreating
existing repositories as-is.
* en/fast-import-looser-date:
fast-import: add new --date-format=raw-permissive format
The error message from "git checkout -b foo -t bar baz" was
confusing.
* rs/checkout-b-track-error:
checkout: improve error messages for -b with extra argument
checkout: add tests for -b and --track
We've adopted a convention that any on-stack structure can be
initialized to have zero values in all fields with "= { 0 }", even
when the first field happens to be a pointer, but sparse complained
that a null pointer should be spelled NULL for a long time. Start
using -Wno-universal-initializer option to squelch it.
* lo/sparse-universal-zero-init:
sparse: allow '{ 0 }' to be used without warnings
"feature.experimental" configuration variable is to let volunteers
easily opt into a set of newer features, which use of the v2
transport protocol is now a part of.
* jn/experimental-opts-into-proto-v2:
config: let feature.experimental imply protocol.version=2
The "--prepare-p4-only" option is supposed to stop after replaying
one changeset, but kept going (by mistake?)
* bk/p4-prepare-p4-only-fix:
git-p4.py: fix --prepare-p4-only error with multiple commits
When performing a tree-level diff against the working tree, we may find
that our index stat information is dirty, so we queue a filepair to be
examined later. If the actual content hasn't changed, we call this a
stat-unmatch; the stat information was out of date, but there's no
actual diff. Normally diffcore_std() would detect and remove these
identical filepairs via diffcore_skip_stat_unmatch(). However, when
"--quiet" is used, we want to stop the diff as soon as we see any
changes, so we check for stat-unmatches immediately in diff_change().
That check may require us to actually load the file contents into the
pair of diff_filespecs. If we find that the pair isn't a stat-unmatch,
then no big deal; we'd likely load the contents later anyway to generate
a patch, do rename detection, etc, so we want to hold on to it. But if
it is a stat-unmatch, then we have no more use for that data; the whole
point is that we're going discard the pair. However, we never free the
allocated diff_filespec data.
In most cases, keeping that data isn't a problem. We don't expect a lot
of stat-unmatch entries, and since we're using --quiet, we'd quit as
soon as we saw such a real change anyway. However, there are extreme
cases where it makes a big difference:
1. We'd generally mmap() the working tree half of the pair. And since
the OS may limit the total number of maps, we can run afoul of this
in large repositories. E.g.:
$ cd linux
$ git ls-files | wc -l
67959
$ sysctl vm.max_map_count
vm.max_map_count = 65530
$ git ls-files | xargs touch ;# everything is stat-dirty!
$ git diff --quiet
fatal: mmap failed: Cannot allocate memory
It should be unusual to have so many files stat-dirty, but it's
possible if you've just run a script like "sed -i" or similar.
After this patch, the above correctly exits with code 0.
2. Even if you don't hit mmap limits, the index half of the pair will
have been pulled from the object database into heap memory. Again
in a clone of linux.git, running:
$ git ls-files | head -n 10000 | xargs touch
$ git diff --quiet
peaks at 145MB heap before this patch, and 94MB after.
This patch solves the problem by freeing any diff_filespec data we
picked up during the "--quiet" stat-unmatch check in diff_changes.
Nobody is going to need that data later, so there's no point holding on
to it. There are a few things to note:
- we could skip queueing the pair entirely, which could in theory save
a little work. But there's not much to save, as we need a
diff_filepair to feed to diff_filespec_check_stat_unmatch() anyway.
And since we cache the result of the stat-unmatch checks, a later
call to diffcore_skip_stat_unmatch() call will quickly skip over
them. The diffcore code also counts up the number of stat-unmatched
pairs as it removes them. It's doubtful any callers would care about
that in combination with --quiet, but we'd have to reimplement the
logic here to be on the safe side. So it's not really worth the
trouble.
- I didn't write a test, because we always produce the correct output
unless we run up against system mmap limits, which are both
unportable and expensive to test against. Measuring peak heap
would be interesting, but our perf suite isn't yet capable of that.
- note that diff without "--quiet" does not suffer from the same
problem. In diffcore_skip_stat_unmatch(), we detect the stat-unmatch
entries and drop them immediately, so we're not carrying their data
around.
- you _can_ still trigger the mmap limit problem if you truly have
that many files with actual changes. But it's rather unlikely. The
stat-unmatch check avoids loading the file contents if the sizes
don't match, so you'd need a pretty trivial change in every single
file. Likewise, inexact rename detection might load the data for
many files all at once. But you'd need not just 64k changes, but
that many deletions and additions. The most likely candidate is
perhaps break-detection, which would load the data for all pairs and
keep it around for the content-level diff. But again, you'd need 64k
actually changed files in the first place.
So it's still possible to trigger this case, but it seems like "I
accidentally made all my files stat-dirty" is the most likely case
in the real world.
Reported-by: Jan Christoph Uhde <Jan@UhdeJc.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are multiple repositories in the wild with random, invalid
timezones. Most notably is a commit from rails.git with a timezone of
"+051800"[1]. A few searches will find other repos with that same
invalid timezone as well. Further, Peff reports that GitHub relaxed
their fsck checks in August 2011 to accept any timezone value[2], and
there have been multiple reports to filter-repo about fast-import
crashing while trying to import their existing repositories since they
had timezone values such as "-7349423" and "-43455309"[3].
The existing check on timezone values inside fast-import may prove
useful for people who are crafting fast-import input by hand or with a
new script. For them, the check may help them avoid accidentally
recording invalid dates. (Note that this check is rather simplistic and
there are still several forms of invalid dates that fast-import does not
check for: dates in the future, timezone values with minutes that are
not divisible by 15, and timezone values with minutes that are 60 or
greater.) While this simple check may have some value for those users,
other users or tools will want to import existing repositories as-is.
Provide a --date-format=raw-permissive format that will not error out on
these otherwise invalid timezones so that such existing repositories can
be imported.
[1] 4cf94979c9
[2] https://lore.kernel.org/git/20200521195513.GA1542632@coredump.intra.peff.net/
[3] https://github.com/newren/git-filter-repo/issues/88
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of github.com:ruester/git-po-de:
l10n: de.po: Fix typo in the German translation of octopus
l10n: de.po: Update German translation for Git 2.27.0
f1e3df3169 (t: increase test coverage of signature verification output,
2020-03-04) adds GPG dependent tests to t4202 and t6200 that were found
problematic with at least OpenBSD 6.7.
Using an escaped '|' for alternations works only in some implementations
of grep (e.g. GNU and busybox).
It is not part of POSIX[1] and not supported by some BSD, macOS, and
possibly other POSIX compatible implementations.
Use `grep -E`, and write it using extended regular expression.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03
Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --orphan option is used to create a local branch which is detached
from the current history. In git switch, it always resets to the empty
tree, and thus the only completion we can provide is a branch name.
Follow the same rules for -c/-C (and -b/-B) when completing the argument
to --orphan.
In the case of git switch, after we complete the argument, there is
nothing more we can complete for git switch, so do not even try. Nothing
else would be valid.
In the case of git checkout, --orphan takes a start point which it uses
to determine the checked out tree, even though it created orphaned
history.
Update the previously added test cases as they are now passing.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous commit added several test cases highlighting the subpar
completion logic for -c/-C and -b/-B when completing git switch and git
checkout.
In order to distinguish completing the argument vs the start-point for
this option, we now use the wordlist to determine the previous full word
on the command line.
If it's -c or -C (-b/-B for checkout), then we know that we are
completing the argument for the branch name.
Given that a user who already knows the branch name they want to
complete will simply not use completion, it makes sense to complete the
small subset of local branches when completing the argument for -c/-C.
In all other cases, if -c/-C are on the command line but are not the
most recent option, then we must be completing a start-point, and should
allow completing against all references.
Update the -c/-C and -b/-B tests to indicate they now pass.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Current completion for the --track option of git switch and git checkout
is sub par. In addition to the DWIM logic of a bare branch name, --track
has DWIM logic to convert specified remote/branch names into a local
branch tracking that remote. For example
$git switch --track origin/master
This will create a local branch name master, that tracks the master
branch of the origin remote.
In fact, git switch --track on its own will not accept other forms of
references. These must instead be specified manually via the -c/-C/-b/-B
options.
Introduce __git_remote_heads() and the "remote-heads" mode for
__git_complete_refs. Use this when the --track option is provided while
completing in _git_switch and _git_checkout. Just as in the --detach
case, we never enable DWIM mode for --track, because it doesn't make
sense.
It should be noted that completion support is still a bit sub par when
it comes to handling -c/-C and --orphan. This will be resolved in
a future change.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like git switch, we should not complete DWIM remote branch names
if --detach has been specified. To avoid this, refactor _git_checkout in
a similar way to _git_switch.
Note that we don't simply clear dwim_opt when we find -d or --detach, as
we will be adding other modes and checks, making this flow easier to
follow.
Update the previously failing tests to show that the breakage has been
resolved.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new --mode option to __git_complete_refs, which allows changing
the behavior to call __git_heads instead of __git_refs.
By passing --mode=heads, __git_complete_refs will only output local
branches. This enables using "--mode=heads --dwim" to enable listing
local branches and the remote unique branch names for DWIM.
Refactor completion support to use the new mode option, rather than
calling __git_heads directly. This has the advantage that we can now
correctly allow local branches along with suitable DWIM refs, rather
than only allowing DWIM when we complete all references.
Choose what mode it uses when calling __git_complete_refs. If -d or
--detach have been provided, then simply complete all refs, but
*without* the DWIM option as these DWIM names won't work properly in
--detach mode.
Otherwise, call __git_complete_refs with the default dwim_opt value and
use the new "heads" mode.
In this way, the basic support for completing just "git switch <TAB>"
will result in only local branches and remote unique names for DWIM.
The basic no-options tests for git switch, as well as several of the
-c/-C tests now pass, so remove the known breakage tags.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new helper, __git_find_last_on_cmdline is introduced, similar to the
already existing __git_find_on_cmdline, but which operates in reverse,
finding the *last* matching word of the provided wordlist.
Use this in a new __git_checkout_default_dwim_mode() function that will
determine when to enable listing of DWIM remote branches.
The __git_find_last_on_cmdline() function is used to determine which
--guess or --no-guess is in effect. If either one is provided, then we
unconditionally enable or disable the DWIM mode based on the last
provided option.
If neither --guess nor --no-guess is provided, then we check for
--no-track, and finally for GIT_COMPLETION_CHECKOUT_NO_GUESS=1.
This function is then used in _git_switch and _git_checkout to improve
the handling for when we enable listing of these DWIM remote branches.
This new logic is more robust, as we will correctly identify superseded
options, and ensure that both _git_switch and _git_checkout enable DWIM
in similar ways.
We can now update a few tests to indicate they pass. A few of the tests
previously added to highlight issues with the old DWIM logic still fail.
This is because of a separate issue related to the default completion
behavior of git switch, which will be addressed in a future change.
Additionally, due to this change, a few tests for the -b/-B handling of
git checkout now fail. This is a minor regression, and will be fixed by
a following change that improves the overall handling of -b/-B. Mark
these tests as known breakages for now.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
__git_complete_refs is the main function used for completing references.
It is primarily used as a wrapper around __git_refs, and is easier to
extend since its arguments are option-like.
One major downside of __git_complete_refs and __git_refs currently, is
the lack of ability to complete only a subset of refs such as branches
(refs/heads) or tags (refs/tags).
Normally, a caller might just decide to use __git_heads() or
__git_tags(). However, in the case of git-switch, it is useful to
complete both branches *and* DWIM remote branch names.
Due to the complexity and implementation of __git_refs, it is not easy
to extend it to support listing only a subset of references.
Instead, we can extend __git_complete_refs to do this. For this to be
done, we must first ensure that "--dwim" support is not tied to calling
__git_refs.
Instead of passing $dwim into __git_refs, we can implement
a __gitcomp_direct_append function which can append to COMPREPLY after
a call to __gitcomp_direct.
If --dwim is passed to __git_complete_refs, use __gitcomp_direct_append
to add the output of __git_dwim_remote_heads to the completion list.
In this way, --dwim support is now independent of calling __git_refs.
A future change will add an additional option to control what set of
references __git_complete_refs will output.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
__git_refs() has the ability to report unique remote names for
supporting completion of remote branch names for the DWIMery of git
checkout and git switch.
For git checkout, this is fine, because it always supports completing
all local references.
However, git switch by default only supports either switching branches
or using this DWIMery to create a local branch tracking the remote
branch.
Future work to cleanup and improve completion support for git switch
will be aided if the remote branch names can be completed separately
from __git_refs.
Extract this logic to a function __git_dwim_remote_heads(), and use it
in __git_refs.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The __git_complete_refs uses the "--track" option to specify when to
enable listing of unique remote branches which are used by the DWIM
logic of git checkout and git switch.
Using the term '--track' here is confusing because the git commands
themselves have '--track' as an argument. Additionally, the completion
logic for _git_switch also checks for --track. Keeping the meaning of
track_opt and --track for __git_complete_refs straight from the --track
git switch and git checkout option is difficult when reading this code.
Use the option '--dwim' instead, indicating this is about enabling or
disabling logic related to DWIM mode. Also rename the local variable
track_opt to dwim_opt to further reduce the confusion when reading the
completion code for _git_switch.
Because it is plausible for users to have developed their own
completions which rely on __git_complete_ref, keep --track as a synonym
for --dwim, even though we no longer use it in any of the core git
completion logic. Add a comment explaining why it remains as an
alternative spelling for --dwim.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to -c/-C, --orphan takes an argument which is the branch name to
use. We ought to complete this branch name using similar rules as to how
we complete new branch names for -c/-C and -b/-B. Namely, limit the
total number of options provided by completing to the local branches.
Additionally, git switch --orphan does not take any start point and will
always create using the empty-tree. Thus, after the branch name is
completed, git switch --orphan should not complete any references.
Add test cases showing the expected behavior of --orphan, for both the
argument and starting point.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the branch creation argument for git switch or git checkout
(-c/-C or -b/-B), the commands switch to a different mode: `git switch
-c <branch> <some-referance>` means to create a branch named <branch> at
the commit referred to by <some-reference>.
When completing git switch or git checkout, it makes sense to complete
the branch name differently from the start point.
When completing a branch, one might consider that we do not have
anything worth completing. After all, a new branch must have an entirely
new name. Consider, however, that if a user names branches using some
similar scheme, they might wish to name a new branch by modifying the
name of an existing branch.
To avoid overloading completion for the argument, it seems reasonable to
complete only the local branch names and the valid "Do What I Mean"
remote branch names.
Add tests for the completion of the argument to -c/-C and -b/-B,
highlighting this preferred completion behavior.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the branch creation argument for git switch or git checkout,
-c/-C or -b/-B, the commands operate in a different mode: `git switch -c
<branch> <some-reference>` means to create a branch named <branch> at
the commit referred to by <some-reference>.
When completing the start-point, we ought to always complete all valid
references.
Add tests for the completion of the start-point to -c/-C and -b/-B.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the --track option is provided to git switch or git checkout, and
no branch is specified by -c or -b, git will interpret the tracking
branch to determine the local branch name to use. This "Do What I Mean"
logic is similar but distinct from the default DWIM logic of
interpreting a unique remote branch name as a request to create and
track that branch.
For example, `git switch --track origin/master` is interpreted as
a request to create a local branch named master that is tracking
origin/master.
The current completion for git checkout in this regard is only somewhat
poor:
$git checkout --track <TAB>
HEAD
master
matching-branch
matching-tag
other/branch-in-other
other/master-in-other
At least it still includes remote references. The clutter from including
all references isn't too bad.
However, git switch completion is terrible:
$git switch --track <TAB>
master
matching-branch
It only shows local branches, not even allowing any form of completion
of the remote references!
Add tests which highlight the expected behavior of completing --track on
its own.
Note that when -c/-C or -b/-B are provided we do expect completing more
references, but this will be discussed in a future change that addresses
these options specifically.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When completing words for git switch, the completion function correctly
disables the DWIM remote branch names when in the '--detach' mode. These
DWIM remote branch names will not work when the --detach option is
specified, so it does not make sense to complete them.
git checkout, however, does not disable the completion of DWIM remote
branch names in this case.
Add test cases for both git switch and git checkout showing the expected
behavior.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When provided with a single argument that is the name of a remote branch
that does not yet exist locally, both git switch and git checkout can
interpret this as a request to create a local branch that tracks that
remote branch. We call this behavior "Do What I Mean", or DWIM for
short.
To aid in using this DWIM, it makes sense for completion to list these
unique remote branch names when completing possible arguments for git
switch and git checkout. Indeed, both _git_checkout and _git_switch
implement support for completing such DWIM branch names.
In other words, in addition to the usual completions provided for git
switch, this "DWIM" logic means completion will include the names of
branches on remotes that are unique and thus there can be no ambiguity
of which remote to track when creating the local branch.
However, the DWIM logic is not always active. Many options, such as
--no-guess, --no-track, and --track disable this DWIM logic, as they
cause git switch and git checkout to behave in different modes.
Additionally, some completion users do not wish to have tab completion
include these remote names by default, and thus introduced
GIT_COMPLETION_CHECKOUT_NO_GUESS as an optional way to configure the
completion support to disable this feature of completion support.
For this reason, _git_checkout and _git_switch have many rules about
when to enable or disable completing of these remote refs. The two
commands follow similar but not identical rules.
Set aside the question of command modes that do not accept this DWIM
logic (--track, -c, --orphan, --detach) for now. Thinking just about the
main mode of git checkout and git switch, the following guidelines will
help explain the basic rules we ought to support when deciding whether
to list the remote branches for DWIM in completion.
1. if --guess is enabled, we should list DWIM remote branch names, even
if something else would disable it
2. if --no-guess, --no-track or GIT_COMPLETION_CHECKOUT_NO_GUESS=1,
then we should disable listing DWIM remote branch names.
3. Since the '--guess' option is a boolean option, a later --guess
should override --no-guess, and a later --no-guess should override
--guess.
Putting all of these together, add some tests that highlight the
expected behavior of this DWIM logic.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When provided with no options, git switch only allows switching between
branches. The one exception to this is the "Do What I Mean" logic that
allows a unique remote branch name to be interpreted as a request to
create a branch of the same name that is tracking that remote branch.
Unfortunately, the logic for the completion of git switch results in
completing not just branch names, but also pseudorefs like HEAD, tags,
and fully specified <remote>/<branch> references.
For example, we currently complete the following:
$git switch <TAB>
HEAD
branch-in-other
master
master-in-other
matching-branch
matching-tag
other/branch-in-other
other/master-in-other
Indeed, if one were to attempt to use git switch with some of these
provided options, git will reject the request:
$git switch HEAD
fatal: a branch is expected, got 'HEAD
$git switch matching-tag
fatal: a branch is expected, got tag 'matching-tag'
$git switch other/branch-in-other
fatal: a branch is expected, got remote branch 'other/branch-in-other'
Ideally, git switch without options ought to complete only words which
will be accepted. Without options, this means to list local branch names
and the unique remote branch names without their remote name pre-pended.
$git switch <TAB>
branch-in-other
master
master-in-other
matching-branch
Add a test case that highlights this subpar completion. Also add
a similar test for git checkout completion that shows that due to the
complex nature of git checkout, it must complete all references.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When clearing the builtin operations on re-sourcing in the ZSH case we
can use the native ${parameters} associative array keys values to get
the currently `__gitcomp_builtin_*` operations using pattern matching
instead of using sed.
As also stated in commit 94408dc7, introducing this change the usage of
sed has some overhead implications, while ZSH can do this check just
using its native syntax.
Signed-off-by: Marco Trevisan (Treviño) <mail@3v1n0.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original patch selection code was written for `git add -p`, and the
fundamental unit on which it works is a hunk.
We hacked around that to handle deletions back in 24ab81ae4d
(add-interactive: handle deletion of empty files, 2009-10-27). But `git
add -p` would never see a new file, since we only consider the set of
tracked files in the index.
However, since the same machinery was used for `git checkout -p` &
friends, we can see new files.
Handle this case specifically, adding a new prompt for it that is
modeled after the `deleted file` case.
This also fixes the problem where added _empty_ files could not be
staged via `git checkout -p`.
Reported-by: Merlin Büge <toni@bluenox07.de>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ls-remote may or may not operate within a repository, and as such will
not have been initialized with the repository's hash algorithm. Even if
it were, the remote side could be using a different algorithm and we
would still want to display those refs properly. Find the hash
algorithm used by the remote side by querying the transport object and
set our hash algorithm accordingly.
Without this change, if the remote side is using SHA-256, we truncate
the refs to 40 hex characters, since that's the length of the default
hash algorithm (SHA-1).
Note that technically this is not a correct setting of the repository
hash algorithm since, if we are in a repository, it might be one of a
different hash algorithm from the remote side. However, our current
code paths don't handle multiple algorithms and won't for some time, so
this is the best we can do. We rely on the fact that ls-remote never
modifies the current repository, which is a reasonable assumption to
make.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test has hard-coded pkt-lines with object IDs. The pkt-line
lengths necessarily differ between hash algorithms, so generate these
lines with the packetize helper so they're always the right size. In
addition, we will require an object-format capability for SHA-256, so
pass that capability on to the upload-pack process.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to communicate the protocol supported by the server side, add
support for advertising the object-format capability. We check that the
client side sends us an identical algorithm if it sends us its own
object-format capability, and assume it speaks SHA-1 if not.
In the test, when we're using an algorithm other than SHA-1, we need to
specify the algorithm in use so we don't get a failure with an "unknown
format" message. Add a test that we handle a mismatched algorithm.
Remove the test_oid_init call since it's no longer necessary.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using protocol v2, we need to know what hash algorithm is used by
the remote end. See if the server has sent us an object-format
capability, and if so, use it to determine the hash algorithm in use and
set that value in the packet reader. Parse the refs using this
algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're parsing refs, we need to know not only what the line we're
parsing is, but also the hash algorithm we should use to parse it, which
is stored in the reader object. Pass the packet reader object through
to the protocol v2 ref parsing function.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using SHA-256, we need to take advantage of the extensions section
in the config file, so we need to use repository format version 1.
Update the test to look for the correct value.
Note that test_oid produces a value without a trailing newline, so use
echo to ensure we print a trailing newline to compare it correctly
against the actual results.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
show-index is capable of reading any possible index file whether or not
the index is inside a repository. However, because our index files lack
metadata about the hash algorithm in use, it's not possible to
autodetect the algorithm that a particular index file is using.
In order to allow us to read index files of any algorithm, let's set up
the .git directory gently so that we default to the algorithm for the
current repository, and add an --object-format option to allow users to
override this setting and continue to run show-index outside of a
repository altogether. Let's also document this new option so that
people can find it and use it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our style these days is to place the description and the opening quote
of the body on the same line as test_expect_success (if it fits), to
place the trailing quote on a line by itself after the body, and to use
tabs. Since we're going to be making several significant changes to
this test, modernize the style to aid in readability of the subsequent
patches.
This patch should have no functional change.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both v2 pack index files and the v3 format specified as part of the
NewHash work have similar data starting at the CRC table. Much of the
existing code wants to read either this table or the offset entries
following it, and in doing so computes the offset each time.
In order to share as much code between v2 and v3, compute the offset of
the CRC table and store it when the pack is opened. Use this value to
compute offsets to not only the CRC table, but to the offset entries
beyond it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the test assertions in this test checks that git branch -m works
even without a .git/config file. However, if the repository requires
configuration extensions, such as because it uses a non-SHA-1 algorithm,
this assertion will fail. Mark the assertion as requiring SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're checking the repository's format, set the hash algorithm at
the same time. This ensures that we perform a suitable initialization
early enough to avoid confusing any parts of the code. If we defer
until later, we can end up with portions of the code which are confused
about the hash algorithm, resulting in segfaults when working with
SHA-256 repositories.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parse the server's object-format capability and respond accordingly,
dying if there is a mismatch.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ensure that we pass the object-format capability in the synthesized test
data so that this test works with algorithms other than SHA-1.
In addition, add an additional test using the old data for when we're
using SHA-1 so that we can be sure that we preserve backwards
compatibility with servers not offering the object-format capability.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When performing a clone, we don't know what hash algorithm the other end
will support. Currently, we don't support fetching data belonging to a
different algorithm, so we must know what algorithm the remote side is
using in order to properly initialize the repository. We can know that
only after fetching the refs, so if the remote side has any references,
use that information to reinitialize the repository with the correct
hash algorithm information.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the object-format extensions that let us determine the hash
algorithm in use when pushing, pulling, and fetching.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the object-format extensions that let us determine the hash
algorithm in use when pushing or pulling data.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the remote helper docs to document the object-format extensions
we will implement in remote-curl and the transport helper code shortly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Detect when the server doesn't support our hash algorithm and abort.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we're fetching refs, detect the hash algorithm and parse the refs
using that algorithm.
As mentioned in the documentation, if multiple versions of the
object-format capability are provided, we use the first. No known
implementation supports multiple algorithms now, but they may in the
future.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Detect when the server doesn't support our hash algorithm and abort.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're going to be using this function in other files, so no longer mark
this function static.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Detect when the server doesn't support our hash algorithm and abort.
If the server does support our hash, advertise it as part of our
capabilities.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function, server_supports_hash, to see if the remote server
supports a particular hash algorithm when speaking protocol v1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When connecting to a remote system, we need to know what hash algorithm
it will be using to talk to us. Add a hash_algo member to struct
transport and add a function to read this data from the transport
object.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a member for the hash algorithm currently in use to the packet
reader so it can parse references correctly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
So far in protocol v2, all of our server capabilities that have values
have not had values that we've been interested in parsing. For example,
we receive but ignore the agent value.
However, in a future commit, we're going to want to parse out the value
of a server capability. To make this easy, add a function,
server_feature_v2, that can fetch the value provided as part of the
server capability.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a capability response, we can have multiple symref entries. In the
future, we will also allow for multiple hash algorithms to be specified.
To avoid duplication, expand the parse_feature_value function to take an
optional offset where the parsing should begin next time. Add a wrapper
function that allows us to query the next server feature value, and use
it in the existing symref parsing code.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Advertise the current hash algorithm in use by using the object-format
capability as part of the ref advertisement.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing capabilities for the pack protocol, there are times we'll
want to compare the value of a capability to a NUL-terminated string.
Since the data we're reading will be space-terminated, not
NUL-terminated, we need a function that compares the two strings, but
also checks that they're the same length. Otherwise, if we used strncmp
to compare these strings, we might accidentally accept a parameter that
was a prefix of the expected value.
Add a function, xstrncmpz, that takes a NUL-terminated string and a
non-NUL-terminated string, plus a length, and compares them, ensuring
that they are the same length.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future patch, we'll want to access multiple members from struct
packet_reader when parsing references. Therefore, have the ref parsing
code take pointers to struct reader instead of having to pass multiple
arguments to each function.
Rename the len variable to "linelen" to make it clearer what the
variable does in light of the variable change.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To set the default hash algorithm you can set the `GIT_DEFAULT_HASH`
environment variable. In the documentation this variable is named
`GIT_DEFAULT_HASH_ALGORITHM`, which is incorrect.
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The explanation of the `--show-pulls` option added in commit 8d049e182e
("revision: --show-pulls adds helpful merges", 2020-04-10) consists of
several paragraphs and we use "+" throughout to tie them together in one
long chain of list continuations. Only thing is, we're not in any kind
of list, so these pluses end up being rendered literally.
The preceding few paragraphs describe `--ancestry-path` and there we
*do* have a list, since we've started one with `--ancestry-path::`. In
fact, we have several such lists for all the various history-simplifying
options we're discussing earlier in this file.
Thus, we're missing a list both from a consistency point of view and
from a practical rendering standpoint.
Let's start a list for `--show-pulls` where we start actually discussing
the option, and keep the paragraphs preceding it out of that list. That
is, drop all those pluses before the new list we're adding here.
Helped-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When building with asciidoc-8.4.5 (as found on CentOS/Red Hat 6), the
period in the "[[files-in-.gitignore-are-tracked]]" anchor is not
properly parsed as a section:
WARNING: gitfaq.txt: line 245: missing [[files-in-.gitignore-are-tracked]] section
The resulting XML file fails to validate with xmlto:
xmlto: /git/Documentation/gitfaq.xml does not validate (status 3)
xmlto: Fix document syntax or use --skip-validation option
/git/Documentation/gitfaq.xml:3: element refentry: validity error :
Element refentry content does not follow the DTD, expecting
(beginpage? , indexterm* , refentryinfo? , refmeta? , (remark | link
| olink | ulink)* , refnamediv+ , refsynopsisdiv? , (refsect1+ |
refsection+)), got (refmeta refnamediv refsynopsisdiv refsect1
refsect1 refsect1 refsect1 variablelist refsect1 refsect1 )
Document /git/Documentation/gitfaq.xml does not validate
Let's avoid breaking users of platforms which ship an old version of
asciidoc, since the cost to do so is quite low.
Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various doc fixes.
* ma/doc-fixes:
git-sparse-checkout.txt: add missing '
git-credential.txt: use list continuation
git-commit-graph.txt: fix list rendering
git-commit-graph.txt: fix grammo
date-formats.txt: fix list continuation
In standard C, '{ 0 }' can be used as an universal zero-initializer.
However, Sparse complains if this is used on a type where the first
member (possibly nested) is a pointer since Sparse purposely wants
to warn when '0' is used to initialize a pointer type.
Legitimaly, it's desirable to be able to use '{ 0 }' as an idiom
without these warnings [1,2]. To allow this, an option have now
been added to Sparse:
537e3e2dae univ-init: conditionally accept { 0 } without warnings
So, add this option to the SPARSE_FLAGS variable.
Note: The option have just been added to Sparse. So, to benefit
now from this patch it's needed to use the latest Sparse
source from kernel.org. The option will simply be ignored
by older versions of Sparse.
[1] https://lore.kernel.org/r/e6796c60-a870-e761-3b07-b680f934c537@ramsayjones.plus.com
[2] https://lore.kernel.org/r/xmqqd07xem9l.fsf@gitster.c.googlers.com
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated before the transaction is
complete, remote-curl will blindly forward the packets before waiting on
more input from fetch-pack. Meanwhile, fetch-pack will read the
transaction and continue reading, expecting more input to continue the
transaction. This results in a deadlock between the two processes.
This can be seen in the following command which does not terminate:
$ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
Cloning into 'git'...
whereas the v1 version does terminate as expected:
$ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012
Cloning into 'git'...
fatal: the remote end hung up unexpectedly
Instead of blindly forwarding packets, make remote-curl insert a
response end packet after proxying the responses from the remote server
when using stateless_connect(). On the RPC client side, ensure that each
response ends as described.
A separate control packet is chosen because we need to be able to
differentiate between what the remote server sends and remote-curl's
control packets. By ensuring in the remote-curl code that a server
cannot send response end packets, we prevent a malicious server from
being able to perform a denial of service attack in which they spoof a
response end packet and cause the described deadlock to happen.
Reported-by: Force Charlie <charlieio@outlook.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we will use PACKET_READ_RESPONSE_END to separate
messages proxied by remote-curl. To prepare for this, add the
PACKET_READ_RESPONSE_END enum value.
In switch statements that need a case added, die() or BUG() when a
PACKET_READ_RESPONSE_END is unexpected. Otherwise, mirror how
PACKET_READ_DELIM is implemented (especially in cases where packets are
being forwarded).
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated with a partially written
packet, remote-curl will blindly send the partially written packet
before waiting on more input from fetch-pack. Meanwhile, fetch-pack will
read the partial packet and continue reading, expecting more input. This
results in a deadlock between the two processes.
For a stateless connection, inspect packets before sending them and
error out if a packet line packet is incomplete.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `diff.relative` boolean option set to `true` shows only changes in
the current directory/value specified by the `path` argument of the
`relative` option and shows pathnames relative to the aforementioned
directory.
Teach `--no-relative` to override earlier `--relative`
Add for git-format-patch(1) options documentation `--relative` and
`--no-relative`
Signed-off-by: Laurent Arnoud <laurent@spkdev.net>
Acked-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Printing a message directly to stdout could affect TAP processing
and is not really needed, as there is a standard way to skip all
tests that could be used instead, while printing an equivalent
message.
While at it; update the message to better reflect that since
a85efb5985 (t5608-clone-2gb.sh: turn GIT_TEST_CLONE_2GB into a bool,
2019-11-22), the enabling variable should be a recognized boolean
(ex: true, false, 1, 0) and get rid of the prerequisite that used
to guard all the tests, since "skip_all" is just much faster and
idempotent.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we try to create a branch "foo" based on "origin/master" and give
git commit -b an extra unsupported argument "bar", it confusingly
reports:
$ git checkout -b foo origin/master bar
fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
$ git checkout --track -b foo origin/master bar
fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
That's wrong, because it very well understands that "origin/master" is
supposed to be the start point for the new branch and not "bar". Check
if we got a commit and show more fitting messages in that case instead:
$ git checkout -b foo origin/master bar
fatal: Cannot update paths and switch to branch 'foo' at the same time.
$ git checkout --track -b foo origin/master bar
fatal: '--track' cannot be used with updating paths
Original-patch-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test git checkout -b with and without --track and demonstrate unexpected
error messages when it's given an extra (i.e. unsupported) path
argument. In both cases it reports:
$ git checkout -b foo origin/master bar
fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
The problem is that the start point we gave for the new branch is
"origin/master" and "bar" is just some extra argument -- it could even
be a valid commit, which would make the message even more confusing. We
have more fitting error messages in git commit, but get confused; use
the text of the rights ones in the tests.
Reported-by: Dana Dahlstrom <dahlstrom@google.com>
Original-test-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
06f5608c14 (bisect--helper: `bisect_start` shell function partially in C,
2019-01-02) adds a lax parser for `git bisect start` which could result
in a segfault under a bad syntax call for start with custom terms.
Detect if there are enough arguments left in the command line to use for
--term-{old,good,new,bad} and abort with the same syntax error the original
implementation will show if not.
While at it, remove an unnecessary (and incomplete) check for unknown
arguments and make sure to add a test to avoid regressions.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
df70b190 (completion: make stash -p and alias for stash push -p,
2018-04-20) wanted to make sure "git stash -p <TAB>" offers the same
completion as "git stash push -p <TAB>", but it did so by forcing the
$subcommand to be "push" whenever then "-p" option is found on the
command line.
This harms any subcommand that can take the "-p" option---even when the
subcommand is explicitly given, e.g. "git stash show -p", the code added
by the change would overwrite the $subcommand the user gave us.
Fix it by making sure that the defaulting to "push" happens only when
there is no $subcommand given yet.
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* https://github.com/prati0100/git-gui:
git-gui: Handle Ctrl + BS/Del in the commit msg
Subject: git-gui: fix syntax error because of missing semicolon
If the conflict candidate file name from the top of the stack is not a
prefix of the current candiate directory then we can discard it as no
matching directory can come up later. But we are not done checking the
candidate directory -- the stack might still hold a matching file name,
so stay in the loop and check the next candidate file name.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Exercise the case of putting a conflict candidate file name back on the
stack because a matching directory might yet come up later.
Do that by factoring out the test code into a function to allow for more
concise notation in the form of parameters indicating names of trees
(with trailing slash) and blobs (without trailing slash) in no
particular order (they are sorted by git mktree). Then add the new test
case as a second function call.
Fix a typo in the test title while at it ("dublicate").
Reported-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first four bytes of the line, the pkt-len, indicates the total
length of the pkt-line in hexadecimal. Fix wrong pkt-len headers of
some pkt-line messages in `http-protocol.txt` and `pack-protocol.txt`.
Reviewed-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Jiuyang Xie <jiuyang.xjy@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have several code paths in the checkout code which are traversed only
in this case, due to switch having different defaults from checkout.
Let's add a test that the combination of options works and produces the
expected behavior.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we call init_checkout_metadata in reset_tree, we want to pass the
object ID of the commit in question so that it can be passed to filters,
or if there is no commit, the tree. We anticipated this latter case,
which can occur elsewhere in the checkout code, but it cannot occur
here. The only case in which we do not have a commit object is when
invoking git switch with --orphan. Moreover, we can only hit this code
path without a commit object additionally with either --force or
--discard-changes.
In such a case, there is no point initializing the checkout metadata
with a commit or tree because (a) there is no commit, only the empty
tree, and (b) we will never use the data, since no files will be smudged
when checking out a branch with no files. Pass the all-zeros object ID
in this case, since we just need some value which is a valid pointer.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git 2.26 used protocol v2 as its default protocol, but soon after
release, users noticed that the protocol v2 negotiation code was prone
to fail when fetching from some remotes that are far ahead of others
(such as linux-next.git versus Linus's linux.git). That has been
fixed by 0b07eecf6e (Merge branch 'jt/v2-fetch-nego-fix',
2020-05-01), but to be cautious, we are using protocol v0 as the
default in 2.27 to buy some time for any other unanticipated issues to
surface.
To that end, let's ensure that users requesting the bleeding edge
using the feature.experimental flag *do* get protocol v2. This way,
we can gain experience with a wider audience for the new protocol
version and be more confident when it is time to enable it by default
for all users in some future Git version.
Implementation note: this isn't with the rest of the
feature.experimental options in repo-settings.c because those are tied
to a repository object, whereas this code path is used for operations
like "git ls-remote" that do not require a repository.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow deleting words backwards and forwards using Ctrl + Backspace and
Delete in the commit message buffer.
* il/ctrl-bs-del:
git-gui: Handle Ctrl + BS/Del in the commit msg
Document some of the flag options in refs_ref_iterator_begin, and explain how
ref_iterator_advance_fn should handle them.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reading and writing .git/refs/* assumes that refs are stored in the 'files'
ref backend.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
6c722cbe5a (bisect: allow CRLF line endings in "git bisect replay"
input, 2020-05-07) includes CR as a field separator, but relies on
it not being included in the last field, which breaks at least when
running under OpenBSD 6.7's sh.
Instead of just assume the CR will get swallowed, read the rest of
the line into an otherwise unused variable and ignore it everywhere
except on the call for git bisect start, where it matters.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'gitfaq.txt' was added in 2149b6748f (docs: add a FAQ, 2020-03-30),
it was added to the Makefile but not to command-list.txt.
Add it there also, so that the new FAQ is listed in the output of
`git help --guides`.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a BRE, that broke tests 30-32, 37-39, 42 at least with
OpenBSD 6.7; use a simpler ERE.
Fixes: d9f15d37f1 (pull: pass --autostash to merge, 2020-04-07)
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Seems to trigger a bug in at least OpenBSD's 6.7 sh where it is
interpreted as a history lookup and therefore fails 125-126, 128,
130.
Remove the subshell and get a space between ! and grep, so tests
pass successfully.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack-index was added to the data verified by git-fsck in
ea5ae6c3 "fsck: verify multi-pack-index". This implementation was
based on the implementation for verifying the commit-graph, and a
copy-paste error kept the ERROR_COMMIT_GRAPH flag as the bit set
when an error appears in the multi-pack-index.
Add a new flag, ERROR_MULTI_PACK_INDEX, and use that instead.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
95acf11a3d ("diff: restrict when prefetching occurs", 2020-04-07) taught
diff to prefetch blobs in a more limited set of situations. These
limited situations include when the output format requires blob data,
and when inexact rename detection is needed.
There is an existing test case that tests inexact rename detection, but
it also uses an output format that requires blob data, resulting in the
inexact-rename-detection-only code not being tested. Update this test to
use the raw output format, which does not require blob data.
Thanks to Derrick Stolee for noticing this lapse in code coverage and
for doing the preliminary analysis [1].
[1] https://lore.kernel.org/git/853759d3-97c3-241f-98e1-990883cd204e@gmail.com/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we will be manually processing packets and we will
need to access the length header. In order to simplify this, extern
packet_length() so that the logic can be reused.
Change the function parameter from `const char *linelen` to
`const char lenbuf_hex[4]`. Even though these two types behave
identically as function parameters, use the array notation to
semantically indicate exactly what this function is expecting as an
argument. Also, rename it from linelen to lenbuf_hex as the former
sounds like it should be an integral type which is misleading.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the switch statement, the difference between the `protocol_v2` and
`protocol_v{1,0}` arms is a preparatory call to die_if_server_options() in
the latter. The fetch_pack() call is identical in both arms. However,
since this fetch_pack() call has so many parameters, it is not
immediately obvious that the call is identical in both cases.
Extract the common fetch_pack() call out of the switch statement so that
code duplication is reduced and the logic is more clear for future
readers. While we're at it, rewrite the switch statement as an if-else
tower for increased clarity.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For a merge with a single strategy, the result of evaluate_result() is
effectively not used and therefore is not needed, so avoid altogether.
On Windows, this optimization can halve the time required to perform a
recursive merge of a single commit with the LLVM repo.
Signed-off-by: Andrew Ng <andrew.ng@sony.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On some platforms likes HP-UX, grep(1) doesn't understand "-a".
Let's switch to perl.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Where we explain the 'reapply' command, we don't properly wrap it in
single quote marks like we do with the other commands: We omit the
closing mark ("'reapply") and this ends up being rendered literally as
"'reapply". Add the missing "'".
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use list continuation to avoid the second and third paragraphs
rendering with a different indentation from the first one where we
describe the "url" attribute.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first list item follows immediately on the paragraph where we
introduce the list. This makes the "*" render literally as part of one
huge paragraph. (With AsciiDoc, everything is fine after that, but with
Asciidoctor, we get some minor follow-on errors.) Add an empty line --
with a list continuation ("+") -- to make the first list item render ok.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's easy to mix up the possessive "its" and "it's" ("it is"). Correct
an instance of this.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The blank line before the lone "+" means it isn't detected as a list
continuation, but instead renders literally, at least with AsciiDoc.
Drop the empty line and, while at it, add a closing period to the
preceding paragraph.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
7187c7bbb8 (t4210: skip i18n tests that don't work on FreeBSD, 2019-11-27)
adds a REG_ILLSEQ prerequisite, and to do that copies the common branch in
test-lib and expands it to include it in a special case for FreeBSD.
Instead; test for it using a previously added extension to test-tool and
use that, together with a function that identifies when regcomp/regexec
will be called with broken patterns to avoid any test that would otherwise
rely on undefined behaviour.
The description of the first test which wasn't accurate has been corrected,
and the test rearranged for clarity, including a helper function that avoids
overly long lines.
Only the affected engines will have their tests suppressed, also including
"fixed" if the PCRE optimization that uses LIBPCRE2 since b65abcafc7
(grep: use PCRE v2 for optimized fixed-string search, 2019-07-01) is not
available.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
7187c7bbb8 (t4210: skip i18n tests that don't work on FreeBSD, 2019-11-27)
adds a REG_ILLSEQ prerequisite to avoid failures from the tests added in
4e2443b181 (log tests: test regex backends in "--encode=<enc>" tests,
2019-06-28), but hardcodes it to be only enabled in FreeBSD.
Instead of hardcoding the affected platform, teach the test-regex helper,
how to validate a pattern and report back, so it can be used to detect the
same issue in other affected systems (like DragonFlyBSD or macOS).
While at it, refactor the tool so it can report back the source of the
errors it founds, and can be invoked also in a --silent mode, when needed,
for backward compatibility. A missing flag has been added and the code
reformatted, as well as updates to the way the parameters are handled, for
consistency.
To minimize changes, it is assumed the regcomp error is of the right type
since we control the only caller, and is also assumed to affect both basic
and extended syntax (only basic is tested, but both behave the same in all
three affected platforms since they use the same function).
Based-on-patch-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's use fields from this struct in
receive_needs(), instead of local variables with the same name
and purpose.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to create_pack_file(),
so that this function, and the function it calls, can use all
the fields of the struct.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's remove the 'stateless_rpc' static
variable, as we can now use the field of 'struct upload_pack_data'
with the same name instead.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to check_non_tip(), so
that this function and the functions it calls, can use all the
fields of the struct in followup commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_ref(), so that
this function, and the functions it calls, can use all the
fields of the struct in followup commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, we are passing around that struct to many
functions, so let's also pass 'struct string_list symref' around
at the same time by moving it from a local variable in
upload_pack() into a field of 'struct upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's use the 'struct packet_writer writer'
field from 'struct upload_pack_data' in receive_needs(),
instead of a local 'struct packet_writer writer' variable.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass 'struct upload_pack_data' to
receive_needs(), so that this function and the functions it
calls can use all the fields of that struct in followup commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass 'struct upload_pack_data' to
get_common_commits(), so that this function and the functions
it calls can use all the fields of that struct in followup
commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's use 'struct upload_pack_data' in
upload_pack().
This will make it possible in followup commits to remove a lot
of static variables and local variables that have the same name
and purpose as fields in 'struct upload_pack_data'. This will
also make upload_pack() work in a more similar way as
upload_pack_v2().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move 'struct upload_pack_data' and the
related upload_pack_data_init() and upload_pack_data_clear()
functions towards the beginning of the file, so that this struct
and its related functions can then be used by upload_pack() in a
followup commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the want_obj and have_obj object
arrays into 'struct upload_pack_data'.
These object arrays are used by both upload_pack() and
upload_pack_v2(), for example when these functions call
create_pack_file(). We are going to use
'struct upload_pack_data' in upload_pack() in a followup
commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's remove 'struct object_array wants' from
'struct upload_pack_data', as it appears to be unused.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 7c5c9b9c57 (commit-graph: error out on invalid commit oids in
'write --stdin-commits', 2019-08-05), the commit-graph builtin dies on
receiving non-commit OIDs as input to '--stdin-commits'.
This behavior can be cumbersome to work around in, say, the case of
piping 'git for-each-ref' to 'git commit-graph write --stdin-commits' if
the caller does not want to cull out non-commits themselves. In this
situation, it would be ideal if 'git commit-graph write' wrote the graph
containing the inputs that did pertain to commits, and silently ignored
the remainder of the input.
Some options have been proposed to the effect of '--[no-]check-oids'
which would allow callers to have the commit-graph builtin do just that.
After some discussion, it is difficult to imagine a caller who wouldn't
want to pass '--no-check-oids', suggesting that we should get rid of the
behavior of complaining about non-commit inputs altogether.
If callers do wish to retain this behavior, they can easily work around
this change by doing the following:
git for-each-ref --format='%(objectname) %(objecttype) %(*objecttype)' |
awk '
!/commit/ { print "not-a-commit:"$1 }
/commit/ { print $1 }
' |
git commit-graph write --stdin-commits
To make it so that valid OIDs that refer to non-existent objects are
indeed an error after loosening the error handling, perform an extra
lookup to make sure that object indeed exists before sending it to the
commit-graph internals.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the subsequent commit, we will introduce a dependency on
'graph_read_expect' from t5318.7. Preemptively move it below
'graph_read_expect()'s definition so that the test can call it.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous handful of commits, both 'git commit-graph write
--reachable' and '--stdin-commits' learned to peel tags down to the
commits which they refer to before passing them into the commit-graph
internals.
This makes the call to 'lookup_commit_reference_gently()' inside of
'fill_oids_from_commits()' a noop, since all OIDs are commits by that
point.
As such, remove the call entirely, as well as the progress meter, which
has been split and moved out to the callers in the aforementioned
earlier commits.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When given a list of commits, the commit-graph machinery calls
'lookup_commit_reference_gently()' on each element in the set and treats
the resulting set of OIDs as the base over which to close for
reachability.
In an earlier collection of commits, the 'git commit-graph write
--reachable' case made the inner-most call to
'lookup_commit_reference_gently()' by peeling references before they
were passed over to the commit-graph internals.
Do the analog for 'git commit-graph write --stdin-commits' by calling
'lookup_commit_reference_gently()' outside of the commit-graph
machinery, making the inner-most call a noop.
Since this may incur additional processing time, surround
'read_one_commit' with a progress meter to provide output to the caller.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With either '--stdin-commits' or '--stdin-packs', the commit-graph
builtin will read line-delimited input, and interpret it either as a
series of commit OIDs, or pack names.
In a subsequent commit, we will begin handling '--stdin-commits'
differently by processing each line as it comes in, instead of in one
shot at the end. To make adequate room for this additional logic, split
the '--stdin-commits' case from '--stdin-packs' by only storing the
input when '--stdin-packs' is given.
In the case of '--stdin-commits', feed each line to a new
'read_one_commit' helper, which (for now) will merely call
'parse_oid_hex'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the codebase, labels are aligned to the leftmost column. Remove the
space-indentation from `free_specs:` to conform to this.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From e76eec3554 (ci: allow per-branch config for GitHub Actions,
2020-05-07), we started to allow contributors decide which branch
they want to build with GitHub Actions
by checking for a file named "ci/config/allow-ref".
In order to assist those contributors,
we provided a sample in "ci/config/allow-refs.sample",
and instructed them to drop the ".sample",
then commit that file to their repository.
We've misspelt the filename in that change.
Let's fix the spelling.
While we're at it, also instruct our contributors introduce that new
file to Git before commit, in case of they've never told Git before.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On hppa these tests crash because the allocated stack space is too
small, even after it was doubled in b9a190789 (and the data size
doubled to match) to make it work on powerpc. For this arch just
skip these tests, which is enough to make the whole suite pass.
Fixes: https://bugs.debian.org/757402
Based-on-patch-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Greg Price <gnprice@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document a capability that indicates which hash algorithms are in use by
both sides of a remote connection. Use the term "object-format", since
this is the term used for the repository extension as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While iterating references (to discover the set of commits to write to
the commit-graph with 'git commit-graph write --reachable'),
'add_ref_to_set' can save 'fill_oids_from_commits()' some time by
peeling the references beforehand.
Move peeling out of 'fill_oids_from_commits()' and into
'add_ref_to_set()' to use 'peel_ref()' instead of 'deref_tag()'. Doing
so allows the commit-graph machinery to use the peeled value from
'$GIT_DIR/packed-refs' instead of having to load and parse tags.
While we're at it, discard non-commit objects reachable from ref tips.
This would be done automatically by 'fill_oids_from_commits()', but such
functionality will be removed in a subsequent patch after the call to
'lookup_commit_reference_gently' is dropped (at which point a non-commit
object in the commits oidset will become an error).
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git commit-graph write --reachable' is invoked, the commit-graph
machinery calls 'for_each_ref()' to discover the set of reachable
commits.
Right now the 'add_ref_to_set' callback is not doing anything other than
adding an OID to the set of known-reachable OIDs. In a subsequent
commit, 'add_ref_to_set' will presumptively peel references. This
operation should be fast for repositories with an up-to-date
'$GIT_DIR/packed-refs', but may be slow in the general case.
So that it doesn't appear that 'git commit-graph write' is idling with
'--reachable' in the slow case, add a progress meter to provide some
output in the meantime.
In general, we don't expect a progress meter to appear at all, since
peeling references with a 'packed-refs' file is quick. If it's slow and
we do show a progress meter, the subsequent 'fill_oids_from_commits()'
will be fast, since all of the calls to
'lookup_commit_reference_gently()' will be no-ops.
Both progress meters are delayed, so it is unlikely that more than one
will appear. In either case, this intermediate state will go away in a
handful of patches, at which point there will be at most one progress
meter.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pattern here looking for failures is specific to SHA-1. Let's
create a variable that matches the regex or glob pattern for a path
within the objects directory.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's possible a user may complain about the way that Git interacts with
their interactive shell, e.g. autocompletion or shell prompt. In that
case, it's useful for us to know which shell they're using
interactively.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It may be useful to know which shell Git was built to try to point to,
in the event that shell-based Git commands are failing. $SHELL_PATH is
set during the build and used to launch the manpage viewer, as well as
by git-compat-util.h, and it's used during tests. 'git version
--build-options' is encouraged for use in bug reports, so it makes sense
to include this information there.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using git p4 submit with the --prepare-p4-only option, the program
should prepare a single p4 changelist and notify the user that more
commits are pending and then stop processing.
A bug has been introduced by the p4-changelist hook feature that
causes the program to continue to try and process all pending
changelists at the same time.
The function applyCommit returns True when applying the commit
was successful and the program should continue. However, when the
optional flag --prepare-p4-only is set, the program should stop
after the first application.
Change the logic in the run method for P4Submit to check for the
flag --prepare-p4-only after successfully completing the applyCommit
method.
Be aware - this change will fix the existing test error in t9807.23
for --prepare-p4-only. However there is insufficent coverage for
this flag. If more than 1 commit is pending submission to P4, the
method will properly prepare the P4 changelist, however it will
still exit the application with an exitcode of 1.
The current documentation does not define what the exit code should be
in this condition.
(See: https://git-scm.com/docs/git-p4#Documentation/git-p4.txt---prepare-p4-only)
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Control+BackSpace: Delete word to the left of the cursor.
- Control+Delete : Delete word to the right of the cursor.
Originally introduced by BRIEF and Turbo Vision between 1985 and 1992,
they were adopted by most CUA-Compliant UIs, including those of: OS/2,
Windows, Mac OS, Qt, GTK, Open/Libre Office, Gecko, and GNU Emacs.
In both cases Tk already implements the functionality bound to other key
combination, so we use that.
Graphical examples:
Deleting to the left:
v------ pointer
X_WORD____X
^-----^------ selection
Deleting to the right:
v--------- pointer
X_WORD_X
^--^------ selection
Signed-off-by: Ismael Luceno <ismael.luceno@tttech-auto.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Whenever GIT_CURL_VERBOSE is set, teach Git to behave as if
GIT_TRACE_CURL=1 and GIT_TRACE_CURL_NO_DATA=1 is set, instead of setting
CURLOPT_VERBOSE.
This is to prevent inadvertent revelation of sensitive data. In
particular, GIT_CURL_VERBOSE redacts neither the "Authorization" header
nor any cookies specified by GIT_REDACT_COOKIES.
Unifying the tracing mechanism also has the future benefit that any
improvements to the tracing mechanism will benefit both users of
GIT_CURL_VERBOSE and GIT_TRACE_CURL, and we do not need to remember to
implement any improvement twice.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Verify that when GIT_TRACE_CURL is set, Git prints out "Authorization:
Basic <redacted>" instead of the base64-encoded authorization details.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous changes to the line-log machinery focused on making the
first result appear faster. This was achieved by no longer walking the
entire commit history before returning the early results. There is still
another way to improve the performance: walk most commits much faster.
Let's use the changed-path Bloom filters to reduce time spent computing
diffs.
Since the line-log computation requires opening blobs and checking the
content-diff, there is still a lot of necessary computation that cannot
be replaced with changed-path Bloom filters. The part that we can reduce
is most effective when checking the history of a file that is deep in
several directories and those directories are modified frequently. In
this case, the computation to check if a commit is TREESAME to its first
parent takes a large fraction of the time. That is ripe for improvement
with changed-path Bloom filters.
We must ensure that prepare_to_use_bloom_filters() is called in
revision.c so that the bloom_filter_settings are loaded into the struct
rev_info from the commit-graph. Of course, some cases are still
forbidden, but in the line-log case the pathspec is provided in a
different way than normal.
Since multiple paths and segments could be requested, we compute the
struct bloom_key data dynamically during the commit walk. This could
likely be improved, but adds code complexity that is not valuable at
this time.
There are two cases to care about: merge commits and "ordinary" commits.
Merge commits have multiple parents, but if we are TREESAME to our first
parent in every range, then pass the blame for all ranges to the first
parent. Ordinary commits have the same condition, but each is done
slightly differently in the process_ranges_[merge|ordinary]_commit()
methods. By checking if the changed-path Bloom filter can guarantee
TREESAME, we can avoid that tree-diff cost. If the filter says "probably
changed", then we need to run the tree-diff and then the blob-diff if
there was a real edit.
The Linux kernel repository is a good testing ground for the performance
improvements claimed here. There are two different cases to test. The
first is the "entire history" case, where we output the entire history
to /dev/null to see how long it would take to compute the full line-log
history. The second is the "first result" case, where we find how long
it takes to show the first value, which is an indicator of how quickly a
user would see responses when waiting at a terminal.
To test, I selected the paths that were changed most frequently in the
top 10,000 commits using this command (stolen from StackOverflow [1]):
git log --pretty=format: --name-only -n 10000 | sort | \
uniq -c | sort -rg | head -10
which results in
121 MAINTAINERS
63 fs/namei.c
60 arch/x86/kvm/cpuid.c
59 fs/io_uring.c
58 arch/x86/kvm/vmx/vmx.c
51 arch/x86/kvm/x86.c
45 arch/x86/kvm/svm.c
42 fs/btrfs/disk-io.c
42 Documentation/scsi/index.rst
(along with a bogus first result). It appears that the path
arch/x86/kvm/svm.c was renamed, so we ignore that entry. This leaves the
following results for the real command time:
| | Entire History | First Result |
| Path | Before | After | Before | After |
|------------------------------|--------|--------|--------|--------|
| MAINTAINERS | 4.26 s | 3.87 s | 0.41 s | 0.39 s |
| fs/namei.c | 1.99 s | 0.99 s | 0.42 s | 0.21 s |
| arch/x86/kvm/cpuid.c | 5.28 s | 1.12 s | 0.16 s | 0.09 s |
| fs/io_uring.c | 4.34 s | 0.99 s | 0.94 s | 0.27 s |
| arch/x86/kvm/vmx/vmx.c | 5.01 s | 1.34 s | 0.21 s | 0.12 s |
| arch/x86/kvm/x86.c | 2.24 s | 1.18 s | 0.21 s | 0.14 s |
| fs/btrfs/disk-io.c | 1.82 s | 1.01 s | 0.06 s | 0.05 s |
| Documentation/scsi/index.rst | 3.30 s | 0.89 s | 1.46 s | 0.03 s |
It is worth noting that the least speedup comes for the MAINTAINERS file
which is
* edited frequently,
* low in the directory heirarchy, and
* quite a large file.
All of those points lead to spending more time doing the blob diff and
less time doing the tree diff. Still, we see some improvement in that
case and significant improvement in other cases. A 2-4x speedup is
likely the more typical case as opposed to the small 5% change for that
file.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous patch made it possible to perform line-level filtering
during history traversal instead of in an expensive preprocessing
step, but it still requires some simpler preprocessing steps, notably
topo-ordering. However, nowadays we have commit-graphs storing
generation numbers, which make it possible to incrementally traverse
the history in topological order, without the preparatory limit_list()
and sort_in_topological_order() steps; see b45424181e (revision.c:
generation-based topo-order algorithm, 2018-11-01).
This patch combines the two, so we can do both the topo-ordering and
the line-level filtering during history traversal, eliminating even
those simpler preprocessing steps, and thus further reducing the delay
before showing the first commit modifying the given line range.
The 'revs->limited' flag plays the central role in this, because, due
to limitations of the current implementation, the generation
number-based topo-ordering is only enabled when this flag remains
unset. Line-level log, however, always sets this flag in
setup_revisions() ever since the feature was introduced in 12da1d1f6f
(Implement line-history search (git log -L), 2013-03-28). The reason
for setting 'limited' is unclear, though, because the line-level log
itself doesn't directly depend on it, and it doesn't affect how the
limit_list() function limits the revision range. However, there is an
indirect dependency: the line-level log requires topo-ordering, and
the "traditional" sort_in_topological_order() requires an already
limited commit list since e6c3505b44 (Make sure we generate the whole
commit list before trying to sort it topologically, 2005-07-06). The
new, generation numbers-based topo-ordering doesn't require a limited
commit list anymore.
So don't set 'revs->limited' for line-level log, unless it is really
necessary, namely:
- The user explicitly requested parent rewriting, because that is
still done in the line_log_filter() preprocessing step (see
previous patch), which requires sort_in_topological_order() and in
turn limit_list() as well.
- A commit-graph file is not available or it doesn't yet contain
generation numbers. In these cases we had to fall back on
sort_in_topological_order() and in turn limit_list(). The
existing condition with generation_numbers_enabled() has already
ensured that the 'limited' flag is set in these cases; this patch
just makes sure that the line-level log sets 'revs->topo_order'
before that condition.
While the reduced delay before showing the first commit is measurable
in git.git, it takes a bigger repository to make it clearly noticable.
In both cases below the line ranges were chosen so that they were
modified rather close to the starting revisions, so the effect of this
change is most noticable.
# git.git
$ time git --no-pager log -L:read_alternate_refs:sha1-file.c -1 v2.23.0
Before:
real 0m0.107s
user 0m0.091s
sys 0m0.013s
After:
real 0m0.058s
user 0m0.050s
sys 0m0.005s
# linux.git
$ time git --no-pager log \
-L:build_restore_work_registers:arch/mips/mm/tlbex.c -1 v5.2
Before:
real 0m1.129s
user 0m1.061s
sys 0m0.069s
After:
real 0m0.096s
user 0m0.087s
sys 0m0.009s
Additional testing by Derrick Stolee: Since this patch improves
the performance for the first result, I repeated the experiment
from the previous patch on the Linux kernel repository, reporting
real time here:
Command: git log -L 100,200:MAINTAINERS -n 1 >/dev/null
Before: 0.71 s
After: 0.05 s
Now, we have dropped the full topo-order of all ~910,000 commits
before reporting the first result. The remaining performance
improvements then are:
1. Update the parent-rewriting logic to be incremental similar to
how "git log --graph" behaves.
2. Use changed-path Bloom filters to reduce the time spend in the
tree-diff to see if the path(s) changed.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current line-level log implementation performs a preprocessing
step in prepare_revision_walk(), during which the line_log_filter()
function filters and rewrites history to keep only commits modifying
the given line range. This preprocessing affects both responsiveness
and correctness:
- Git doesn't produce any output during this preprocessing step.
Checking whether a commit modified the given line range is
somewhat expensive, so depending on the size of the given revision
range this preprocessing can result in a significant delay before
the first commit is shown.
- Limiting the number of displayed commits (e.g. 'git log -3 -L...')
doesn't limit the amount of work during preprocessing, because
that limit is applied during history traversal. Alas, by that
point this expensive preprocessing step has already churned
through the whole revision range to find all commits modifying the
revision range, even though only a few of them need to be shown.
- It rewrites parents, with no way to turn it off. Without the user
explicitly requesting parent rewriting any parent object ID shown
should be that of the immediate parent, just like in case of a
pathspec-limited history traversal without parent rewriting.
However, after that preprocessing step rewrote history, the
subsequent "regular" history traversal (i.e. get_revision() in a
loop) only sees commits modifying the given line range.
Consequently, it can only show the object ID of the last ancestor
that modified the given line range (which might happen to be the
immediate parent, but many-many times it isn't).
This patch addresses both the correctness and, at least for the common
case, the responsiveness issues by integrating line-level log
filtering into the regular revision walking machinery:
- Make process_ranges_arbitrary_commit(), the static function in
'line-log.c' deciding whether a commit modifies the given line
range, public by removing the static keyword and adding the
'line_log_' prefix, so it can be called from other parts of the
revision walking machinery.
- If the user didn't explicitly ask for parent rewriting (which, I
believe, is the most common case):
- Call this now-public function during regular history traversal,
namely from get_commit_action() to ignore any commits not
modifying the given line range.
Note that while this check is relatively expensive, it must be
performed before other, much cheaper conditions, because the
tracked line range must be adjusted even when the commit will
end up being ignored by other conditions.
- Skip the line_log_filter() call, i.e. the expensive
preprocessing step, in prepare_revision_walk(), because, thanks
to the above points, the revision walking machinery is now able
to filter out commits not modifying the given line range while
traversing history.
This way the regular history traversal sees the unmodified
history, and is therefore able to print the object ids of the
immediate parents of the listed commits. The eliminated
preprocessing step can greatly reduce the delay before the first
commit is shown, see the numbers below.
- However, if the user did explicitly ask for parent rewriting via
'--parents' or a similar option, then stick with the current
implementation for now, i.e. perform that expensive filtering and
history rewriting in the preprocessing step just like we did
before, leaving the initial delay as long as it was.
I tried to integrate line-level log filtering with parent rewriting
into the regular history traversal, but, unfortunately, several
subtleties resisted... :) Maybe someday we'll figure out how to do
that, but until then at least the simple and common (i.e. without
parent rewriting) 'git log -L:func:file' commands can benefit from the
reduced delay.
This change makes the failing 'parent oids without parent rewriting'
test in 't4211-line-log.sh' succeed.
The reduced delay is most noticable when there's a commit modifying
the line range near the tip of a large-ish revision range:
# no parent rewriting requested, no commit-graph present
$ time git --no-pager log -L:read_alternate_refs:sha1-file.c -1 v2.23.0
Before:
real 0m9.570s
user 0m9.494s
sys 0m0.076s
After:
real 0m0.718s
user 0m0.674s
sys 0m0.044s
A significant part of the remaining delay is spent reading and parsing
commit objects in limit_list(). With the help of the commit-graph we
can eliminate most of that reading and parsing overhead, so here are
the timing results of the same command as above, but this time using
the commit-graph:
Before:
real 0m8.874s
user 0m8.816s
sys 0m0.057s
After:
real 0m0.107s
user 0m0.091s
sys 0m0.013s
The next patch will further reduce the remaining delay.
To be clear: this patch doesn't actually optimize the line-level log,
but merely moves most of the work from the preprocessing step to the
history traversal, so the commits modifying the line range can be
shown as soon as they are processed, and the traversal can be
terminated as soon as the given number of commits are shown.
Consequently, listing the full history of a line range, potentially
all the way to the root commit, will take the same time as before (but
at least the user might start reading the output earlier).
Furthermore, if the most recent commit modifying the line range is far
away from the starting revision, then that initial delay will still be
significant.
Additional testing by Derrick Stolee: In the Linux kernel repository,
the MAINTAINERS file was changed ~3,500 times across the ~915,000
commits. In addition to that edit frequency, the file itself is quite
large (~18,700 lines). This means that a significant portion of the
computation is taken up by computing the patch-diff of the file. This
patch improves the real time it takes to output the first result quite
a bit:
Command: git log -L 100,200:MAINTAINERS -n 1 >/dev/null
Before: 3.88 s
After: 0.71 s
If we drop the "-n 1" in the command, then there is no change in
end-to-end process time. This is because the command still needs to
walk the entire commit history, which negates the point of this
patch. This is expected.
As a note for future reference, the ~4.3 seconds in the old code
spends ~2.6 seconds computing the patch-diffs, and the rest of the
time is spent walking commits and computing diffs for which paths
changed at each commit. The changed-path Bloom filters could improve
the end-to-end computation time (i.e. no "-n 1" in the command).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
None of the tests in 't4211-line-log.sh' really check which parent
object IDs are shown in the output, either implicitly as part of
"Merge: ..." lines [1] or explicitly via the '%p' or '%P' format
specifiers in a custom pretty format.
Add two tests to 't4211-line-log.sh' to check which parent object IDs
are shown, one without and one with explicitly requested parent
rewriting, IOW without and with the '--parents' option.
The test without '--parents' is marked as failing, because without
that option parent rewriting should not be performed, and thus the
parent object ID should be that of the immediate parent, just like in
case of a pathspec-limited history traversal without parent rewriting.
The current line-level log implementation, however, performs parent
rewriting unconditionally and without a possibility to turn it off,
and, consequently, it shows the object ID of the most recent ancestor
that modified the given line range.
In both of these new tests we only really care about the object IDs of
the listed commits and their parents, but not the diffs of the line
ranges; the diffs have already been thoroughly checked in the previous
tests.
[1] While one of the tests ('-M -L ':f:b.c' parallel-change') does
list a merge commit, both of its parents happen to modify the
given line range and are listed as well, so the implications of
parent rewriting remained hidden and untested.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the unused fields 'status', 'arg_alloc', 'arg_nr' and 'args'
from 'struct line_log_data'. They were already part of the struct
when it was introduced in commit 12da1d1f6 (Implement line-history
search (git log -L), 2013-03-28), but as far as I can tell none of
them have ever been actually used.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In subsequent patches, we are going to update a progress meter when
'add_ref_to_set()' is called, and need a convenient way to pass a
'struct progress *' in from the caller.
Introduce 'refs_cb_data' as a catch-all for parameters that
'add_ref_to_set' may need, and wrap the existing single parameter in
that struct.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both test_submodule_switch_recursing_with_args() and
test_submodule_forced_switch_recursing_with_args() call the internal
function test_submodule_recursing_with_args_common() with the final
argument of `--recurse-submodules`. Consolidate this duplication by
appending the argument in test_submodule_recursing_with_args_common().
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the shell scripts in this codebase, the usual style is to include a
space between the function name and the (). Add these missing spaces to
conform to the usual style of the code.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For some asynchronous operations, we build a chain of callbacks to
execute when the operation is done. These callbacks are held in $after,
and a new callback can be added by appending to $after. Once the
operation is done, $after is executed as a script.
But if we don't append a semi-colon after the procedure calls, they will
appear to Tcl as arguments to the previous procedure's arguments. So,
for example, if $after is "foo", and we just append "bar", then $after
becomes "foo bar", and bar will be treated as an argument to foo. If foo
does not accept any optional arguments, it would result in Tcl throwing
an error. If instead we do append a semi-colon, $after will look like
"foo;bar;", and these will be treated as two separate procedure calls.
Before d9c6469 (git-gui: update status bar to track operations,
2019-12-01), this problem was masked because ui_ready/ui_status did
accept an optional argument. In d9c6469, ui_ready stopped accepting an
optional argument, and this error started showing up.
Another instance of this problem is when a call to ui_status without a
trailing semicolon. ui_status never accepted an optional argument to
begin with, but the issue never managed to surface.
So, fix these errors by making sure we always append a semi-colon after
procedure calls when multiple callbacks are involved in $after.
Helped-by: Pratyush Yadav <me@yadavpratyush.com>
Signed-off-by: Ansgar Röber <ansgar.roeber@rwth-aachen.de>
2020-04-22 18:32:44 +05:30
322 changed files with 60339 additions and 50737 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.