A new refs backend "reftable" to replace the traditional
combination of packed-refs files and one-file-per-ref loose refs
has been implemented and integrated for improved performance and
atomicity.
* hn/reftable:
SQUASH??? whitespace breakage fix
Add "test-tool dump-reftable" command.
Add reftable testing infrastructure
vcxproj: adjust for the reftable changes
Add GIT_DEBUG_REFS debugging mechanism
Hookup unittests for the reftable library.
Reftable support for git-core
Add reftable library
Add .gitattributes for the reftable/ directory
Iterate over the "refs/" namespace in for_each_[raw]ref
Move REF_LOG_ONLY to refs-internal.h
Treat REVERT_HEAD as a pseudo ref
Treat CHERRY_PICK_HEAD as a pseudo ref
Treat BISECT_HEAD as a pseudo ref
Make refs_ref_exists public
Write pseudorefs through ref backends.
checkout: add '\n' to reflog message
lib-t6000.sh: write tag using git-update-ref
The name of the primary branch in existing repositories, and the
default name used for the first branch in newly created
repositories, is made configurable, so that we can eventually wean
ourselves off of the hardcoded 'master'.
* js/default-branch-name:
Document how the default branch name can be overridden
fmt-merge-msg: learn about the possibly-configured default branch name
clone: learn about the possibly-configured default branch name
submodule: use the (possibly overridden) default branch name
testsvn: respect `core.defaultBranchName`
send-pack/transport-helper: respect `core.defaultBranchName`
remote: respect `core.defaultBranchName`
init: allow overriding the default branch name for new repositories
The "hooks defined in config" topic.
Ready???
* es/config-hooks:
hook: add --porcelain to list command
hook: add list command
hook: scaffolding for git-hook subcommand
doc: propose hooks managed by the config
The wrapper to call into zlib followed our long tradition to use
"unsigned long" for sizes of regions in memory, which have been
updated to use "size_t".
* mk/use-size-t-in-zlib:
zlib.c: use size_t for size
Rewrite of the remainder of "git bisect" script in C continues.
* mr/bisect-in-c-2:
bisect--helper: retire `--bisect-autostart` subcommand
bisect--helper: retire `--write-terms` subcommand
bisect--helper: retire `--check-expected-revs` subcommand
bisect--helper: reimplement `bisect_state` & `bisect_head` shell functions in C
bisect--helper: retire `--next-all` subcommand
bisect--helper: retire `--bisect-clean-state` subcommand
bisect--helper: finish porting `bisect_start()` to C
bisect--helper: reimplement `bisect_next` and `bisect_auto_next` shell functions in C
bisect--helper: reimplement `bisect_autostart` shell function in C
bisect--helper: introduce new `write_in_file()` function
bisect--helper: use '-res' in 'cmd_bisect__helper' return
bisect--helper: fix `cmd_*()` function switch default return
"git receive-pack" that accepts requests by "git push" learned to
outsource most of the ref updates to the new "proc-receive" hook.
* jx/proc-receive-hook:
doc: add documentation for the proc-receive hook
transport: parse report options for tracking refs
t5411: test updates of remote-tracking branches
receive-pack: new config receive.procReceiveRefs
refs.c: refactor to reuse ref_is_hidden()
receive-pack: feed report options to post-receive
doc: add document for capability report-status-v2
New capability "report-status-v2" for git-push
receive-pack: add new proc-receive hook
t5411: add basic test cases for proc-receive hook
transport: not report a non-head push as a branch
"git grep" has been tweaked to be limited to the sparse checkout
paths.
Review needed on 4/6; otherwise looking sane.
cf. <CABPp-BGdEyEeajYZj_rdxp=MyEQdszuyjVTax=hhYj3fOtRQUQ@mail.gmail.com>
* mt/grep-sparse-checkout:
config: add setting to ignore sparsity patterns in some cmds
grep: honor sparse checkout patterns
config: correctly read worktree configs in submodules
t/helper/test-config: facilitate addition of new cli options
t/helper/test-config: return exit codes consistently
doc: grep: unify info on configuration variables
The "%(push:remoteref)" placeholder in the "--format=" argument of
"git format-patch" (and friends) only showed what got explicitly
configured, not what ref at the receiving end would be updated when
"git push" was used, as it ignored the default behaviour (e.g. update
the same ref as the source).
* dr/push-remoteref-fix:
remote.c: fix handling of %(push:remoteref)
"git rebase -i" learns a bit more options.
Not quite.
cf. <nycvar.QRO.7.76.6.2005290437350.56@tvgsbejvaqbjf.bet>
* pw/rebase-i-more-options:
rebase: add --reset-author-date
rebase -i: support --ignore-date
sequencer: rename amend_author to author_to_free
rebase -i: support --committer-date-is-author-date
rebase -i: add --ignore-whitespace flag
The changed-path Bloom filter is improved using ideas from an
independent implementation.
* sg/commit-graph-cleanups:
commit-graph: persist existence of changed-paths
commit-graph: change test to die on parse, not load
bloom: enforce a minimum size of 8 bytes
commit-graph: check all leading directories in changed path Bloom filters
commit-graph: check chunk sizes after writing
commit-graph: simplify chunk writes into loop
commit-graph: unify the signatures of all write_graph_chunk_*() functions
commit-graph: place bloom_settings in context
commit-graph: simplify write_commit_graph_file() #2
commit-graph: simplify write_commit_graph_file() #1
commit-graph: simplify parse_commit_graph() #2
commit-graph: simplify parse_commit_graph() #1
commit-graph: clean up #includes
diff.h: drop diff_tree_oid() & friends' return value
commit-slab: add a function to deep free entries on the slab
commit-graph-format.txt: all multi-byte numbers are in network byte order
commit-graph: fix parsing the Chunk Lookup table
tree-walk.c: don't match submodule entries for 'submod/anything'
The effort to avoid using test_must_fail on non-git command continues.
Getting very close.
cf. <20200618181533.GA633383@coredump.intra.peff.net>
* dl/test-must-fail-fixes-5:
lib-submodule-update: use callbacks in test_submodule_switch_common()
lib-submodule-update: prepend "git" to $command
lib-submodule-update: consolidate --recurse-submodules
lib-submodule-update: add space after function name
SHA-256 migration work continues.
* bc/sha-256-part-2: (44 commits)
remote-testgit: adapt for object-format
bundle: detect hash algorithm when reading refs
t5300: pass --object-format to git index-pack
t5704: send object-format capability with SHA-256
t5703: use object-format serve option
t5702: offer an object-format capability in the test
t/helper: initialize the repository for test-sha1-array
remote-curl: avoid truncating refs with ls-remote
t1050: pass algorithm to index-pack when outside repo
builtin/index-pack: add option to specify hash algorithm
remote-curl: detect algorithm for dumb HTTP by size
builtin/ls-remote: initialize repository based on fetch
t5500: make hash independent
serve: advertise object-format capability for protocol v2
connect: parse v2 refs with correct hash algorithm
connect: pass full packet reader when parsing v2 refs
Documentation/technical: document object-format for protocol v2
t1302: expect repo format version 1 for SHA-256
builtin/show-index: provide options to determine hash algo
t5302: modernize test formatting
...
A misdesigned strbuf_write_fd() function has been retired.
* rs/retire-strbuf-write-fd:
strbuf: remove unreferenced strbuf_write_fd method.
bugreport.c: replace strbuf_write_fd with write_in_full
The way refnames are anonymized has been updated and a way to help
debugging using the anonymized output hsa been added.
* jk/fast-export-anonym:
fast-export: allow dumping the path mapping
fast-export: anonymize "master" refname
fast-export: allow dumping the refname mapping
A few fields in "struct commit" that do not have to always be
present have been moved to commit slabs.
* ak/commit-graph-to-slab:
commit-graph: minimize commit_graph_data_slab access
commit: move members graph_pos, generation to a slab
commit-graph: introduce commit_graph_data_slab
object: drop parsed_object_pool->commit_count
"git diff-files" has been taught to say paths that are marked as
intent-to-add are new files, not modified from an empty blob.
* sk/diff-files-show-i-t-a-as-new:
diff-files: treat "i-t-a" files as "not-in-index"
"git status" learned to report the status of sparse checkout.
* en/sparse-status:
git-prompt: include sparsity state as well
wt-status: show sparse checkout status as well
CMake support to build with MSVC for Windows bypassing the Makefile.
Almost there.
* ss/cmake-build:
ci: modification of main.yml to use cmake for vs-build job
cmake: support for building git on windows with msvc and clang.
cmake: support for building git on windows with mingw
cmake: support for testing git when building out of the source tree
cmake: support for testing git with ctest
cmake: installation support for git
cmake: generate the shell/perl/python scripts and templates, translations
Introduce CMake support for configuring Git
Allow runtime upgrade of the repository format version, which needs
to be done carefully.
There is a rather unpleasant backward compatibility worry with the
last step of this series, but it is the right thing to do in the
longer term.
* xl/upgrade-repo-format:
check_repository_format_gently(): refuse extensions for old repositories
sparse-checkout: upgrade repository to version 1 when enabling extension
fetch: allow adding a filter after initial clone
repository: add a helper function to perform repository format upgrade
strbuf_write_fd was only used in bugreport.c. Since that file now uses
write_in_full, this method is no longer needed. In addition, strbuf_write_fd
did not guard against exceeding MAX_IO_SIZE for the platform, nor
provided error handling in the event of a failure if only partial data
was written to the file descriptor. Since already write_in_full has this
capability and is in general use, it should be used instead. The change
impacts strbuf.c and strbuf.h.
Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strbuf_write_fd method did not provide checks for buffers larger
than MAX_IO_SIZE. Replacing with write_in_full ensures the entire
buffer will always be written to disk or report an error and die.
Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a `GIT_TEST_*` environment variable and a `core.` config
setting (with the former taking precendence over the latter) to allow
overriding what name Git uses by default as main branch of new
repositories.
Now that all kinds of Git operations have learned to respect those,
let's document them.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When working with an anonymized repo, it can be useful to be able to
refer to particular paths. E.g., reproducing a bug with "git rev-list --
foo.c" in the original repo would need to replace "foo.c" with its
anonymized counterpart to produce the same effect.
We recently taught fast-export to dump the refname mapping. Let's do the
same thing for paths, which can reuse most of the same infrastructure.
Note that the output format isn't unambiguous here (because paths could
contain spaces). That's OK because this is meant to be examined by a
human.
We could also just introduce a "dump mapping" file that shows every
mapping we make. But it would be a bit more awkward to work with, as the
user would have to sort through more data to find the parts they're
interested in (and there are likely to be many more paths than refnames,
making it annoying for people who just want to dump the refnames).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running "fast-export --anonymize" will leave "refs/heads/master"
untouched in the output, for two reasons:
- it helped to have some known reference point between the original
and anonymized repository
- since it's historically the default branch name, it doesn't leak any
information
Now that we can ask fast-export to dump the anonymized ref mapping, we
have a much better tool for the first one (because it works for _any_
ref, not just master).
For the second, the notion of "default branch name" is likely to become
configurable soon, at which point the name _does_ leak information.
Let's drop this special case in preparation.
Note that we have to adjust the test a bit, since it relied on using the
name "master" in the anonymized repos. But this gives us a good
opportunity to further test the new dumping feature.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After you anonymize a repository, it can be hard to find which commits
correspond between the original and the result, and thus hard to
reproduce commands that triggered bugs in the original.
Let's make it possible to dump the mapping separate from the output
stream. This can be used by a bug reporter to modify their reproduction
recipe without revealing the original names (see the example in the
documentation).
The implementation is slightly non-obvious. There's no point in the
program where we know the complete set of refs we're going to anonymize.
Nor do we have a complete set of anonymized refs after finishing (we
have a set of anonymized ref path components, but no knowledge of how
those are assembled into complete refs). So we lazily write to the dump
file as we anonymize each name, and keep a list of ones that we've
output in order to avoid duplicates.
Some possible alternatives:
- we could just output the mapping of anonymized components (e.g.,
that "foo" became "ref123"). That works OK when you have short
refnames (e.g., "refs/heads/foo" becomes "refs/heads/ref123"), but
longer names would require the user to look up each component to
assemble the result. For example, "refs/remotes/origin/jk/foo" might
become "refs/remotes/refs37/refs56/refs102".
- instead of dumping the mapping, the same problem could be solved by
allowing the user to leave some refs alone. So if you want to
reproduce "git rev-list branch~17..HEAD" in the anonymized repo, we
could allow something like:
git tag anon-one branch
git tag anon-two HEAD
git fast-export --anonymize --all \
--no-anonymize-ref=anon-one \
--no-anonymize-ref=anon-two \
>stream
and then presumably "git rev-list anon-one~17..anon-two" would
behave the same in the re-imported repository. This is more
convenient in some ways, but it does require modifying the
original repository. And the concept doesn't easily extend to
other fields (e.g., pathnames, which will be addressed in a
subsequent patch).
- we could dump before/after commit hashes; combined with rev-parse,
that could convert these cases (as well as ones using raw hashes).
But we don't actually know the anonymized commit hashes; we're just
generating a stream that will produce them in the anonymized repo.
- likewise, we probably could insert object names or other markers
into commit messages, blob contents, etc, in order to let a user
with the original repo figure out which parts correspond. But using
this gets complicated (I have to find my commits in the result with
"git log --all --grep" or similar). It also makes it less clear that
the anonymized repo didn't leak any information (because we are
relying on object ids being unguessable).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_pull() builds a commit_list to pass a single potential ancestor to
is_descendant_of(). The latter leaves the list intact. Release the
allocated memory after the call.
Leaking in cmd_*() isn't a big deal, but sets a bad example for other
users of is_descendant_of().
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ref_newer() builds a commit_list to pass a single potential ancestor to
is_descendant_of(). The latter leaves the list intact. Release the
allocated memory after the call.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level reference transactions used to update references are
currently completely opaque to the user. While certainly desirable in
most usecases, there are some which might want to hook into the
transaction to observe all queued reference updates as well as observing
the abortion or commit of a prepared transaction.
One such usecase would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.
The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.
Given the usecase described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, allowing to implement strong consistency for
reference updates via a single mechanism.
In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduce two new performance tests for git-update-refs(1). Run against
an empty repository, it produces the following results:
Test origin/master HEAD
--------------------------------------------------------------------
1400.2: update-ref 2.70(2.10+0.71) 2.71(2.10+0.73) +0.4%
1400.3: update-ref --stdin 0.21(0.09+0.11) 0.21(0.07+0.14) +0.0%
The performance test p1400.2 creates, updates and deletes a branch a
thousand times, thus averaging runtime of git-update-refs over 3000
invocations. p1400.3 instead calls `git-update-refs --stdin` three times
and queues a thousand creations, updates and deletes respectively.
As expected, p1400.3 consistently shows no noticeable impact, as for
each batch of updates there's a single call to access(3P) for the
negative hook lookup. On the other hand, for p1400.2, one can see an
impact caused by this patchset. But doing five runs of the performance
tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead
ranged from -1.5% to +1.1%. These inconsistent performance numbers can
be explained by the overhead of spawning 3000 processes. This shows that
the overhead of assembling the hook path and executing access(3P) once
to check if it's there is mostly outweighed by the operating system's
overhead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git branches have been qualified as topic branches, integration branches,
development branches, feature branches, release branches and so on.
Git has a branch that is the master *for* development, but it is not
the master *of* any "slave branch": Git does not have slave branches,
and has never had, except for a single testcase that claims otherwise. :)
Independent of any future change to the naming of the "master" branch,
removing this sole appearance of the term is a strict improvement: it
avoids divisive language, and talking about "feature branch" clarifies
which developer workflow the test is trying to emulate.
Reported-by: Till Maas <tmaas@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "fetch/clone" protocol has been updated to allow the server to
instruct the clients to grab pre-packaged packfile(s) in addition
to the packed object data coming over the wire.
* jt/cdn-offload:
upload-pack: fix a sparse '0 as NULL pointer' warning
upload-pack: send part of packfile response as uri
fetch-pack: support more than one pack lockfile
upload-pack: refactor reading of pack-objects out
Documentation: add Packfile URIs design doc
Documentation: order protocol v2 sections
http-fetch: support fetching packfiles by URL
http-fetch: refactor into function
http: refactor finish_http_pack_request()
http: use --stdin when indexing dumb HTTP pack
Rewrite of parts of the scripted "git submodule" Porcelain command
continues; this time it is "git submodule set-branch" subcommand's
turn.
* ss/submodule-set-branch-in-c:
submodule: port subcommand 'set-branch' from shell to C
"git merge-base --is-ancestor" is taught to take advantage of the
commit graph.
* ds/merge-base-is-ancestor-optim:
commit-reach: use fast logic in repo_in_merge_base
commit-reach: create repo_is_descendant_of()
Code clean-up around "git branch" with a minor bugfix.
* dl/branch-cleanup:
branch: don't mix --edit-description
t3200: test for specific errors
t3200: rename "expected" to "expect"
The `diff-files' command and related commands which call `cmd_diff_files()',
consider the "intent-to-add" files as a part of the index when comparing the
work-tree against it. This was previously addressed in [1] and [2] by turning
the option `--ita-invisible-in-index' (introduced in [3]) on by default.
For `diff-files' (and `add -p' as a consequence) to show the i-t-a files as
as new, `ita_invisible_in_index' will be enabled by default here as well.
[1] 0231ae71d3 (diff: turn --ita-invisible-in-index on by default, 2018-05-26)
[2] 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist in
index", 2016-10-24)
[3] b42b451919 (diff: add --ita-[in]visible-in-index, 2016-10-24)
Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A comment in cmd_diff() states that if one tree-ish and no blobs are
provided, (the "N=1, M=0" case), it will provide a diff between the tree
and the cache. This is incorrect because a diff happens between the
tree-ish and the working tree. Remove the `--cached` in the comment so
that the correct behavior is shown. Add a new section describing the
"N=1, M=0, --cached" behavior.
Next, describe the "N=0, M=0, --cached" case, similar to the above since
it is undocumented.
Finally, fix some spacing issues. Add spaces between each section for
consistency and readability. Also, change tabs within the comment into
spaces.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current git prompt includes a lot of possible state information from
cherry-picks, merges, bisects, and various flavors of rebases. Add
sparsity as another state flavor (though one which can be present
simultaneously with any of rebase/cherry-pick/merge/bisect). This extra
state is shown with an extra
|SPARSE
substring before the other states, providing a prompt that looks like:
(branchname|SPARSE|REBASE 6/10)
The reason for showing the "|SPARSE" substring before other states is to
emphasize those other states. Sparsity is probably not going to change
much within a repository, while temporary operations will. So we want
the state changes related to temporary operations to be listed last, to
make them appear closer to where the user types and make them more
likely to be noticed.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of the early feedback of folks trying out sparse-checkouts at
$dayjob is that sparse checkouts can sometimes be disorienting; users
can forget that they had a sparse-checkout and then wonder where files
went. Add some output to 'git status' in the form of a simple line that
states:
You are in a sparse checkout with 35% of files present.
where, obviously, the exact figure changes depending on what percentage
of files from the index do not have the SKIP_WORKTREE bit set.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we run a test helper function in test_submodule_switch_common(), we
sometimes specify a whole helper function as the $command. When we do
this, in some test cases, we just mark the whole function with
`test_must_fail`. However, it's possible that the helper function might
fail earlier or later than expected due to an introduced bug. If this
happens, then the test case will still report as passing but it should
really be marked as failing since it didn't actually display the
intended behaviour.
Instead of invoking $command as one monolithic helper function, break it
up into three parts:
1. $command which is always a git command.
2. $before which is a callback function that runs just prior to
$command.
3. $after which is a callback function that runs just after
$command.
If the command requires a filename argument, specify it as `\$arg` since
that variable will be set and the whole $command string will be eval'd.
Unfortunately, there is no way to get rid of the eval as some of the
commands that are passed (such as the `git pull` tests) require that no
additional arguments are passed so we must have some mechanism for the
caller to specify whether or not it wants the filename argument.
The $before and $after callback functions will be passed the filename as
the first arg. These callback functions are optional and, if missing,
will be replaced with `true`. Also, in the case where we have a
`test_must_fail` test, $after will not be executed, similar to how the
helper functions currently behave when the git command fails and exits
the &&-chain.
Finally, as an added bonus, `test_must_fail` will only run on $command
which is guaranteed to be a git command.
An alternate design was considered where $OVERWRITING_FAIL is set from
test_submodule_switch_common() and exposed to the helper function. This
approach was considered too difficult to understand due to the fact that
using a signalling magic environment variable might be too indirect.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up in the codepath that serves "git fetch" continues.
* cc/upload-pack-data-3:
upload-pack: refactor common code into do_got_oid()
upload-pack: move oldest_have to upload_pack_data
upload-pack: pass upload_pack_data to got_oid()
upload-pack: pass upload_pack_data to ok_to_give_up()
upload-pack: pass upload_pack_data to send_acks()
upload-pack: pass upload_pack_data to process_haves()
upload-pack: change allow_unadvertised_object_request to an enum
upload-pack: move allow_unadvertised_object_request to upload_pack_data
upload-pack: move extra_edge_obj to upload_pack_data
upload-pack: move shallow_nr to upload_pack_data
upload-pack: pass upload_pack_data to send_unshallow()
upload-pack: pass upload_pack_data to deepen_by_rev_list()
upload-pack: pass upload_pack_data to deepen()
upload-pack: pass upload_pack_data to send_shallow_list()
"git diff" used to take arguments in random and nonsense range
notation, e.g. "git diff A..B C", "git diff A..B C...D", etc.,
which has been cleaned up.
* ct/diff-with-merge-base-clarification:
Documentation: usage for diff combined commits
git diff: improve range handling
t/t3430: avoid undefined git diff behavior
"git clean" code clean-up that resulted in a fix of recent
performance regression.
* en/clean-cleanups:
clean: optimize and document cases where we recurse into subdirectories
clean: consolidate handling of ignored parameters
dir, clean: avoid disallowed behavior
dir: fix a few confusing comments
The effect of sparse checkout settings on submodules is documented.
* en/sparse-with-submodule-doc:
git-sparse-checkout: clarify interactions with submodules
The same worktree directory must be registered only once, but
"git worktree move" allowed this invariant to be violated, which
has been corrected.
* es/worktree-duplicate-paths:
worktree: make "move" refuse to move atop missing registered worktree
worktree: generalize candidate worktree path validation
worktree: prune linked worktree referencing main worktree path
worktree: prune duplicate entries referencing same worktree path
worktree: make high-level pruning re-usable
worktree: give "should be pruned?" function more meaningful name
worktree: factor out repeated string literal
The interface to redact sensitive information in the trace output
has been simplified.
* jt/redact-all-cookies:
http: redact all cookies, teach GIT_TRACE_REDACT=0
Further code clean-up.
* cc/upload-pack-data-2:
upload-pack: move pack_objects_hook to upload_pack_data
upload-pack: move allow_sideband_all to upload_pack_data
upload-pack: move allow_ref_in_want to upload_pack_data
upload-pack: move allow_filter to upload_pack_data
upload-pack: move keepalive to upload_pack_data
upload-pack: pass upload_pack_data to upload_pack_config()
upload-pack: change multi_ack to an enum
upload-pack: move multi_ack to upload_pack_data
upload-pack: move filter_capability_requested to upload_pack_data
upload-pack: move use_sideband to upload_pack_data
upload-pack: move static vars to upload_pack_data
upload-pack: annotate upload_pack_data fields
upload-pack: actually use some upload_pack_data bitfields
Use of negative pathspec, while collecting paths including
untracked ones in the working tree, was broken.
* en/do-match-pathspec-fix:
dir: fix treatment of negated pathspecs
The behaviour of "sparse-checkout" in the state "git clone
--no-checkout" left was changed accidentally in 2.27, which has
been corrected.
* en/sparse-checkout:
sparse-checkout: avoid staging deletions of all files
The reflog entries for "git clone" and "git fetch" did not
anonymize the URL they operated on.
* js/reflog-anonymize-for-clone-and-fetch:
clone/fetch: anonymize URLs in the reflog
Reduce memory usage during "diff --quiet" in a worktree with too
many stat-unmatched paths.
* jk/diff-memuse-optim-with-stat-unmatch:
diff: discard blob data from stat-unmatched pairs
In an earlier patch, multiple struct acccesses to `graph_pos` and
`generation` were auto-converted to multiple method calls.
Since the values are fixed and commit-slab access costly, we would be
better off with storing the values as a local variable and reusing it.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We remove members `graph_pos` and `generation` from the struct commit.
The default assignments in init_commit_node() are no longer valid,
which is fine as the slab helpers return appropriate default values and
the assignments are removed.
We will replace existing use of commit->generation and commit->graph_pos
by commit_graph_data_slab helpers using
`contrib/coccinelle/commit.cocci'.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The struct commit is used in many contexts. However, members
`generation` and `graph_pos` are only used for commit-graph related
operations and otherwise waste memory.
This wastage would have been more pronounced as we transition to
generation number v2, which uses 64-bit generation number instead of
current 32-bits.
As they are often accessed together, let's introduce struct
commit_graph_data and move them to a commit_graph_data slab.
While the overall test suite runs just as fast as master,
(series: 26m48s, master: 27m34s, faster by 2.87%), certain commands
like `git merge-base --is-ancestor` were slowed by 40% as discovered
by Szeder Gábor [1]. After minimizing commit-slab access, the slow down
persists but is closer to 20%.
Derrick Stolee believes the slow down is attributable to the underlying
algorithm rather than the slowness of commit-slab access [2] and we will
follow-up in a later series.
[1]: https://lore.kernel.org/git/20200607195347.GA8232@szeder.dev/
[2]: https://lore.kernel.org/git/13db757a-9412-7f1e-805c-8a028c4ab2b1@gmail.com/
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
14ba97f8 (alloc: allow arbitrary repositories for alloc functions,
2018-05-15) introduced parsed_object_pool->commit_count to keep count of
commits per repository and was used to assign commit->index.
However, commit-slab code requires commit->index values to be unique
and a global count would be correct, rather than a per-repo count.
Let's introduce a static counter variable, `parsed_commits_count` to
keep track of parsed commits so far.
As commit_count has no use anymore, let's also drop it from the struct.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This command dumps individual tables or a stack of of tables.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Add GIT_TEST_REFTABLE environment var to control default ref storage
* Add test_prerequisite REFTABLE.
* Skip some tests that are incompatible:
* t3210-pack-refs.sh - does not apply
* t9903-bash-prompt - The bash mode reads .git/HEAD directly
* t1450-fsck.sh - manipulates .git/ directly to create invalid state
Major test failures:
* t1400-update-ref.sh - Reads from .git/{refs,logs} directly
* t1404-update-ref-errors.sh - Manipulates .git/refs/ directly
* t1405 - inspecs .git/ directly.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows Git to be compiled via Visual Studio again after integrating
the `hn/reftable` branch.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When set in the environment, GIT_DEBUG_REFS makes git print operations and
results as they flow through the ref storage backend. This helps debug
discrepancies between different ref backends.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unittests are under reftable/*_test.c, so all of the reftable code stays in
one directory. They are called from t/helpers/test-reftable.c in t0031-reftable.sh
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For background, see the previous commit introducing the library.
This introduces the refs/reftable-backend.c containing reftable powered ref
storage backend.
It can be activated by passing --ref-storage=reftable to "git init".
TODO:
* Fix worktree commands
* Spots marked XXX
Example use: see t/t0031-reftable.sh
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reftable is a format for storing the ref database. Its rationale and
specification is in the preceding commit.
This imports the upstream library as one big commit. For understanding
the code, it is suggested to read the code in the following order:
* The specification under Documentation/technical/reftable.txt
* reftable.h - the public API
* record.{c,h} - reading and writing records
* block.{c,h} - reading and writing blocks.
* writer.{c,h} - writing a complete reftable file.
* merged.{c,h} and pq.{c,h} - reading a stack of reftables
* stack.{c,h} - writing and compacting stacks of reftable on the
filesystem.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This happens implicitly in the files/packed ref backend; making it
explicit simplifies adding alternate ref storage backends, such as
reftable.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
REF_LOG_ONLY is used in the transaction preparation: if a symref is involved in
a transaction, the referent of the symref should be updated, and the symref
itself should only be updated in the reflog. Other ref backends will need to
duplicate this logic.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check for existence and delete CHERRY_PICK_HEAD through pseudo ref functions.
This will help cherry-pick work with alternate ref storage backends.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both the git-bisect.sh as bisect--helper inspected the file system directly.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pseudorefs store transient data in in the repository. Examples are HEAD,
CHERRY_PICK_HEAD, etc.
These refs have always been read through the ref backends, but they were written
in a one-off routine that wrote an object ID or symref directly into
.git/<pseudo_ref_name>.
This causes problems when introducing a new ref storage backend. To remedy this,
extend the ref backend implementation with a write_pseudoref_fn and
update_pseudoref_fn.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reftable precisely reproduces the given message. This leads to differences,
because the files backend implicitly adds a trailing '\n' to all messages.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changed-path Bloom filters were released in v2.27.0, but have a
significant drawback. A user can opt-in to writing the changed-path
filters using the "--changed-paths" option to "git commit-graph write"
but the next write will drop the filters unless that option is
specified.
This becomes even more important when considering the interaction with
gc.writeCommitGraph (on by default) or fetch.writeCommitGraph (part of
features.experimental). These config options trigger commit-graph writes
that the user did not signal, and hence there is no --changed-paths
option available.
Allow a user that opts-in to the changed-path filters to persist the
property of "my commit-graph has changed-path filters" automatically. A
user can drop filters using the --no-changed-paths option.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
43d3561 (commit-graph write: don't die if the existing graph is corrupt,
2019-03-25) introduced the GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD environment
variable. This was created to verify that commit-graph was not loaded
when writing a new non-incremental commit-graph.
An upcoming change wants to load a commit-graph in some valuable cases,
but we want to maintain that we don't trust the commit-graph data when
writing our new file. Instead of dying on load, instead die if we ever
try to parse a commit from the commit-graph. This functionally verifies
the same intended behavior, but allows a more advanced feature in the
next change.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original design of changed-path Bloom filters included an 8-byte
block size for filter lengths. This was changed mid-way through the
submission process, and now the length stored in the commit-graph has
one-byte granularity.
This can cause some issues for very small filters. The analysis for
false positive rates assume large filters, so rounding errors become
less important at that scale. When there are only a few paths changed,
a filter that has size only a few bytes could have very different
behavior. In fact, this is evidenced in the Git repository due to the
code organization and careful patch creation that leads to many commits
with very small filters. These small filters frequently have
false-positive rates in the 8-10% range or higher.
The previous change improved the false-positive rate using multiple
Bloom keys when the path has multiple directory components. However,
that does not help at all for files at root. It is typical to have
several commits that change only the README at root, and those commits
would be likely to have these artificially high false-positive rates.
Correct this issue by creating a minimum filters size of 8 bytes. This
requires the very small commits (with fewer than six changes, including
non-root directories) to have a larger filter. In principle, this
violates the bits_per_entry value of struct bloom_filter_settings.
However, it does not actually create a functional problem.
As for compatibility, this only affects new versions writing filters for
commits that do not yet have a filter. Old version will write the
smaller filters and this version will persist and properly read that
data. Now, the new files will be generated slightly larger.
Bytes before Bytes after Difference
--------------------------------------------------
git 4,021,078 4,275,311 +6.32%
linux 72,212,101 73,909,286 +2.35%
tensorflow 7,596,359 7,691,646 +1.25%
This has a measurable improvement in the false-positive rate and the
end-to-end run time for these repos. The table below compares the average
false-positive rate and runtime of
git rev-list HEAD -- "$path"
before and after this change for 5000+ randomly* selected paths from
each repository:
Average false Average Average
positive rate runtime runtime
before after before after difference
------------------------------------------------------------------
git 0.786% 0.227% 0.0387s 0.0289s -25.5%
linux 0.0296% 0.0174% 0.0766s 0.0706s -7.8%
tensorflow 0.6977% 0.0268% 0.0420s 0.0384s -8.5%
*Path selection was done with the following pipeline:
git ls-tree -r --name-only HEAD | sort -R | head -n 5000
These relatively-small increases in file size appear to be a fair price
to pay for these performance improvements.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The file 'dir/subdir/file' can only be modified if its leading
directories 'dir' and 'dir/subdir' are modified as well.
So when checking modified path Bloom filters looking for commits
modifying a path with multiple path components, then check not only
the full path in the Bloom filters, but all its leading directories as
well. Take care to check these paths in "deepest first" order,
because it's the full path that is least likely to be modified, and
the Bloom filter queries can short circuit sooner.
This can significantly reduce the average false positive rate, by
about an order of magnitude or three(!), and can further speed up
pathspec-limited revision walks. The table below compares the average
false positive rate and runtime of
git rev-list HEAD -- "$path"
before and after this change for 5000+ randomly* selected paths from
each repository:
Average false Average Average
positive rate runtime runtime
before after before after difference
------------------------------------------------------------------
git 3.220% 0.7853% 0.0558s 0.0387s -30.6%
linux 2.453% 0.0296% 0.1046s 0.0766s -26.8%
tensorflow 2.536% 0.6977% 0.0594s 0.0420s -29.2%
*Path selection was done with the following pipeline:
git ls-tree -r --name-only HEAD | sort -R | head -n 5000
The improvements in runtime are much smaller than the improvements in
average false positive rate, as we are clearly reaching diminishing
returns here. However, all these timings depend on that accessing
tree objects is reasonably fast (warm caches). If we had a partial
clone and the tree objects had to be fetched from a promisor remote,
e.g.:
$ git clone --filter=tree:0 --bare file://.../webkit.git webkit.notrees.git
$ git -C webkit.git -c core.modifiedPathBloomFilters=1 \
commit-graph write --reachable
$ cp webkit.git/objects/info/commit-graph webkit.notrees.git/objects/info/
$ git -C webkit.notrees.git -c core.modifiedPathBloomFilters=1 \
rev-list HEAD -- "$path"
then checking all leading path component can reduce the runtime from
over an hour to a few seconds (and this is with the clone and the
promisor on the same machine).
This adjusts the tracing values in t4216-log-bloom.sh, which provides a
concrete way to notice the improvement.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In my experience while experimenting with new commit-graph chunks,
early versions of the corresponding new write_commit_graph_my_chunk()
functions are, sadly but not surprisingly, often buggy, and write more
or less data than they are supposed to, especially if the chunk size
is not directly proportional to the number of commits. This then
causes all kinds of issues when reading such a bogus commit-graph
file, raising the question of whether the writing or the reading part
happens to be buggy this time.
Let's catch such issues early, already when writing the commit-graph
file, and check that each write_graph_chunk_*() function wrote the
amount of data that it was expected to, and what has been encoded in
the Chunk Lookup table. Now that all commit-graph chunks are written
in a loop we can do this check in a single place for all chunks, and
any chunks added in the future will get checked as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In write_commit_graph_file() we now have one block of code filling the
array of 'struct chunk_info' with the IDs and sizes of chunks to be
written, and an other block of code calling the functions responsible
for writing individual chunks. In case of optional chunks like Extra
Edge List an Base Graphs List there is also a condition checking
whether that chunk is necessary/desired, and that same condition is
repeated in both blocks of code. Other, newer chunks have similar
optional conditions.
Eliminate these repeated conditions by storing the function pointers
responsible for writing individual chunks in the 'struct chunk_info'
array as well, and calling them in a loop to write the commit-graph
file. This will open up the possibility for a bit of foolproofing in
the following patch.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the write_graph_chunk_*() helper functions to have the same
signature:
- Return an int error code from all these functions.
write_graph_chunk_base() already has an int error code, now the
others will have one, too, but since they don't indicate any
error, they will always return 0.
- Drop the hash size parameter of write_graph_chunk_oids() and
write_graph_chunk_data(); its value can be read directly from
'the_hash_algo' inside these functions as well.
This opens up the possibility for further cleanups and foolproofing in
the following two patches.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Place an instance of struct bloom_settings into the struct
write_commit_graph_context. This allows simplifying the function
prototype of write_graph_chunk_bloom_data(). This will allow us
to combine the function prototypes and use function pointers to
simplify write_commit_graph_file().
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repo_is_descendant_of() method is aware of the existence of the
commit-graph file. It checks for generation_numbers_enabled() before
deciding on using can_all_from_reach() or repo_in_merge_bases()
depending on the situation. The reason here is that can_all_from_reach()
uses a depth-first search that is limited by the minimum generation
number of the target commits, and that algorithm can be very slow when
generation numbers are not present. The alternative uses
paint_down_to_common() which will walk the entire merge-base boundary,
which is typically slower.
This method is used by commands like "git tag --contains" and "git
branch --contains" for very fast results when a commit-graph file
exists. Unfortunately, it is _not_ used in commands like "git merge-base
--is-ancestor" which is doing an even simpler request.
This issue was raised recently [1] with respect to a change to how
generation numbers are stored, but was also reported much earlier [2]
before commit-reach.c existed to simplify these reachability queries.
[1] https://lore.kernel.org/git/20200607195347.GA8232@szeder.dev/
[2] https://lore.kernel.org/git/87608bawoa.fsf@evledraar.gmail.com/
The root cause is that builtin/merge-base.c has a method
handle_is_ancestor() that calls in_merge_bases(), an older version of
repo_in_merge_bases(). It would be better if we have every caller to
in_merge_bases() use the logic in can_all_from_reach() when possible.
This is where things get a little tricky: repo_is_descendant_of() calls
repo_in_merge_bases() in the non-generation numbers enabled case! If we
simply update repo_in_merge_bases() to call repo_is_descendant_of()
instead of repo_in_merge_bases_many(), then we will get a recursive call
loop. Thankfully, this is caught by the test suite in the default mode
(i.e. GIT_TEST_COMMIT_GRAPH=0).
The trick, then, is to make the non-generation number case for
repo_is_descendant_of() call repo_in_merge_bases_many() directly,
skipping the non-_many version. This allows us to take advantage of this
faster code path, when possible.
The easiest way to measure the performance impact is to test the
following command on the Linux kernel repository:
git merge-base --is-ancestor <A> <B>
| A | B | Time Before | Time After |
|------|------|-------------|------------|
| v3.0 | v5.7 | 0.459s | 0.028s |
| v4.0 | v5.7 | 0.267s | 0.021s |
| v5.0 | v5.7 | 0.074s | 0.013s |
Note that each of these samples return success. The old code performed
the same operation when <A> and <B> are swapped. However,
can_all_from_reach() will return immediately if the generation numbers
show that <A> has larger generation number than <B>. Thus, the time for
the swapped case is universally 0.004s in each case.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The next change will make repo_in_merge_bases() depend on the logic in
is_descendant_of(), but we need to make the method independent of
the_repository first.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git branch` accepts `--edit-description` in conjunction with other
arguments. However, `--edit-description` is its own mode, similar to
`--set-upstream-to`, which is also made mutually exclusive with other
modes. Prevent `--edit-description` from being mixed with other modes.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the "--set-upstream-to" and "--unset-upstream" tests, specific error
conditions are being tested. However, there is no way of ensuring that a
test case is failing because of some specific error.
Check stderr of failing commands to ensure that they are failing in the
expected way.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clean up style of test by changing some filenames from "expected" to
"expect", which follows typical test convention.
Also, change a space-indent into a tab-indent.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6b1db43109 ("clean: teach clean -d to preserve ignored paths",
2017-05-23) added the following code block (among others) to git-clean:
if (remove_directories)
dir.flags |= DIR_SHOW_IGNORED_TOO | DIR_KEEP_UNTRACKED_CONTENTS;
The reason for these flags is well documented in the commit message, but
isn't obvious just from looking at the code. Add some explanations to
the code to make it clearer.
Further, it appears git-2.26 did not correctly handle this combination
of flags from git-clean. With both these flags and without
DIR_SHOW_IGNORED_TOO_MODE_MATCHING set, git is supposed to recurse into
all untracked AND ignored directories. git-2.26.0 clearly was not doing
that. I don't know the full reasons for that or whether git < 2.27.0
had additional unknown bugs because of that misbehavior, because I don't
feel it's worth digging into. As per the huge changes and craziness
documented in commit 8d92fb2927 ("dir: replace exponential algorithm
with a linear one", 2020-04-01), the old algorithm was a mess and was
thrown out. What I can say is that git-2.27.0 correctly recurses into
untracked AND ignored directories with that combination.
However, in clean's case we don't need to recurse into ignored
directories; that is just a waste of time. Thus, when git-2.27.0
started correctly handling those flags, we got a performance regression
report. Rather than relying on other bugs in fill_directory()'s former
logic to provide the behavior of skipping ignored directories, make use
of the DIR_SHOW_IGNORED_TOO_MODE_MATCHING value specifically added in
commit eec0f7f2b7 ("status: add option to show ignored files
differently", 2017-10-30) for this purpose.
Reported-by: Brian Malehorn <bmalehorn@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I spent a long time trying to figure out how and whether the code worked
with different values of ignore, ignore_only, and remove_directories.
After lots of time setting up lots of testcases, sifting through lots of
print statements, and walking through the debugger, I finally realized
that one piece of code related to how it was all setup was found in
clean.c rather than dir.c. Make a change that would have made it easier
for me to do the extra testing by putting this handling in one spot.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.h documented quite clearly that DIR_SHOW_IGNORED and
DIR_SHOW_IGNORED_TOO are mutually exclusive, with a big comment to this
effect by the definition of both enum values. However, a command like
git clean -fx $DIR
would set both values for dir.flags. I _think_ it happened to work
because:
* As dir.h points out, DIR_KEEP_UNTRACKED_CONTENTS only takes effect
if DIR_SHOW_IGNORED_TOO is set.
* As coded, I believe DIR_SHOW_IGNORED would just happen to take
precedence over DIR_SHOW_IGNORED_TOO in the code as currently
constructed.
Which is a long way of saying "we just got lucky".
Fix clean.c to avoid setting these mutually exclusive values at the same
time, and add a check to dir.c that will throw a BUG() to prevent anyone
else from making this mistake.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ignoring the sparse-checkout feature momentarily, if one has a submodule and
creates local branches within it with unpushed changes and maybe adds some
untracked files to it, then we would want to avoid accidentally removing such
a submodule. So, for example with git.git, if you run
git checkout v2.13.0
then the sha1collisiondetection/ submodule is NOT removed even though it
did not exist as a submodule until v2.14.0. Similarly, if you only had
v2.13.0 checked out previously and ran
git checkout v2.14.0
the sha1collisiondetection/ submodule would NOT be automatically
initialized despite being part of v2.14.0. In both cases, git requires
submodules to be initialized or deinitialized separately. Further, we
also have special handling for submodules in other commands such as
clean, which requires two --force flags to delete untracked submodules,
and some commands have a --recurse-submodules flag.
sparse-checkout is very similar to checkout, as evidenced by the similar
name -- it adds and removes files from the working copy. However, for
the same avoid-data-loss reasons we do not want to remove a submodule
from the working copy with checkout, we do not want to do it with
sparse-checkout either. So submodules need to be separately initialized
or deinitialized; changing sparse-checkout rules should not
automatically trigger the removal or vivification of submodules.
I believe the previous wording in git-sparse-checkout.txt about
submodules was only about this particular issue. Unfortunately, the
previous wording could be interpreted to imply that submodules should be
considered active regardless of sparsity patterns. Update the wording
to avoid making such an implication. It may be helpful to consider two
example situations where the differences in wording become important:
In the future, we want users to be able to run commands like
git clone --sparse=moduleA --recurse-submodules $REPO_URL
and have sparsity paths automatically set up and have submodules *within
the sparsity paths* be automatically initialized. We do not want all
submodules in any path to be automatically initialized with that
command.
Similarly, we want to be able to do things like
git -c sparse.restrictCmds grep --recurse-submodules $REV $PATTERN
and search through $REV for $PATTERN within the recorded sparsity
patterns. We want it to recurse into submodules within those sparsity
patterns, but do not want to recurse into directories that do not match
the sparsity patterns in search of a possible submodule.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Preliminary clean-ups around refs API, plus file format
specification documentation for the reftable backend.
* hn/refs-cleanup:
reftable: define version 2 of the spec to accomodate SHA256
reftable: clarify how empty tables should be written
reftable: file format documentation
refs: improve documentation for ref iterator
t: use update-ref and show-ref to reading/writing refs
refs.h: clarify reflog iteration order
Teach .github/workflows/main.yml to use CMake for VS builds.
Modified the vs-test step to match windows-test step. This speeds
up the vs-test. Calling git-cmd from powershell and then calling git-bash
to perform the tests slows things down(factor of about 6). So git-bash
is directly called from powershell to perform the tests using prove.
NOTE: Since GitHub keeps the same directory for each job
(with respect to path) absolute paths are used in the bin-wrapper
scripts.
GitHub has switched to CMake 3.17.1 which changed the behaviour of
FindCURL module. An extra definition (-DCURL_NO_CURL_CMAKE=ON) has been
added to revert to the old behaviour.
In the configuration phase CMake looks for the required libraries for
building git (eg zlib,libiconv). So we extract the libraries before we
configure.
To check for ICONV_OMITS_BOM libiconv.dll needs to be in the working
directory of script or path. So we copy the dlls before we configure.
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds support for Visual Studio and Clang builds
The minimum required version of CMake is upgraded to 3.15 because
this version offers proper support for Clang builds on Windows.
Libintl is not searched for when building with Visual Studio or Clang
because there is no binary compatible version available yet.
NOTE: In the link options invalidcontinue.obj has to be included.
The reason for this is because by default, Windows calls abort()'s
instead of setting errno=EINVAL when invalid arguments are passed to
standard functions.
This commit explains it in detail:
4b623d80f7
On Windows the default generator is Visual Studio,so for Visual Studio
builds do this:
cmake `relative-path-to-srcdir`
NOTE: Visual Studio generator is a multi config generator, which means
that Debug and Release builds can be done on the same build directory.
For Clang builds do this:
On bash
CC=clang cmake `relative-path-to-srcdir` -G Ninja
-DCMAKE_BUILD_TYPE=[Debug or Release]
On cmd
set CC=Clang
cmake `relative-path-to-srcdir` -G Ninja
-DCMAKE_BUILD_TYPE=[Debug or Release]
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch facilitates building git on Windows with CMake using MinGW
NOTE: The funtions unsetenv and hstrerror are not checked in Windows
builds.
Reasons
NO_UNSETENV is not compatible with Windows builds.
lines 262-264 compat/mingw.h
compat/mingw.h(line 25) provides a definition of hstrerror which
conflicts with the definition provided in
git-compat-util.h(lines 733-736).
To use CMake on Windows with MinGW do this:
cmake `relative-path-to-srcdir` -G "MinGW Makefiles"
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch allows git to be tested when performin out of source builds.
This involves changing GIT_BUILD_DIR in t/test-lib.sh to point to the
build directory. Also some miscellaneous copies from the source directory
to the build directory.
The copies are:
t/chainlint.sed needed by a bunch of test scripts
po/is.po needed by t0204-gettext-rencode-sanity
mergetools/tkdiff needed by t7800-difftool
contrib/completion/git-prompt.sh needed by t9903-bash-prompt
contrib/completion/git-completion.bash needed by t9902-completion
contrib/svn-fe/svnrdump_sim.py needed by t9020-remote-svn
NOTE: t/test-lib.sh is only modified when tests are run not during
the build or configure.
The trash directory is still srcdir/t
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch provides an alternate way to test git using ctest.
CTest ships with CMake, so there is no additional dependency being
introduced.
To perform the tests with ctest do this after building:
ctest -j[number of jobs]
NOTE: -j is optional, the default number of jobs is 1
Each of the jobs does this:
cd t/ && sh t[something].sh
The reason for using CTest is that it logs the output of the tests
in a neat way, which can be helpful during diagnosis of failures.
After the tests have run ctest generates three log files located in
`build-directory`/Testing/Temporary/
These log files are:
CTestCostData.txt:
This file contains the time taken to complete each test.
LastTestsFailed.log:
This log file contains the names of the tests that have failed in the
run.
LastTest.log:
This log file contains the log of all the tests that have run.
A snippet of the file is given below.
10/901 Testing: D:/my/git-master/t/t0009-prio-queue.sh
10/901 Test: D:/my/git-master/t/t0009-prio-queue.sh
Command: "sh.exe" "D:/my/git-master/t/t0009-prio-queue.sh"
Directory: D:/my/git-master/t
"D:/my/git-master/t/t0009-prio-queue.sh"
Output:
----------------------------------------------------------
ok 1 - basic ordering
ok 2 - mixed put and get
ok 3 - notice empty queue
ok 4 - stack order
passed all 4 test(s)
1..4
<end of output>
Test time = 1.11 sec
NOTE: Testing only works when building in source for now.
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Install the built binaries and scripts using CMake
This is very similar to `make install`.
By default the destination directory(DESTDIR) is /usr/local/ on Linux
To set a custom installation path do this:
cmake `relative-path-to-srcdir`
-DCMAKE_INSTALL_PREFIX=`preferred-install-path`
Then run `make install`
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the placeholder substitution to generate scripted
Porcelain commands, e.g. git-request-pull out of
git-request-pull.sh
Generate shell/perl/python scripts and template using CMake instead of
using sed like the build procedure in the Makefile does.
The text translations are only build if `msgfmt` is found in your path.
NOTE: The scripts and templates are generated during configuration.
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the moment, the recommended way to configure Git's builds is to
simply run `make`. If that does not work, the recommended strategy is to
look at the top of the `Makefile` to see whether any "Makefile knob" has
to be turned on/off, e.g. `make NO_OPENSSL=YesPlease`.
Alternatively, Git also has an `autoconf` setup which allows configuring
builds via `./configure [<option>...]`.
Both of these options are fine if the developer works on Unix or Linux.
But on Windows, we have to jump through hoops to configure a build
(read: we force the user to install a full Git for Windows SDK, which
occupies around two gigabytes (!) on disk and downloads about three
quarters of a gigabyte worth of Git objects).
The build infrastructure for Git is written around being able to run
make, which is not supported natively on Windows.
To help Windows developers a CMake build script is introduced here.
With a working support CMake, developers on Windows need only install
CMake, configure their build, load the generated Visual Studio solution
and immediately start modifying the code and build their own version of
Git. Likewise, developers on other platforms can use the convenient GUI
tools provided by CMake to configure their build.
So let's start building CMake support for Git.
This is only the first step, and to make it easier to review, it only
allows for configuring builds on the platform that is easiest to
configure for: Linux.
The CMake script checks whether the headers are present(eg. libgen.h),
whether the functions are present(eg. memmem), whether the funtions work
properly (eg. snprintf) and generate the required compile definitions
for the platform. The script also searches for the required libraries,
if it fails to find the required libraries the respective executables
won't be built.(eg. If libcurl is not found then git-remote-http won't
be built). This will help building Git easier.
With a CMake script an out of source build of git is possible resulting
in a clean source tree.
Note: this patch asks for the minimum version v3.14 of CMake (which is
not all that old as of time of writing) because that is the first
version to offer a platform-independent way to generate hardlinks as
part of the build. This is needed to generate all those hardlinks for
the built-in commands of Git.
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When sparse checkout is enabled, some users expect the output of certain
commands (such as grep, diff, and log) to be also restricted within the
sparsity patterns. This would allow them to effectively work only on the
subset of files in which they are interested; and allow some commands to
possibly perform better, by not considering uninteresting paths. For
this reason, we taught grep to honor the sparsity patterns, in the
previous patch. But, on the other hand, allowing grep and the other
commands mentioned to optionally ignore the patterns also make for some
interesting use cases. E.g. using grep to search for a function
documentation that resides outside the sparse checkout.
In any case, there is no current way for users to configure the behavior
they want for these commands. Aiming to provide this flexibility, let's
introduce the sparse.restrictCmds setting (and the analogous
--[no]-restrict-to-sparse-paths global option). The default value is
true. For now, grep is the only one affected by this setting, but the
goal is to have support for more commands, in the future.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the main uses for a sparse checkout is to allow users to focus on
the subset of files in a repository in which they are interested. But
git-grep currently ignores the sparsity patterns and reports all matches
found outside this subset, which kind of goes in the opposite direction.
There are some use cases for ignoring the sparsity patterns and the next
commit will add an option to obtain this behavior, but here we start by
making grep honor the sparsity boundaries in every case where this is
relevant:
- git grep in worktree
- git grep --cached
- git grep $REVISION
For the worktree and cached cases, we iterate over paths without the
SKIP_WORKTREE bit set, and limit our searches to these paths. For the
$REVISION case, we limit the paths we search to those that match the
sparsity patterns. (We do not check the SKIP_WORKTREE bit for the
$REVISION case, because $REVISION may contain paths that do not exist in
HEAD and thus for which we have no SKIP_WORKTREE bit to consult. The
sparsity patterns tell us how the SKIP_WORKTREE bit would be set if we
were to check out $REVISION, so we consult those. Also, we don't use the
sparsity patterns with the worktree or cached cases, both because we
have a bit we can check directly and more efficiently, and because
unmerged entries from a merge or a rebase could cause more files to
temporarily be present than the sparsity patterns would normally
select.)
Note that there is a special case here: `git grep $TREE`. In this case,
we cannot know whether $TREE corresponds to the root of the repository
or some sub-tree, and thus there is no way for us to know which sparsity
patterns, if any, apply. So the $TREE case will not use sparsity
patterns or any SKIP_WORKTREE bits and will instead always search all
files within the $TREE.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the steps in do_git_config_sequence() is to load the
worktree-specific config file. Although the function receives a git_dir
string, it relies on git_pathdup(), which uses the_repository->git_dir,
to make the path to the file. Furthermore, it also checks that
extensions.worktreeConfig is set through the
repository_format_worktree_config variable, which refers to
the_repository only. Thus, when a submodule has worktree-specific
settings, a command executed in the superproject that recurses into the
submodule won't find the said settings.
This will be especially important in the next patch: git-grep will learn
to honor sparse checkouts and, when running with --recurse-submodules,
the submodule's sparse checkout settings must be loaded. As these
settings are stored in the config.worktree file, they would be ignored
without this patch. So let's fix this by reading the right
config.worktree file and extensions.worktreeConfig setting, based on the
git_dir and commondir paths given to do_git_config_sequence(). Also
add a test to avoid any regressions.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-config parses its arguments in an if-else chain, with one arm for
each available subcommand. Every arm expects (and checks) that argv
corresponds to something like "config <subcommand> [<subcommand args>]".
This means that whenever we want to change the syntax to accommodate a
new argument before <subcommand> (as we will do in the next patch), we
also need to increment the indexes accessing argv everywhere in the
if-else chain. This makes patches adding new options much noisier than
they need to be, besides being error-prone. So let's skip the "config"
argument in argv and argc to take the extra complexity out of such
patches (as the following one).
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test-config helper may exit with a variety of at least four
different codes, to reflect the status of the requested operations.
These codes are sometimes checked in the tests, but not all of the codes
are returned consistently by the helper: 1 will usually refer to a
"value not found", but usage errors can also return 1 or 128. Moreover,
128 is also expected on errors within the configset functions. These
inconsistent uses of the exit codes can lead to false positives in the
tests. Although all tests which expect errors and check the helper's
exit code currently also check the output, it's still better to
standardize the exit codes and avoid future problems in new tests.
While we are here, let's also check that we have the expected argc for
configset_get_value and configset_get_value_multi, before trying to use
argv.
Note: this change is implemented with the unification of the exit
labels. This might seem unnecessary, for now, but it will benefit the
next patch, which will increase the cleanup section.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since all invocations of test_submodule_forced_switch() are git
commands, automatically prepend "git" before invoking
test_submodule_switch_common().
Similarly, many invocations of test_submodule_switch() are also git
commands so automatically prepend "git" before invoking
test_submodule_switch_common() as well.
Finally, for invocations of test_submodule_switch() that invoke a custom
function, rename the old function to test_submodule_switch_func().
This is necessary because in a future commit, we will be adding some
logic that needs to distinguish between an invocation of a plain git
comamnd and an invocation of a test helper function.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document the usage for producing combined commits with "git diff".
This includes updating the synopsis section.
While here, add the three-dot notation to the synopsis.
Make "git diff -h" print the same usage summary as the manual
page synopsis, minus the "A..B" form, which is now discouraged.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git diff is given a symmetric difference A...B, it chooses
some merge base from the two specified commits (as documented).
This fails, however, if there is *no* merge base: instead, you
see the differences between A and B, which is certainly not what
is expected.
Moreover, if additional revisions are specified on the command
line ("git diff A...B C"), the results get a bit weird:
* If there is a symmetric difference merge base, this is used
as the left side of the diff. The last final ref is used as
the right side.
* If there is no merge base, the symmetric status is completely
lost. We will produce a combined diff instead.
Similar weirdness occurs if you use, e.g., "git diff C A...B D".
Likewise, using multiple two-dot ranges, or tossing extra
revision specifiers into the command line with two-dot ranges,
or mixing two and three dot ranges, all produce nonsense.
To avoid all this, add a routine to catch the range cases and
verify that that the arguments make sense. As a side effect,
produce a warning showing *which* merge base is being used when
there are multiple choices; die if there is no merge base.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As 'upload-pack.c' is now using 'struct upload_pack_data'
thoroughly, let's refactor some common code into a new
do_got_oid() function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'oldest_have' static variable
into this struct.
It is used by both protocol v0 and protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to got_oid(), so that
this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to ok_to_give_up(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_acks(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to process_haves(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's change allow_unadvertised_object_request,
which is now part of 'upload_pack_data', from an 'unsigned int'
to an enum.
This will make it clear which values this variable can take.
While at it let's change this variable name to 'allow_uor' to
make it shorter.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_unadvertised_object_request'
static variable into this struct.
It is used by code common to protocol v0 and protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'extra_edge_obj' static variable
into this struct.
It is used by code common to protocol v0 and protocol v2.
While at it let's properly initialize and clear 'extra_edge_obj'
in the appropriate 'upload_pack_data' initialization and
clearing functions.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'shallow_nr' static variable
into this struct.
It is used by code common to protocol v0 and protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_unshallow(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to deepen_by_rev_list(),
so that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to deepen(), so that
this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_shallow_list(),
so that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach upload-pack to send part of its packfile response as URIs.
An administrator may configure a repository with one or more
"uploadpack.blobpackfileuri" lines, each line containing an OID, a pack
hash, and a URI. A client may configure fetch.uriprotocols to be a
comma-separated list of protocols that it is willing to use to fetch
additional packfiles - this list will be sent to the server. Whenever an
object with one of those OIDs would appear in the packfile transmitted
by upload-pack, the server may exclude that object, and instead send the
URI. The client will then download the packs referred to by those URIs
before performing the connectivity check.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever a fetch results in a packfile being downloaded, a .keep file is
generated, so that the packfile can be preserved (from, say, a running
"git repack") until refs are written referring to the contents of the
packfile.
In a subsequent patch, a successful fetch using protocol v2 may result
in more than one .keep file being generated. Therefore, teach
fetch_pack() and the transport mechanism to support multiple .keep
files.
Implementation notes:
- builtin/fetch-pack.c normally does not generate .keep files, and thus
is unaffected by this or future changes. However, it has an
undocumented "--lock-pack" feature, used by remote-curl.c when
implementing the "fetch" remote helper command. In keeping with the
remote helper protocol, only one "lock" line will ever be written;
the rest will result in warnings to stderr. However, in practice,
warnings will never be written because the remote-curl.c "fetch" is
only used for protocol v0/v1 (which will not generate multiple .keep
files). (Protocol v2 uses the "stateless-connect" command, not the
"fetch" command.)
- connected.c has an optimization in that connectivity checks on a ref
need not be done if the target object is in a pack known to be
self-contained and connected. If there are multiple packfiles, this
optimization can no longer be done.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Subsequent patches will change how the output of pack-objects is
processed, so extract that processing into its own function.
Currently, at most 1 character can be buffered (in the "buffered" local
variable). One of those patches will require a larger buffer, so replace
that "buffered" local variable with a buffer array.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current C Git implementation expects Git servers to follow a
specific order of sections when transmitting protocol v2 responses, but
this is not explicit in the documentation. Make the order explicit.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach http-fetch the ability to download packfiles directly, given a
URL, and to verify them.
The http_pack_request suite has been augmented with a function that
takes a URL directly. With this function, the hash is only used to
determine the name of the temporary file.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_main() in http-fetch.c will grow in a future patch, so refactor the
HTTP walking part into its own function.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
finish_http_pack_request() does multiple tasks, including some
housekeeping on a struct packed_git - (1) closing its index, (2)
removing it from a list, and (3) installing it. These concerns are
independent of fetching a pack through HTTP: they are there only because
(1) the calling code opens the pack's index before deciding to fetch it,
(2) the calling code maintains a list of packfiles that can be fetched,
and (3) the calling code fetches it in order to make use of its objects
in the same process.
In preparation for a subsequent commit, which adds a feature that does
not need any of this housekeeping, remove (1), (2), and (3) from
finish_http_pack_request(). (2) and (3) are now done by a helper
function, and (1) is the responsibility of the caller (in this patch,
done closer to the point where the pack index is opened).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git fetches a pack using dumb HTTP, (among other things) it invokes
index-pack on a ".pack.temp" packfile, specifying the filename as an
argument.
A future commit will require the aforementioned invocation of index-pack
to also generate a "keep" file. To use this, we either have to use
index-pack's naming convention (because --keep requires the pack's
filename to end with ".pack") or to pass the pack through stdin. Of the
two, it is simpler to pass the pack through stdin.
Thus, teach http to pass --stdin to index-pack. As a bonus, the code is
now simpler.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When formatting the commit message for merge commits, Git appends "into
<branch-name>" unless the current branch is the default branch.
Now that we can configure what the default branch name should be, we
will want to respect that setting in that scenario rather than using the
compiled-in default branch name.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning a repository without any branches, Git chooses a default
branch name for the as-yet unborn branch.
Now that we can configure what the default branch name should be, we
will want `git clone` to respect that setting.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To allow for overriding the default branch name, we have introduced a
config setting. With this patch, the `git submodule` command learns
about this, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the default branch name can now be configured, the `testsvn`
remote helper needs to be told about it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When mentioning the default branch name in an error message, we want to
go with the preference specified by the user.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When guessing the default branch name of a remote, and there are no refs
to guess from, we want to go with the preference specified by the user
for the fall-back.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a growing number of projects trying to avoid the non-inclusive
name `master` in their repositories. For existing repositories, this
requires manual work. For new repositories, the only way to do that
automatically is by copying all of Git's template directory, then
hard-coding the desired default branch name into the `.git/HEAD` file,
and then configuring `init.templateDir` to point to those copied
template files.
To make this process much less cumbersome, let's introduce support for
`core.defaultBranchName`. That way, users won't need to keep their
copied template files up to date, and won't interfere with default hooks
installed by their administrators.
While at it, also let users set the default branch name via the
environment variable `GIT_TEST_DEFAULT_BRANCH_NAME`, in preparation for
adjusting Git's test suite to a more inclusive default branch name. As
is common in Git, the `GIT_TEST_*` variable takes precedence over the
config setting.
Note: we use the prefix `core.` instead of `init.` because we want to
adjust also `git clone`, `git fmt-merge-msg` and other commands over the
course of the next commits to respect this setting.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Don Goodman-Wilson <don@goodman-wilson.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add" takes special care to avoid creating a new worktree
at a location already registered to an existing worktree even if that
worktree is missing (which can happen, for instance, if the worktree
resides on removable media). "git worktree move", however, is not so
careful when validating the destination location and will happily move
the source worktree atop the location of a missing worktree. This leads
to the anomalous situation of multiple worktrees being associated with
the same path, which is expressly forbidden by design. For example:
$ git clone foo.git
$ cd foo
$ git worktree add ../bar
$ git worktree add ../baz
$ rm -rf ../bar
$ git worktree move ../baz ../bar
$ git worktree list
.../foo beefd00f [master]
.../bar beefd00f [bar]
.../bar beefd00f [baz]
$ git worktree remove ../bar
fatal: validation failed, cannot remove working tree:
'.../bar' does not point back to '.git/worktrees/bar'
Fix this shortcoming by enhancing "git worktree move" to perform the
same additional validation of the destination directory as done by "git
worktree add".
While at it, add a test to verify that "git worktree move" won't move a
worktree atop an existing (non-worktree) path -- a restriction which has
always been in place but was never tested.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add" checks that the specified path is a valid location
for a new worktree by ensuring that the path does not already exist and
is not already registered to another worktree (a path can be registered
but missing, for instance, if it resides on removable media). Since "git
worktree add" is not the only command which should perform such
validation ("git worktree move" ought to also), generalize the the
validation function for use by other callers, as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree prune" detects when multiple entries are associated with
the same path and prunes the duplicates, however, it does not detect
when a linked worktree points at the path of the main worktree.
Although "git worktree add" disallows creating a new worktree with the
same path as the main worktree, such a case can arise outside the
control of Git even without the user mucking with .git/worktree/<id>/
administrative files. For instance:
$ git clone foo.git
$ git -C foo worktree add ../bar
$ rm -rf bar
$ mv foo bar
$ git -C bar worktree list
.../bar deadfeeb [master]
.../bar deadfeeb [bar]
Help the user recover from such corruption by extending "git worktree
prune" to also detect when a linked worktree is associated with the path
of the main worktree.
Reported-by: Jonathan Müller <jonathanmueller.dev@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A fundamental restriction of linked working trees is that there must
only ever be a single worktree associated with a particular path, thus
"git worktree add" explicitly disallows creation of a new worktree at
the same location as an existing registered worktree. Nevertheless,
users can still "shoot themselves in the foot" by mucking with
administrative files in .git/worktree/<id>/. Worse, "git worktree move"
is careless[1] and allows a worktree to be moved atop a registered but
missing worktree (which can happen, for instance, if the worktree is on
removable media). For instance:
$ git clone foo.git
$ cd foo
$ git worktree add ../bar
$ git worktree add ../baz
$ rm -rf ../bar
$ git worktree move ../baz ../bar
$ git worktree list
.../foo beefd00f [master]
.../bar beefd00f [bar]
.../bar beefd00f [baz]
Help users recover from this form of corruption by teaching "git
worktree prune" to detect when multiple worktrees are associated with
the same path.
[1]: A subsequent commit will fix "git worktree move" validation to be
more strict.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level logic for removing a worktree is well encapsulated in
delete_git_dir(). However, high-level details related to pruning a
worktree -- such as dealing with verbosity and dry-run mode -- are not
encapsulated. Factor out this high-level logic into its own function so
it can be re-used as new worktree corruption detectors are added.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Readers of the name prune_worktree() are likely to expect the function
to actually prune a worktree, however, it only answers the question
"should this worktree be pruned?". Give it a name more reflective of its
true purpose to avoid such confusion.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The autosquash-and-exec test used "git diff HEAD^!" to mean
"git diff HEAD^ HEAD". Use these directly instead of relying
on the undefined but actual-current behavior of "HEAD^!".
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Version appends a hash ID to the file header, making it slightly larger.
This commit also changes "SHA-1" into "object ID" in many places.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The format allows for some ambiguity, as a lone footer also starts
with a valid file header. However, the current JGit code will barf on
this. This commit codifies this behavior into the standard.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shawn Pearce explains:
Some repositories contain a lot of references (e.g. android at 866k,
rails at 31k). The reftable format provides:
- Near constant time lookup for any single reference, even when the
repository is cold and not in process or kernel cache.
- Near constant time verification if a SHA-1 is referred to by at least
one reference (for allow-tip-sha1-in-want).
- Efficient lookup of an entire namespace, such as `refs/tags/`.
- Support atomic push `O(size_of_update)` operations.
- Combine reflog storage with ref storage.
This file format spec was originally written in July, 2017 by Shawn
Pearce. Some refinements since then were made by Shawn and by Han-Wen
Nienhuys based on experiences implementing and experimenting with the
format. (All of this was in the context of our work at Google and
Google is happy to contribute the result to the Git project.)
Imported from JGit[1]'s current version (c217d33ff,
"Documentation/technical/reftable: improve repo layout", 2020-02-04)
of Documentation/technical/reftable.md and converted to asciidoc by
running
pandoc -t asciidoc -f markdown reftable.md >reftable.txt
using pandoc 2.2.1. The result required the following additional
minor changes:
- removed the [TOC] directive to add a table of contents, since
asciidoc does not support it
- replaced git-scm.com/docs links with linkgit: directives that link
to other pages within Git's documentation
[1] https://eclipse.googlesource.com/jgit/jgit
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite support for GIT_CURL_VERBOSE in terms of GIT_TRACE_CURL.
Looking good.
* jt/curl-verbose-on-trace-curl:
http, imap-send: stop using CURLOPT_VERBOSE
t5551: test that GIT_TRACE_CURL redacts password
Code clean-up.
* cc/upload-pack-data:
upload-pack: use upload_pack_data fields in receive_needs()
upload-pack: pass upload_pack_data to create_pack_file()
upload-pack: remove static variable 'stateless_rpc'
upload-pack: pass upload_pack_data to check_non_tip()
upload-pack: pass upload_pack_data to send_ref()
upload-pack: move symref to upload_pack_data
upload-pack: use upload_pack_data writer in receive_needs()
upload-pack: pass upload_pack_data to receive_needs()
upload-pack: pass upload_pack_data to get_common_commits()
upload-pack: use 'struct upload_pack_data' in upload_pack()
upload-pack: move 'struct upload_pack_data' around
upload-pack: move {want,have}_obj to upload_pack_data
upload-pack: remove unused 'wants' from upload_pack_data
The code to parse "git bisect start" command line was lax in
validating the arguments.
* cb/bisect-helper-parser-fix:
bisect--helper: avoid segfault with bad syntax in `start --term-*`
On-the-wire protocol v2 easily falls into a deadlock between the
remote-curl helper and the fetch-pack process when the server side
prematurely throws an error and disconnects. The communication has
been updated to make it more robust.
* dl/remote-curl-deadlock-fix:
stateless-connect: send response end packet
pkt-line: define PACKET_READ_RESPONSE_END
remote-curl: error on incomplete packet
pkt-line: extern packet_length()
transport: extract common fetch_pack() call
remote-curl: remove label indentation
remote-curl: fix typo
Code simplification and test coverage enhancement.
* bc/filter-process:
t2060: add a test for switch with --orphan and --discard-changes
builtin/checkout: simplify metadata initialization
The command line completion script (in contrib/) tried to complete
"git stash -p" as if it were "git stash push -p", but it was too
aggressive and also affected "git stash show -p", which has been
corrected.
* vs/complete-stash-show-p-fix:
completion: don't override given stash subcommand with -p
The check in "git fsck" to ensure that the tree objects are sorted
still had corner cases it missed unsorted entries.
* rs/fsck-duplicate-names-in-trees:
fsck: detect more in-tree d/f conflicts
t1450: demonstrate undetected in-tree d/f conflict
t1450: increase test coverage of in-tree d/f detection
fsck: fix a typo in a comment
"git bugreport" learns to report what shell is in use.
* es/bugreport-shell:
bugreport: include user interactive shell
help: add shell-path to --build-options
As FreeBSD is not the only platform whose regexp library reports
a REG_ILLSEQ error when fed invalid UTF-8, add logic to detect that
automatically and skip the affected tests.
* cb/t4210-illseq-auto-detect:
t4210: detect REG_ILLSEQ dynamically and skip affected tests
t/helper: teach test-regex to report pattern errors (like REG_ILLSEQ)
"git log -L..." now takes advantage of the "which paths are touched
by this commit?" info stored in the commit-graph system.
* ds/line-log-on-bloom:
line-log: integrate with changed-path Bloom filters
line-log: try to use generation number-based topo-ordering
line-log: more responsive, incremental 'git log -L'
t4211-line-log: add tests for parent oids
line-log: remove unused fields from 'struct line_log_data'
While the MyFirstContribution guide exists and has received some use and
positive reviews, it is still not as discoverable as it could be. Add a
reference to it from the GitHub pull request template, where many
brand-new contributors may look. Also add a reference to it in
SubmittingPatches, which is the central source of guidance for patch
contribution.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For each worktree removed by "git worktree prune", it reports the reason
for the removal. All reasons share the common prefix "Removing
worktrees/%s:". As new removal reasons are added, this prefix needs to
be duplicated, which is error-prone and potentially cumbersome.
Therefore, factor out the common prefix.
Although this change seems to increase the "sentence lego quotient", it
should be reasonably safe, as the reason for removal is a distinct
clause, not strictly related to the prefix. Moreover, the "worktrees" in
"Removing worktrees/%s:" is a path literal which ought not be localized,
so by factoring it out, we can more easily avoid exposing that path
fragment to translators.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unify the 'chunk_ids' and 'chunk_sizes' arrays into an array of
'struct chunk_info'. This will allow more cleanups in the following
patches.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In write_commit_graph_file() one block of code fills the array of
chunk IDs, another block of code fills the array of chunk offsets,
then the chunk IDs and offsets are written to the Chunk Lookup table,
and finally a third block of code writes the actual chunks. In case
of optional chunks like Extra Edge List and Base Graphs List there is
also a condition checking whether that chunk is necessary/desired, and
that same condition is repeated in all those three blocks of code.
This patch series is about to add more optional chunks, so there would
be even more repeated conditions.
Those chunk offsets are relative to the beginning of the file, so they
inherently depend on the size of the Chunk Lookup table, which in turn
depends on the number of chunks that are to be written to the
commit-graph file. IOW at the time we set the first chunk's ID we
can't yet know its offset, because we don't yet know how many chunks
there are.
Simplify this by initially filling an array of chunk sizes, not
offsets, and calculate the offsets based on the chunk sizes only
later, while we are writing the Chunk Lookup table. This way we can
fill the arrays of chunk IDs and sizes in one go, eliminating one set
of repeated conditions.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Chunk Lookup table stores the chunks' starting offset in the
commit-graph file, not their sizes. Consequently, the size of a chunk
can only be calculated by subtracting its offset from the offset of
the subsequent chunk (or that of the terminating label). This is
currenly implemented in a bit complicated way: as we iterate over the
entries of the Chunk Lookup table, we check the id of each chunk and
store its starting offset, then we check the id of the last seen chunk
and calculate its size using its previously saved offset. At the
moment there is only one chunk for which we calculate its size, but
this patch series will add more, and the repeated chunk id checks are
not that pretty.
Instead let's read ahead the offset of the next chunk on each
iteration, so we can calculate the size of each chunk right away,
right where we store its starting offset.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we iterate over all entries of the Chunk Lookup table we make
sure that we don't attempt to read past the end of the mmap-ed
commit-graph file, and check in each iteration that the chunk ID and
offset we are about to read is still within the mmap-ed memory region.
However, these checks in each iteration are not really necessary,
because the number of chunks in the commit-graph file is already known
before this loop from the just parsed commit-graph header.
So let's check that the commit-graph file is large enough for all
entries in the Chunk Lookup table before we start iterating over those
entries, and drop those per-iteration checks. While at it, take into
account the size of everything that is necessary to have a valid
commit-graph file, i.e. the size of the header, the size of the
mandatory OID Fanout chunk, and the size of the signature in the
trailer as well.
Note that this necessitates the change of the error message as well,
and, consequently, have to update the 'detect incorrect chunk count'
test in 't5318-commit-graph.sh' as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our CodingGuidelines says that it's sufficient to include one of
'git-compat-util.h' and 'cache.h', but both 'commit-graph.c' and
'commit-graph.h' include both. Let's include only 'git-compat-util.h'
to loose a bunch of unnecessary dependencies; but include 'hash.h',
because 'commit-graph.h' does require the definition of 'struct
object_id'.
'commit-graph.h' explicitly includes 'repository.h' and
'string-list.h', but only needs the declaration of a few structs from
them. Drop these includes and forward-declare the necessary structs
instead.
'commit-graph.c' includes 'dir.h', but doesn't actually use anything
from there, so let's drop that #include as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ll_diff_tree_oid() has only ever returned 0 [1], so it's return value
is basically useless. It's only caller diff_tree_oid() has only ever
returned the return value of ll_diff_tree_oid() as-is [2], so its
return value is just as useless. Most of diff_tree_oid()'s callers
simply ignore its return value, except:
- diff_root_tree_oid() is a thin wrapper around diff_tree_oid() and
returns with its return value, but all of diff_root_tree_oid()'s
callers ignore its return value.
- rev_compare_tree() and rev_same_tree_as_empty() do look at the
return value in a condition, but, since the return value is always
0, the former's < 0 condition is never fulfilled, while the
latter's >= 0 condition is always fulfilled.
So let's drop the return value of ll_diff_tree_oid(), diff_tree_oid()
and diff_root_tree_oid(), and drop those conditions from
rev_compare_tree() and rev_same_tree_as_empty() as well.
[1] ll_diff_tree_oid() and its ancestors have been returning only 0
ever since it was introduced as diff_tree() in 9174026cfe (Add
"diff-tree" program to show which files have changed between two
trees., 2005-04-09).
[2] diff_tree_oid() traces back to diff-tree.c:main() in 9174026cfe as
well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
clear_##slabname() frees only the memory allocated for a commit slab
itself, but entries in the commit slab might own additional memory
outside the slab that should be freed as well. We already have (at
least) one such commit slab, and this patch series is about to add one
more.
To free all additional memory owned by entries on the commit slab the
user of such a slab could iterate over all commits it knows about,
peek whether there is a valid entry associated with each commit, and
free the additional memory, if any. Or it could rely on intimate
knowledge about the internals of the commit slab implementation, and
could itself iterate directly through all entries in the slab, and
free the additional memory. Or it could just leak the additional
memory...
Introduce deep_clear_##slabname() to allow releasing memory owned by
commit slab entries by invoking the 'void free_fn(elemtype *ptr)'
function specified as parameter for each entry in the slab.
Use it in get_shallow_commits() in 'shallow.c' to replace an
open-coded iteration over a commit slab's entries.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph format specifies that "All 4-byte numbers are in
network order", but the commit-graph contains 8-byte integers as well
(file offsets in the Chunk Lookup table), and their byte order is
unspecified.
Clarify that all multi-byte integers are in network byte order.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph file format specifies that the chunks may be in any
order. However, if the OID Lookup chunk happens to be the last one in
the file, then any command attempting to access the commit-graph data
will fail with:
fatal: invalid commit position. commit-graph is likely corrupt
In this case the error is wrong, the commit-graph file does conform to
the specification, but the parsing of the Chunk Lookup table is a bit
buggy, and leaves the field holding the number of commits in the
commit-graph zero-initialized.
The number of commits in the commit-graph is determined while parsing
the Chunk Lookup table, by dividing the size of the OID Lookup chunk
with the hash size. However, the Chunk Lookup table doesn't actually
store the size of the chunks, but it stores their starting offset.
Consequently, the size of a chunk can only be calculated by
subtracting the starting offsets of that chunk from the offset of the
subsequent chunk, or in case of the last chunk from the offset
recorded in the terminating label. This is currenly implemented in a
bit complicated way: as we iterate over the entries of the Chunk
Lookup table, we check the ID of each chunk and store its starting
offset, then we check the ID of the last seen chunk and calculate its
size using its previously saved offset if necessary (at the moment
it's only necessary for the OID Lookup chunk). Alas, while parsing
the Chunk Lookup table we only interate through the "real" chunks, but
never look at the terminating label, thus don't even check whether
it's necessary to calulate the size of the last chunk. Consequently,
if the OID Lookup chunk is the last one, then we don't calculate its
size and turn don't run the piece of code determining the number of
commits in the commit graph, leaving the field holding that number
unchanged (i.e. zero-initialized), eventually triggering the sanity
check in load_oid_from_graph().
Fix this by iterating through all entries in the Chunk Lookup table,
including the terminating label.
Note that this is the minimal fix, suitable for the maintenance track.
A better fix would be to simplify how the chunk sizes are calculated,
but that is a more invasive change, less suitable for 'maint', so that
will be done in later patches.
This additional flexibility of scanning more chunks breaks a test for
"git commit-graph verify" so alter that test to mutate the commit-graph
to have an even lower chunk count.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Submodules should be handled the same as regular directories with
respect to the presence of a trailing slash, i.e. commands like:
git diff rev1 rev2 -- $path
git rev-list HEAD -- $path
should produce the same output whether $path is 'submod' or 'submod/'.
This has been fixed in commit 74b4f7f277 (tree-walk.c: ignore trailing
slash on submodule in tree_entry_interesting(), 2014-01-23).
Unfortunately, that commit had the unintended side effect to handle
'submod/anything' the same as 'submod' and 'submod/' as well, e.g.:
$ git log --oneline --name-only -- sha1collisiondetection/whatever
4125f78222 sha1dc: update from upstream
sha1collisiondetection
07a20f569b Makefile: fix unaligned loads in sha1dc with UBSan
sha1collisiondetection
23e37f8e9d sha1dc: update from upstream
sha1collisiondetection
86cfd61e6b sha1dc: optionally use sha1collisiondetection as a submodule
sha1collisiondetection
Fix this by rejecting submodules as partial pathnames when their
trailing slash is followed by anything.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 0b4396f068 (git-p4: make python2.7 the oldest supported version,
2019-12-13), git-p4 was updated to only support 2.7 and newer. Since
Python 2.6 is pretty much ancient history, update CodingGuidelines to
show that 2.7 is the oldest version supported.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 48a8c26c62 (Documentation: avoid poor-man's small caps GIT,
2013-01-21), the documentation was amended to spell Git's name as Git
when talking about the system as a whole. However, t/README was skipped
over when the treatment was applied.
Bring t/README into conformance with the CodingGuidelines by casing
"Git" properly.
While we're at it, fix a small typo. Change "the git internal" to "the
Git internals".
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the provided free_commit_graph() to properly free the commit graph
in fuzz-commit-graph. Otherwise, the fuzzer itself leaks memory when the
struct contains pointers to allocated memory.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In trace output (when GIT_TRACE_CURL is true), redact the values of all
HTTP cookies by default. Now that auth headers (since the implementation
of GIT_TRACE_CURL in 74c682d3c6 ("http.c: implement the GIT_TRACE_CURL
environment variable", 2016-05-24)) and cookie values (since this
commit) are redacted by default in these traces, also allow the user to
inhibit these redactions through an environment variable.
Since values of all cookies are now redacted by default,
GIT_REDACT_COOKIES (which previously allowed users to select individual
cookies to redact) now has no effect.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
do_match_pathspec() started life as match_pathspec_depth_1() and for
correctness was only supposed to be called from match_pathspec_depth().
match_pathspec_depth() was later renamed to match_pathspec(), so the
invariant we expect today is that do_match_pathspec() has no direct
callers outside of match_pathspec().
Unfortunately, this intention was lost with the renames of the two
functions, and additional calls to do_match_pathspec() were added in
commits 75a6315f74 ("ls-files: add pathspec matching for submodules",
2016-10-07) and 89a1f4aaf7 ("dir: if our pathspec might match files
under a dir, recurse into it", 2019-09-17). Of course,
do_match_pathspec() had an important advantge over match_pathspec() --
match_pathspec() would hardcode flags to one of two values, and these
new callers needed to pass some other value for flags. Also, although
calling do_match_pathspec() directly was incorrect, there likely wasn't
any difference in the observable end output, because the bug just meant
that fill_diretory() would recurse into unneeded directories. Since
subsequent does-this-path-match checks on individual paths under the
directory would cause those extra paths to be filtered out, the only
difference from using the wrong function was unnecessary computation.
The second of those bad calls to do_match_pathspec() was involved -- via
either direct movement or via copying+editing -- into a number of later
refactors. See commits 777b420347 ("dir: synchronize
treat_leading_path() and read_directory_recursive()", 2019-12-19),
8d92fb2927 ("dir: replace exponential algorithm with a linear one",
2020-04-01), and 95c11ecc73 ("Fix error-prone fill_directory() API; make
it only return matches", 2020-04-01). The last of those introduced the
usage of do_match_pathspec() on an individual file, and thus resulted in
individual paths being returned that shouldn't be.
The problem with calling do_match_pathspec() instead of match_pathspec()
is that any negated patterns such as ':!unwanted_path` will be ignored.
Add a new match_pathspec_with_flags() function to fulfill the needs of
specifying special flags while still correctly checking negated
patterns, add a big comment above do_match_pathspec() to prevent others
from misusing it, and correct current callers of do_match_pathspec() to
instead use either match_pathspec() or match_pathspec_with_flags().
One final note is that DO_MATCH_LEADING_PATHSPEC needs special
consideration when working with DO_MATCH_EXCLUDE. The point of
DO_MATCH_LEADING_PATHSPEC is that if we have a pathspec like
*/Makefile
and we are checking a directory path like
src/module/component
that we want to consider it a match so that we recurse into the
directory because it _might_ have a file named Makefile somewhere below.
However, when we are using an exclusion pattern, i.e. we have a pathspec
like
:(exclude)*/Makefile
we do NOT want to say that a directory path like
src/module/component
is a (negative) match. While there *might* be a file named 'Makefile'
somewhere below that directory, there could also be other files and we
cannot pre-emptively rule all the files under that directory out; we
need to recurse and then check individual files. Adjust the
DO_MATCH_LEADING_PATHSPEC logic to only get activated for positive
pathspecs.
Reported-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, extensions were recognized regardless of repository format
version. If the user sets an undefined "extensions" value on a
repository of version 0 and that value is used by a future git version,
they might get an undesired result.
Because all extensions now also upgrade repository versions, tightening
the check would help avoid this for future extensions.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'extensions' configuration variable gets special meaning in the new
repository version, so when enabling the extension we should upgrade the
repository to version 1.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Retroactively adding a filter can be useful for existing shallow clones as
they allow users to see earlier change histories without downloading all
git objects in a regular --unshallow fetch.
Without this patch, users can make a clone partial by editing the
repository configuration to convert the remote into a promisor, like:
git config core.repositoryFormatVersion 1
git config extensions.partialClone origin
git fetch --unshallow --filter=blob:none origin
Since the hard part of making this work is already in place and such
edits can be error-prone, teach Git to perform the required configuration
change automatically instead.
Note that this change does not modify the existing git behavior which
recognizes setting extensions.partialClone without changing
repositoryFormatVersion.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In version 1 of repository format, "extensions" gained special meaning
and it is safer to avoid upgrading when there are pre-existing
extensions.
Make list-objects-filter to use the helper function instead of setting
repository version directly as a prerequisite of exposing the upgrade
capability.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sparse-checkout's purpose is to update the working tree to have it
reflect a subset of the tracked files. As such, it shouldn't be
switching branches, making commits, downloading or uploading data, or
staging or unstaging changes. Other than updating the worktree, the
only thing sparse-checkout should touch is the SKIP_WORKTREE bit of the
index. In particular, this sets up a nice invariant: running
sparse-checkout will never change the status of any file in `git status`
(reflecting the fact that we only set the SKIP_WORKTREE bit if the file
is safe to delete, i.e. if the file is unmodified).
Traditionally, we did a _really_ bad job with this goal. The
predecessor to sparse-checkout involved manual editing of
.git/info/sparse-checkout and running `git read-tree -mu HEAD`. That
command would stage and unstage changes and overwrite dirty changes in
the working tree.
The initial implementation of the sparse-checkout command was no better;
it simply invoked `git read-tree -mu HEAD` as a subprocess and had the
same caveats, though this issue came up repeatedly in review comments
and workarounds for the problems were put in place before the feature
was merged[1, 2, 3, 4, 5, 6; especially see 4 & 6].
[1] https://lore.kernel.org/git/CABPp-BFT9A5n=_bx5LsjCvbogqwSjiwgr5amcjgbU1iAk4KLJg@mail.gmail.com/
[2] https://lore.kernel.org/git/CABPp-BEmwSwg4tgJg6nVG8a3Hpn_g-=ZjApZF4EiJO+qVgu4uw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFV7TA0qwZCQpHCqx9N+JifyRyuBQ-pZ_oGfe-NOgyh7A@mail.gmail.com/
[4] https://lore.kernel.org/git/CABPp-BHYCCD+Vx5fq35jH82eHc1-P53Lz_aGNpHJNcx9kg2K-A@mail.gmail.com/
[5] https://lore.kernel.org/git/CABPp-BF+JWYZfDqp2Tn4AEKVp4b0YMA=Mbz4Nz62D-gGgiduYQ@mail.gmail.com/
[6] https://lore.kernel.org/git/20191121163706.GV23183@szeder.dev/
However, these workarounds, in addition to disabling the feature in a
number of important cases, also missed one special case. I'll get back
to it later.
In the 2.27.0 cycle, the disabling of the feature was lifted by finally
replacing the internal equivalent of `git read-tree -mu HEAD` with
something that did what we wanted: the new update_sparsity() function in
unpack-trees.c that only ever updates SKIP_WORKTREE bits in the index
and updates the working tree to match. This new function handles all
the cases that were problematic for the old implementation, except that
it breaks the same special case that avoided the workarounds of the old
implementation, but broke it in a different way.
So...that brings us to the special case: a git clone performed with
--no-checkout. As per the meaning of the flag, --no-checkout does not
check out any branch, with the implication that you aren't on one and
need to switch to one after the clone. Implementationally, HEAD is
still set (so in some sense you are partially on a branch), but
* the index is "unborn" (non-existent)
* there are no files in the working tree (other than .git/)
* the next time git switch (or git checkout) is run it will run
unpack_trees with `initial_checkout` flag set to true.
It is not until you run, e.g. `git switch <somebranch>` that the index
will be written and files in the working tree populated.
With this special --no-checkout case, the traditional `read-tree -mu
HEAD` behavior would have done the equivalent of acting like checkout --
switch to the default branch (HEAD), write out an index that matches
HEAD, and update the working tree to match. This special case slipped
through the avoid-making-changes checks in the original sparse-checkout
command and thus continued there.
After update_sparsity() was introduced and used (see commit f56f31af03
("sparse-checkout: use new update_sparsity() function", 2020-03-27)),
the behavior for the --no-checkout case changed: Due to git's
auto-vivification of an empty in-memory index (see do_read_index() and
note that `must_exist` is false), and due to sparse-checkout's
update_working_directory() code to always write out the index after it
was done, we got a new bug. That made it so that sparse-checkout would
switch the repository from a clone with an "unborn" index (i.e. still
needing an initial_checkout), to one that had a recorded index with no
entries. Thus, instead of all the files appearing deleted in `git
status` being known to git as a special artifact of not yet being on a
branch, our recording of an empty index made it suddenly look to git as
though it was definitely on a branch with ALL files staged for deletion!
A subsequent checkout or switch then had to contend with the fact that
it wasn't on an initial_checkout but had a bunch of staged deletions.
Make sure that sparse-checkout changes nothing in the index other than
the SKIP_WORKTREE bit; in particular, when the index is unborn we do not
have any branch checked out so there is no sparsification or
de-sparsification work to do. Simply return from
update_working_directory() early.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 897d68e7af (Makefile: use curl-config --cflags, 2020-03-26), we
taught the build process to use `curl-config --cflags` to make sure that
it can find cURL's headers.
In the MSVC build, this is completely bogus because we're running in a
Git for Windows SDK whose `curl-config` supports the _GCC_ build.
Let's just ignore each and every `-I<path>` option where `<path>` points
to GCC/Clang specific headers.
Reported by Jeff Hostetler in
https://github.com/microsoft/git/issues/275.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even if we strongly discourage putting credentials into the URLs passed
via the command-line, there _is_ support for that, and users _do_ do
that.
Let's scrub them before writing them to the reflog.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'pack_objects_hook' static
variable into this struct.
It is used by code common to protocol v0 and protocol v2.
While at it let's also free() it in upload_pack_data_clear().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_sideband_all' static
variable into this struct.
It is used only by protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_ref_in_want' static
variable into this struct.
It is used only by protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_filter' static variable
into this struct.
It is used by both protocol v0 and protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'keepalive' static variable
into this struct.
It is used by code common to protocol v0 and protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to upload_pack_config(),
so that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
that are set in upload_pack_config() into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's take this opportunity to change the
'multi_ack' variable, which is now part of 'upload_pack_data',
to an enum.
This will make it clear which values this variable can take.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the multi_ack static variable into
this struct.
It is only used by protocol v0 code since protocol v2 assumes
certain baseline capabilities, but rolling it into
upload_pack_data and just letting v2 code ignore it as it does
now is more coherent and cleaner.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the filter_capability_requested
static variable into this struct.
It is only used by protocol v0 code since protocol v2 assumes
certain baseline capabilities, but rolling it into
upload_pack_data and just letting v2 code ignore it as it does
now is more coherent and cleaner.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'use_sideband' static variable
into this struct.
This variable is used by both v0 and v2 protocols.
While at it, let's update the comment near the variable
definition.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'no_done', 'daemon_mode' and
'timeout' variables into this struct.
They are only used by protocol v0 code since protocol v2 assumes
certain baseline capabilities, but rolling them into
upload_pack_data and just letting v2 code ignore them as it does
now is more coherent and cleaner.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's annotate fields from this struct to let
people know which ones are used only for protocol v0 and which
ones only for protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's actually start using some bitfields of
that struct. These bitfields were introduced in 3145ea957d
("upload-pack: introduce fetch server command", 2018-03-15), but
were never used.
We could instead have just removed the following bitfields
from the struct:
unsigned use_thin_pack : 1;
unsigned use_ofs_delta : 1;
unsigned no_progress : 1;
unsigned use_include_tag : 1;
but using them makes it possible to remove a number of static
variables with the same name and purpose from 'upload-pack.c'.
This is a behavior change, as we accidentally used to let values
in those bitfields propagate from one v2 "fetch" command to
another for ssh/git/file connections (but not for http). That's
fixing a bug, but one nobody is likely to see, because it would
imply the client sending different capabilities for each request.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following lines were not covered in a recent line-coverage test
against Git:
builtin/commit-graph.c
5b6653e5 244) progress = start_delayed_progress(
5b6653e5 268) stop_progress(&progress);
These statements are executed when both '--stdin-commits' and
'--progress' are passed. Introduce a trio of tests that exercise various
combinations of these options to ensure that these lines are covered.
More importantly, this is exercising a (somewhat) previously-ignored
feature of '--stdin-commits', which is that it respects '--progress'.
Prior to 5b6653e523 (builtin/commit-graph.c: dereference tags in
builtin, 2020-05-13), dereferencing input from '--stdin-commits' was
done inside of commit-graph.c.
Now that an additional progress meter may be generated from outside of
commit-graph.c, add a corresponding test to make sure that it also
respects '--[no]-progress'.
The other location that generates progress meter output (from d335ce8f24
(commit-graph.c: show progress of finding reachable commits,
2020-05-13)) is already covered by any test that passes '--reachable'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A handful of tests in t5318 use 'test_line_count = 0 ...' to make sure
that some command does not write any output. While correct, it is more
idiomatic to use 'test_must_be_empty' instead. Switch the former
invocations to use the latter instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some repositories in the wild have commits that record nonsense
committer timezone (e.g. rails.git); "git fast-import" learned an
option to pass these nonsense timestamps intact to allow recreating
existing repositories as-is.
* en/fast-import-looser-date:
fast-import: add new --date-format=raw-permissive format
The error message from "git checkout -b foo -t bar baz" was
confusing.
* rs/checkout-b-track-error:
checkout: improve error messages for -b with extra argument
checkout: add tests for -b and --track
We've adopted a convention that any on-stack structure can be
initialized to have zero values in all fields with "= { 0 }", even
when the first field happens to be a pointer, but sparse complained
that a null pointer should be spelled NULL for a long time. Start
using -Wno-universal-initializer option to squelch it.
* lo/sparse-universal-zero-init:
sparse: allow '{ 0 }' to be used without warnings
"feature.experimental" configuration variable is to let volunteers
easily opt into a set of newer features, which use of the v2
transport protocol is now a part of.
* jn/experimental-opts-into-proto-v2:
config: let feature.experimental imply protocol.version=2
The "--prepare-p4-only" option is supposed to stop after replaying
one changeset, but kept going (by mistake?)
* bk/p4-prepare-p4-only-fix:
git-p4.py: fix --prepare-p4-only error with multiple commits
When performing a tree-level diff against the working tree, we may find
that our index stat information is dirty, so we queue a filepair to be
examined later. If the actual content hasn't changed, we call this a
stat-unmatch; the stat information was out of date, but there's no
actual diff. Normally diffcore_std() would detect and remove these
identical filepairs via diffcore_skip_stat_unmatch(). However, when
"--quiet" is used, we want to stop the diff as soon as we see any
changes, so we check for stat-unmatches immediately in diff_change().
That check may require us to actually load the file contents into the
pair of diff_filespecs. If we find that the pair isn't a stat-unmatch,
then no big deal; we'd likely load the contents later anyway to generate
a patch, do rename detection, etc, so we want to hold on to it. But if
it is a stat-unmatch, then we have no more use for that data; the whole
point is that we're going discard the pair. However, we never free the
allocated diff_filespec data.
In most cases, keeping that data isn't a problem. We don't expect a lot
of stat-unmatch entries, and since we're using --quiet, we'd quit as
soon as we saw such a real change anyway. However, there are extreme
cases where it makes a big difference:
1. We'd generally mmap() the working tree half of the pair. And since
the OS may limit the total number of maps, we can run afoul of this
in large repositories. E.g.:
$ cd linux
$ git ls-files | wc -l
67959
$ sysctl vm.max_map_count
vm.max_map_count = 65530
$ git ls-files | xargs touch ;# everything is stat-dirty!
$ git diff --quiet
fatal: mmap failed: Cannot allocate memory
It should be unusual to have so many files stat-dirty, but it's
possible if you've just run a script like "sed -i" or similar.
After this patch, the above correctly exits with code 0.
2. Even if you don't hit mmap limits, the index half of the pair will
have been pulled from the object database into heap memory. Again
in a clone of linux.git, running:
$ git ls-files | head -n 10000 | xargs touch
$ git diff --quiet
peaks at 145MB heap before this patch, and 94MB after.
This patch solves the problem by freeing any diff_filespec data we
picked up during the "--quiet" stat-unmatch check in diff_changes.
Nobody is going to need that data later, so there's no point holding on
to it. There are a few things to note:
- we could skip queueing the pair entirely, which could in theory save
a little work. But there's not much to save, as we need a
diff_filepair to feed to diff_filespec_check_stat_unmatch() anyway.
And since we cache the result of the stat-unmatch checks, a later
call to diffcore_skip_stat_unmatch() call will quickly skip over
them. The diffcore code also counts up the number of stat-unmatched
pairs as it removes them. It's doubtful any callers would care about
that in combination with --quiet, but we'd have to reimplement the
logic here to be on the safe side. So it's not really worth the
trouble.
- I didn't write a test, because we always produce the correct output
unless we run up against system mmap limits, which are both
unportable and expensive to test against. Measuring peak heap
would be interesting, but our perf suite isn't yet capable of that.
- note that diff without "--quiet" does not suffer from the same
problem. In diffcore_skip_stat_unmatch(), we detect the stat-unmatch
entries and drop them immediately, so we're not carrying their data
around.
- you _can_ still trigger the mmap limit problem if you truly have
that many files with actual changes. But it's rather unlikely. The
stat-unmatch check avoids loading the file contents if the sizes
don't match, so you'd need a pretty trivial change in every single
file. Likewise, inexact rename detection might load the data for
many files all at once. But you'd need not just 64k changes, but
that many deletions and additions. The most likely candidate is
perhaps break-detection, which would load the data for all pairs and
keep it around for the content-level diff. But again, you'd need 64k
actually changed files in the first place.
So it's still possible to trigger this case, but it seems like "I
accidentally made all my files stat-dirty" is the most likely case
in the real world.
Reported-by: Jan Christoph Uhde <Jan@UhdeJc.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are multiple repositories in the wild with random, invalid
timezones. Most notably is a commit from rails.git with a timezone of
"+051800"[1]. A few searches will find other repos with that same
invalid timezone as well. Further, Peff reports that GitHub relaxed
their fsck checks in August 2011 to accept any timezone value[2], and
there have been multiple reports to filter-repo about fast-import
crashing while trying to import their existing repositories since they
had timezone values such as "-7349423" and "-43455309"[3].
The existing check on timezone values inside fast-import may prove
useful for people who are crafting fast-import input by hand or with a
new script. For them, the check may help them avoid accidentally
recording invalid dates. (Note that this check is rather simplistic and
there are still several forms of invalid dates that fast-import does not
check for: dates in the future, timezone values with minutes that are
not divisible by 15, and timezone values with minutes that are 60 or
greater.) While this simple check may have some value for those users,
other users or tools will want to import existing repositories as-is.
Provide a --date-format=raw-permissive format that will not error out on
these otherwise invalid timezones so that such existing repositories can
be imported.
[1] 4cf94979c9
[2] https://lore.kernel.org/git/20200521195513.GA1542632@coredump.intra.peff.net/
[3] https://github.com/newren/git-filter-repo/issues/88
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of github.com:ruester/git-po-de:
l10n: de.po: Fix typo in the German translation of octopus
l10n: de.po: Update German translation for Git 2.27.0
f1e3df3169 (t: increase test coverage of signature verification output,
2020-03-04) adds GPG dependent tests to t4202 and t6200 that were found
problematic with at least OpenBSD 6.7.
Using an escaped '|' for alternations works only in some implementations
of grep (e.g. GNU and busybox).
It is not part of POSIX[1] and not supported by some BSD, macOS, and
possibly other POSIX compatible implementations.
Use `grep -E`, and write it using extended regular expression.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03
Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --orphan option is used to create a local branch which is detached
from the current history. In git switch, it always resets to the empty
tree, and thus the only completion we can provide is a branch name.
Follow the same rules for -c/-C (and -b/-B) when completing the argument
to --orphan.
In the case of git switch, after we complete the argument, there is
nothing more we can complete for git switch, so do not even try. Nothing
else would be valid.
In the case of git checkout, --orphan takes a start point which it uses
to determine the checked out tree, even though it created orphaned
history.
Update the previously added test cases as they are now passing.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous commit added several test cases highlighting the subpar
completion logic for -c/-C and -b/-B when completing git switch and git
checkout.
In order to distinguish completing the argument vs the start-point for
this option, we now use the wordlist to determine the previous full word
on the command line.
If it's -c or -C (-b/-B for checkout), then we know that we are
completing the argument for the branch name.
Given that a user who already knows the branch name they want to
complete will simply not use completion, it makes sense to complete the
small subset of local branches when completing the argument for -c/-C.
In all other cases, if -c/-C are on the command line but are not the
most recent option, then we must be completing a start-point, and should
allow completing against all references.
Update the -c/-C and -b/-B tests to indicate they now pass.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Current completion for the --track option of git switch and git checkout
is sub par. In addition to the DWIM logic of a bare branch name, --track
has DWIM logic to convert specified remote/branch names into a local
branch tracking that remote. For example
$git switch --track origin/master
This will create a local branch name master, that tracks the master
branch of the origin remote.
In fact, git switch --track on its own will not accept other forms of
references. These must instead be specified manually via the -c/-C/-b/-B
options.
Introduce __git_remote_heads() and the "remote-heads" mode for
__git_complete_refs. Use this when the --track option is provided while
completing in _git_switch and _git_checkout. Just as in the --detach
case, we never enable DWIM mode for --track, because it doesn't make
sense.
It should be noted that completion support is still a bit sub par when
it comes to handling -c/-C and --orphan. This will be resolved in
a future change.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like git switch, we should not complete DWIM remote branch names
if --detach has been specified. To avoid this, refactor _git_checkout in
a similar way to _git_switch.
Note that we don't simply clear dwim_opt when we find -d or --detach, as
we will be adding other modes and checks, making this flow easier to
follow.
Update the previously failing tests to show that the breakage has been
resolved.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new --mode option to __git_complete_refs, which allows changing
the behavior to call __git_heads instead of __git_refs.
By passing --mode=heads, __git_complete_refs will only output local
branches. This enables using "--mode=heads --dwim" to enable listing
local branches and the remote unique branch names for DWIM.
Refactor completion support to use the new mode option, rather than
calling __git_heads directly. This has the advantage that we can now
correctly allow local branches along with suitable DWIM refs, rather
than only allowing DWIM when we complete all references.
Choose what mode it uses when calling __git_complete_refs. If -d or
--detach have been provided, then simply complete all refs, but
*without* the DWIM option as these DWIM names won't work properly in
--detach mode.
Otherwise, call __git_complete_refs with the default dwim_opt value and
use the new "heads" mode.
In this way, the basic support for completing just "git switch <TAB>"
will result in only local branches and remote unique names for DWIM.
The basic no-options tests for git switch, as well as several of the
-c/-C tests now pass, so remove the known breakage tags.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new helper, __git_find_last_on_cmdline is introduced, similar to the
already existing __git_find_on_cmdline, but which operates in reverse,
finding the *last* matching word of the provided wordlist.
Use this in a new __git_checkout_default_dwim_mode() function that will
determine when to enable listing of DWIM remote branches.
The __git_find_last_on_cmdline() function is used to determine which
--guess or --no-guess is in effect. If either one is provided, then we
unconditionally enable or disable the DWIM mode based on the last
provided option.
If neither --guess nor --no-guess is provided, then we check for
--no-track, and finally for GIT_COMPLETION_CHECKOUT_NO_GUESS=1.
This function is then used in _git_switch and _git_checkout to improve
the handling for when we enable listing of these DWIM remote branches.
This new logic is more robust, as we will correctly identify superseded
options, and ensure that both _git_switch and _git_checkout enable DWIM
in similar ways.
We can now update a few tests to indicate they pass. A few of the tests
previously added to highlight issues with the old DWIM logic still fail.
This is because of a separate issue related to the default completion
behavior of git switch, which will be addressed in a future change.
Additionally, due to this change, a few tests for the -b/-B handling of
git checkout now fail. This is a minor regression, and will be fixed by
a following change that improves the overall handling of -b/-B. Mark
these tests as known breakages for now.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
__git_complete_refs is the main function used for completing references.
It is primarily used as a wrapper around __git_refs, and is easier to
extend since its arguments are option-like.
One major downside of __git_complete_refs and __git_refs currently, is
the lack of ability to complete only a subset of refs such as branches
(refs/heads) or tags (refs/tags).
Normally, a caller might just decide to use __git_heads() or
__git_tags(). However, in the case of git-switch, it is useful to
complete both branches *and* DWIM remote branch names.
Due to the complexity and implementation of __git_refs, it is not easy
to extend it to support listing only a subset of references.
Instead, we can extend __git_complete_refs to do this. For this to be
done, we must first ensure that "--dwim" support is not tied to calling
__git_refs.
Instead of passing $dwim into __git_refs, we can implement
a __gitcomp_direct_append function which can append to COMPREPLY after
a call to __gitcomp_direct.
If --dwim is passed to __git_complete_refs, use __gitcomp_direct_append
to add the output of __git_dwim_remote_heads to the completion list.
In this way, --dwim support is now independent of calling __git_refs.
A future change will add an additional option to control what set of
references __git_complete_refs will output.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
__git_refs() has the ability to report unique remote names for
supporting completion of remote branch names for the DWIMery of git
checkout and git switch.
For git checkout, this is fine, because it always supports completing
all local references.
However, git switch by default only supports either switching branches
or using this DWIMery to create a local branch tracking the remote
branch.
Future work to cleanup and improve completion support for git switch
will be aided if the remote branch names can be completed separately
from __git_refs.
Extract this logic to a function __git_dwim_remote_heads(), and use it
in __git_refs.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The __git_complete_refs uses the "--track" option to specify when to
enable listing of unique remote branches which are used by the DWIM
logic of git checkout and git switch.
Using the term '--track' here is confusing because the git commands
themselves have '--track' as an argument. Additionally, the completion
logic for _git_switch also checks for --track. Keeping the meaning of
track_opt and --track for __git_complete_refs straight from the --track
git switch and git checkout option is difficult when reading this code.
Use the option '--dwim' instead, indicating this is about enabling or
disabling logic related to DWIM mode. Also rename the local variable
track_opt to dwim_opt to further reduce the confusion when reading the
completion code for _git_switch.
Because it is plausible for users to have developed their own
completions which rely on __git_complete_ref, keep --track as a synonym
for --dwim, even though we no longer use it in any of the core git
completion logic. Add a comment explaining why it remains as an
alternative spelling for --dwim.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to -c/-C, --orphan takes an argument which is the branch name to
use. We ought to complete this branch name using similar rules as to how
we complete new branch names for -c/-C and -b/-B. Namely, limit the
total number of options provided by completing to the local branches.
Additionally, git switch --orphan does not take any start point and will
always create using the empty-tree. Thus, after the branch name is
completed, git switch --orphan should not complete any references.
Add test cases showing the expected behavior of --orphan, for both the
argument and starting point.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the branch creation argument for git switch or git checkout
(-c/-C or -b/-B), the commands switch to a different mode: `git switch
-c <branch> <some-referance>` means to create a branch named <branch> at
the commit referred to by <some-reference>.
When completing git switch or git checkout, it makes sense to complete
the branch name differently from the start point.
When completing a branch, one might consider that we do not have
anything worth completing. After all, a new branch must have an entirely
new name. Consider, however, that if a user names branches using some
similar scheme, they might wish to name a new branch by modifying the
name of an existing branch.
To avoid overloading completion for the argument, it seems reasonable to
complete only the local branch names and the valid "Do What I Mean"
remote branch names.
Add tests for the completion of the argument to -c/-C and -b/-B,
highlighting this preferred completion behavior.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the branch creation argument for git switch or git checkout,
-c/-C or -b/-B, the commands operate in a different mode: `git switch -c
<branch> <some-reference>` means to create a branch named <branch> at
the commit referred to by <some-reference>.
When completing the start-point, we ought to always complete all valid
references.
Add tests for the completion of the start-point to -c/-C and -b/-B.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the --track option is provided to git switch or git checkout, and
no branch is specified by -c or -b, git will interpret the tracking
branch to determine the local branch name to use. This "Do What I Mean"
logic is similar but distinct from the default DWIM logic of
interpreting a unique remote branch name as a request to create and
track that branch.
For example, `git switch --track origin/master` is interpreted as
a request to create a local branch named master that is tracking
origin/master.
The current completion for git checkout in this regard is only somewhat
poor:
$git checkout --track <TAB>
HEAD
master
matching-branch
matching-tag
other/branch-in-other
other/master-in-other
At least it still includes remote references. The clutter from including
all references isn't too bad.
However, git switch completion is terrible:
$git switch --track <TAB>
master
matching-branch
It only shows local branches, not even allowing any form of completion
of the remote references!
Add tests which highlight the expected behavior of completing --track on
its own.
Note that when -c/-C or -b/-B are provided we do expect completing more
references, but this will be discussed in a future change that addresses
these options specifically.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When completing words for git switch, the completion function correctly
disables the DWIM remote branch names when in the '--detach' mode. These
DWIM remote branch names will not work when the --detach option is
specified, so it does not make sense to complete them.
git checkout, however, does not disable the completion of DWIM remote
branch names in this case.
Add test cases for both git switch and git checkout showing the expected
behavior.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When provided with a single argument that is the name of a remote branch
that does not yet exist locally, both git switch and git checkout can
interpret this as a request to create a local branch that tracks that
remote branch. We call this behavior "Do What I Mean", or DWIM for
short.
To aid in using this DWIM, it makes sense for completion to list these
unique remote branch names when completing possible arguments for git
switch and git checkout. Indeed, both _git_checkout and _git_switch
implement support for completing such DWIM branch names.
In other words, in addition to the usual completions provided for git
switch, this "DWIM" logic means completion will include the names of
branches on remotes that are unique and thus there can be no ambiguity
of which remote to track when creating the local branch.
However, the DWIM logic is not always active. Many options, such as
--no-guess, --no-track, and --track disable this DWIM logic, as they
cause git switch and git checkout to behave in different modes.
Additionally, some completion users do not wish to have tab completion
include these remote names by default, and thus introduced
GIT_COMPLETION_CHECKOUT_NO_GUESS as an optional way to configure the
completion support to disable this feature of completion support.
For this reason, _git_checkout and _git_switch have many rules about
when to enable or disable completing of these remote refs. The two
commands follow similar but not identical rules.
Set aside the question of command modes that do not accept this DWIM
logic (--track, -c, --orphan, --detach) for now. Thinking just about the
main mode of git checkout and git switch, the following guidelines will
help explain the basic rules we ought to support when deciding whether
to list the remote branches for DWIM in completion.
1. if --guess is enabled, we should list DWIM remote branch names, even
if something else would disable it
2. if --no-guess, --no-track or GIT_COMPLETION_CHECKOUT_NO_GUESS=1,
then we should disable listing DWIM remote branch names.
3. Since the '--guess' option is a boolean option, a later --guess
should override --no-guess, and a later --no-guess should override
--guess.
Putting all of these together, add some tests that highlight the
expected behavior of this DWIM logic.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When provided with no options, git switch only allows switching between
branches. The one exception to this is the "Do What I Mean" logic that
allows a unique remote branch name to be interpreted as a request to
create a branch of the same name that is tracking that remote branch.
Unfortunately, the logic for the completion of git switch results in
completing not just branch names, but also pseudorefs like HEAD, tags,
and fully specified <remote>/<branch> references.
For example, we currently complete the following:
$git switch <TAB>
HEAD
branch-in-other
master
master-in-other
matching-branch
matching-tag
other/branch-in-other
other/master-in-other
Indeed, if one were to attempt to use git switch with some of these
provided options, git will reject the request:
$git switch HEAD
fatal: a branch is expected, got 'HEAD
$git switch matching-tag
fatal: a branch is expected, got tag 'matching-tag'
$git switch other/branch-in-other
fatal: a branch is expected, got remote branch 'other/branch-in-other'
Ideally, git switch without options ought to complete only words which
will be accepted. Without options, this means to list local branch names
and the unique remote branch names without their remote name pre-pended.
$git switch <TAB>
branch-in-other
master
master-in-other
matching-branch
Add a test case that highlights this subpar completion. Also add
a similar test for git checkout completion that shows that due to the
complex nature of git checkout, it must complete all references.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When clearing the builtin operations on re-sourcing in the ZSH case we
can use the native ${parameters} associative array keys values to get
the currently `__gitcomp_builtin_*` operations using pattern matching
instead of using sed.
As also stated in commit 94408dc7, introducing this change the usage of
sed has some overhead implications, while ZSH can do this check just
using its native syntax.
Signed-off-by: Marco Trevisan (Treviño) <mail@3v1n0.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original patch selection code was written for `git add -p`, and the
fundamental unit on which it works is a hunk.
We hacked around that to handle deletions back in 24ab81ae4d
(add-interactive: handle deletion of empty files, 2009-10-27). But `git
add -p` would never see a new file, since we only consider the set of
tracked files in the index.
However, since the same machinery was used for `git checkout -p` &
friends, we can see new files.
Handle this case specifically, adding a new prompt for it that is
modeled after the `deleted file` case.
This also fixes the problem where added _empty_ files could not be
staged via `git checkout -p`.
Reported-by: Merlin Büge <toni@bluenox07.de>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit introduced --ignore-date flag to rebase -i, but the
name is rather vague as it does not say whether the author date or the
committer date is ignored. Add an alias to convey the precise purpose.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As part of the on-going effort to retire the apply rebase backend
teach the merge backend how to handle --ignore-date. We take care to
handle the combination of --ignore-date and
--committer-date-is-author-date in the same way as the apply backend.
Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The purpose of amend_author was to free() the malloc()'d string
obtained from get_author() when amending a commit. But we can
also use the variable to free() the author at our convenience.
Rename it to convey this meaning.
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As part of the on-going effort to retire the apply rebase backend teach
the merge backend how to handle --committer-date-is-author-date.
Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rebase is implemented with two different backends - 'apply' and 'merge'
each of which support a different set of options. In particuar the apply
backend supports a number of options implemented by 'git am' that are
not available to the merge backend. As part of an on going effort to
remove the apply backend this patch adds support for the
--ignore-whitespace option to the merge backend. This option treats
lines with only whitespace changes as unchanged and is implemented in
the merge backend by translating it to -Xignore-space-change.
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To set the default hash algorithm you can set the `GIT_DEFAULT_HASH`
environment variable. In the documentation this variable is named
`GIT_DEFAULT_HASH_ALGORITHM`, which is incorrect.
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The explanation of the `--show-pulls` option added in commit 8d049e182e
("revision: --show-pulls adds helpful merges", 2020-04-10) consists of
several paragraphs and we use "+" throughout to tie them together in one
long chain of list continuations. Only thing is, we're not in any kind
of list, so these pluses end up being rendered literally.
The preceding few paragraphs describe `--ancestry-path` and there we
*do* have a list, since we've started one with `--ancestry-path::`. In
fact, we have several such lists for all the various history-simplifying
options we're discussing earlier in this file.
Thus, we're missing a list both from a consistency point of view and
from a practical rendering standpoint.
Let's start a list for `--show-pulls` where we start actually discussing
the option, and keep the paragraphs preceding it out of that list. That
is, drop all those pluses before the new list we're adding here.
Helped-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When building with asciidoc-8.4.5 (as found on CentOS/Red Hat 6), the
period in the "[[files-in-.gitignore-are-tracked]]" anchor is not
properly parsed as a section:
WARNING: gitfaq.txt: line 245: missing [[files-in-.gitignore-are-tracked]] section
The resulting XML file fails to validate with xmlto:
xmlto: /git/Documentation/gitfaq.xml does not validate (status 3)
xmlto: Fix document syntax or use --skip-validation option
/git/Documentation/gitfaq.xml:3: element refentry: validity error :
Element refentry content does not follow the DTD, expecting
(beginpage? , indexterm* , refentryinfo? , refmeta? , (remark | link
| olink | ulink)* , refnamediv+ , refsynopsisdiv? , (refsect1+ |
refsection+)), got (refmeta refnamediv refsynopsisdiv refsect1
refsect1 refsect1 refsect1 variablelist refsect1 refsect1 )
Document /git/Documentation/gitfaq.xml does not validate
Let's avoid breaking users of platforms which ship an old version of
asciidoc, since the cost to do so is quite low.
Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various doc fixes.
* ma/doc-fixes:
git-sparse-checkout.txt: add missing '
git-credential.txt: use list continuation
git-commit-graph.txt: fix list rendering
git-commit-graph.txt: fix grammo
date-formats.txt: fix list continuation
In standard C, '{ 0 }' can be used as an universal zero-initializer.
However, Sparse complains if this is used on a type where the first
member (possibly nested) is a pointer since Sparse purposely wants
to warn when '0' is used to initialize a pointer type.
Legitimaly, it's desirable to be able to use '{ 0 }' as an idiom
without these warnings [1,2]. To allow this, an option have now
been added to Sparse:
537e3e2dae univ-init: conditionally accept { 0 } without warnings
So, add this option to the SPARSE_FLAGS variable.
Note: The option have just been added to Sparse. So, to benefit
now from this patch it's needed to use the latest Sparse
source from kernel.org. The option will simply be ignored
by older versions of Sparse.
[1] https://lore.kernel.org/r/e6796c60-a870-e761-3b07-b680f934c537@ramsayjones.plus.com
[2] https://lore.kernel.org/r/xmqqd07xem9l.fsf@gitster.c.googlers.com
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated before the transaction is
complete, remote-curl will blindly forward the packets before waiting on
more input from fetch-pack. Meanwhile, fetch-pack will read the
transaction and continue reading, expecting more input to continue the
transaction. This results in a deadlock between the two processes.
This can be seen in the following command which does not terminate:
$ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
Cloning into 'git'...
whereas the v1 version does terminate as expected:
$ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012
Cloning into 'git'...
fatal: the remote end hung up unexpectedly
Instead of blindly forwarding packets, make remote-curl insert a
response end packet after proxying the responses from the remote server
when using stateless_connect(). On the RPC client side, ensure that each
response ends as described.
A separate control packet is chosen because we need to be able to
differentiate between what the remote server sends and remote-curl's
control packets. By ensuring in the remote-curl code that a server
cannot send response end packets, we prevent a malicious server from
being able to perform a denial of service attack in which they spoof a
response end packet and cause the described deadlock to happen.
Reported-by: Force Charlie <charlieio@outlook.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we will use PACKET_READ_RESPONSE_END to separate
messages proxied by remote-curl. To prepare for this, add the
PACKET_READ_RESPONSE_END enum value.
In switch statements that need a case added, die() or BUG() when a
PACKET_READ_RESPONSE_END is unexpected. Otherwise, mirror how
PACKET_READ_DELIM is implemented (especially in cases where packets are
being forwarded).
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated with a partially written
packet, remote-curl will blindly send the partially written packet
before waiting on more input from fetch-pack. Meanwhile, fetch-pack will
read the partial packet and continue reading, expecting more input. This
results in a deadlock between the two processes.
For a stateless connection, inspect packets before sending them and
error out if a packet line packet is incomplete.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `diff.relative` boolean option set to `true` shows only changes in
the current directory/value specified by the `path` argument of the
`relative` option and shows pathnames relative to the aforementioned
directory.
Teach `--no-relative` to override earlier `--relative`
Add for git-format-patch(1) options documentation `--relative` and
`--no-relative`
Signed-off-by: Laurent Arnoud <laurent@spkdev.net>
Acked-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Printing a message directly to stdout could affect TAP processing
and is not really needed, as there is a standard way to skip all
tests that could be used instead, while printing an equivalent
message.
While at it; update the message to better reflect that since
a85efb5985 (t5608-clone-2gb.sh: turn GIT_TEST_CLONE_2GB into a bool,
2019-11-22), the enabling variable should be a recognized boolean
(ex: true, false, 1, 0) and get rid of the prerequisite that used
to guard all the tests, since "skip_all" is just much faster and
idempotent.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we try to create a branch "foo" based on "origin/master" and give
git commit -b an extra unsupported argument "bar", it confusingly
reports:
$ git checkout -b foo origin/master bar
fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
$ git checkout --track -b foo origin/master bar
fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
That's wrong, because it very well understands that "origin/master" is
supposed to be the start point for the new branch and not "bar". Check
if we got a commit and show more fitting messages in that case instead:
$ git checkout -b foo origin/master bar
fatal: Cannot update paths and switch to branch 'foo' at the same time.
$ git checkout --track -b foo origin/master bar
fatal: '--track' cannot be used with updating paths
Original-patch-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test git checkout -b with and without --track and demonstrate unexpected
error messages when it's given an extra (i.e. unsupported) path
argument. In both cases it reports:
$ git checkout -b foo origin/master bar
fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
The problem is that the start point we gave for the new branch is
"origin/master" and "bar" is just some extra argument -- it could even
be a valid commit, which would make the message even more confusing. We
have more fitting error messages in git commit, but get confused; use
the text of the rights ones in the tests.
Reported-by: Dana Dahlstrom <dahlstrom@google.com>
Original-test-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
06f5608c14 (bisect--helper: `bisect_start` shell function partially in C,
2019-01-02) adds a lax parser for `git bisect start` which could result
in a segfault under a bad syntax call for start with custom terms.
Detect if there are enough arguments left in the command line to use for
--term-{old,good,new,bad} and abort with the same syntax error the original
implementation will show if not.
While at it, remove an unnecessary (and incomplete) check for unknown
arguments and make sure to add a test to avoid regressions.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach 'git hook list --porcelain <hookname>', which prints simply the
commands to be run in the order suggested by the config. This option is
intended for use by user scripts, wrappers, or out-of-process Git
commands which still want to execute hooks. For example, the following
snippet might be added to git-send-email.perl to introduce a
`pre-send-email` hook:
sub pre_send_email {
open(my $fh, 'git hook list --porcelain pre-send-email |');
chomp(my @hooks = <$fh>);
close($fh);
foreach $hook (@hooks) {
system $hook
}
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.
Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.
For example:
$ git config --list | grep ^hook
hook.pre-commit.command=baz
hook.pre-commit.command=~/bar.sh
hookcmd.baz.command=~/baz/from/hookcmd.sh
$ git hook list pre-commit
~/baz/from/hookcmd.sh
~/bar.sh
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
df70b190 (completion: make stash -p and alias for stash push -p,
2018-04-20) wanted to make sure "git stash -p <TAB>" offers the same
completion as "git stash push -p <TAB>", but it did so by forcing the
$subcommand to be "push" whenever then "-p" option is found on the
command line.
This harms any subcommand that can take the "-p" option---even when the
subcommand is explicitly given, e.g. "git stash show -p", the code added
by the change would overwrite the $subcommand the user gave us.
Fix it by making sure that the defaulting to "push" happens only when
there is no $subcommand given yet.
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* https://github.com/prati0100/git-gui:
git-gui: Handle Ctrl + BS/Del in the commit msg
Subject: git-gui: fix syntax error because of missing semicolon
If the conflict candidate file name from the top of the stack is not a
prefix of the current candiate directory then we can discard it as no
matching directory can come up later. But we are not done checking the
candidate directory -- the stack might still hold a matching file name,
so stay in the loop and check the next candidate file name.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Exercise the case of putting a conflict candidate file name back on the
stack because a matching directory might yet come up later.
Do that by factoring out the test code into a function to allow for more
concise notation in the form of parameters indicating names of trees
(with trailing slash) and blobs (without trailing slash) in no
particular order (they are sorted by git mktree). Then add the new test
case as a second function call.
Fix a typo in the test title while at it ("dublicate").
Reported-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first four bytes of the line, the pkt-len, indicates the total
length of the pkt-line in hexadecimal. Fix wrong pkt-len headers of
some pkt-line messages in `http-protocol.txt` and `pack-protocol.txt`.
Reviewed-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Jiuyang Xie <jiuyang.xjy@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have several code paths in the checkout code which are traversed only
in this case, due to switch having different defaults from checkout.
Let's add a test that the combination of options works and produces the
expected behavior.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we call init_checkout_metadata in reset_tree, we want to pass the
object ID of the commit in question so that it can be passed to filters,
or if there is no commit, the tree. We anticipated this latter case,
which can occur elsewhere in the checkout code, but it cannot occur
here. The only case in which we do not have a commit object is when
invoking git switch with --orphan. Moreover, we can only hit this code
path without a commit object additionally with either --force or
--discard-changes.
In such a case, there is no point initializing the checkout metadata
with a commit or tree because (a) there is no commit, only the empty
tree, and (b) we will never use the data, since no files will be smudged
when checking out a branch with no files. Pass the all-zeros object ID
in this case, since we just need some value which is a valid pointer.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git 2.26 used protocol v2 as its default protocol, but soon after
release, users noticed that the protocol v2 negotiation code was prone
to fail when fetching from some remotes that are far ahead of others
(such as linux-next.git versus Linus's linux.git). That has been
fixed by 0b07eecf6e (Merge branch 'jt/v2-fetch-nego-fix',
2020-05-01), but to be cautious, we are using protocol v0 as the
default in 2.27 to buy some time for any other unanticipated issues to
surface.
To that end, let's ensure that users requesting the bleeding edge
using the feature.experimental flag *do* get protocol v2. This way,
we can gain experience with a wider audience for the new protocol
version and be more confident when it is time to enable it by default
for all users in some future Git version.
Implementation note: this isn't with the rest of the
feature.experimental options in repo-settings.c because those are tied
to a repository object, whereas this code path is used for operations
like "git ls-remote" that do not require a repository.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow deleting words backwards and forwards using Ctrl + Backspace and
Delete in the commit message buffer.
* il/ctrl-bs-del:
git-gui: Handle Ctrl + BS/Del in the commit msg
Document some of the flag options in refs_ref_iterator_begin, and explain how
ref_iterator_advance_fn should handle them.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reading and writing .git/refs/* assumes that refs are stored in the 'files'
ref backend.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
6c722cbe5a (bisect: allow CRLF line endings in "git bisect replay"
input, 2020-05-07) includes CR as a field separator, but relies on
it not being included in the last field, which breaks at least when
running under OpenBSD 6.7's sh.
Instead of just assume the CR will get swallowed, read the rest of
the line into an otherwise unused variable and ignore it everywhere
except on the call for git bisect start, where it matters.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'gitfaq.txt' was added in 2149b6748f (docs: add a FAQ, 2020-03-30),
it was added to the Makefile but not to command-list.txt.
Add it there also, so that the new FAQ is listed in the output of
`git help --guides`.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a BRE, that broke tests 30-32, 37-39, 42 at least with
OpenBSD 6.7; use a simpler ERE.
Fixes: d9f15d37f1 (pull: pass --autostash to merge, 2020-04-07)
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Seems to trigger a bug in at least OpenBSD's 6.7 sh where it is
interpreted as a history lookup and therefore fails 125-126, 128,
130.
Remove the subshell and get a space between ! and grep, so tests
pass successfully.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent attempt to make the test output nicer to view on CI
systems broke TAP output under bash. The effort has been reverted
to be re-attempted in the next cycle.
* jc/fix-tap-output-under-bash:
Revert "tests: when run in Bash, annotate test failures with file name/line number"
Revert "ci: add a problem matcher for GitHub Actions"
Revert "t/test_lib: avoid naked bash arrays in file_lineno"
Last-minute fix for our recent change to allow use of progress API
as a traceable region.
* ds/trace-log-progress-fix:
progress: call trace2_region_leave() only after calling _enter()
Instead of downloading Windows SDK for CI jobs for windows builds
from an external site (wingit.blob.core.windows.net), use the one
created in the windows-build job, to work around quota issues at
the external site.
* js/ci-sdk-download-fix:
ci: avoid pounding on the poor ci-artifacts container
When a binary file gets modified and renamed on both sides of history
to different locations, both files would be written to the working
tree but both would have the contents from "ours". This has been
corrected so that the path from each side gets their original content.
* en/merge-rename-rename-worktree-fix:
merge-recursive: fix rename/rename(1to2) for working tree with a binary
The multi-pack-index was added to the data verified by git-fsck in
ea5ae6c3 "fsck: verify multi-pack-index". This implementation was
based on the implementation for verifying the commit-graph, and a
copy-paste error kept the ERROR_COMMIT_GRAPH flag as the bit set
when an error appears in the multi-pack-index.
Add a new flag, ERROR_MULTI_PACK_INDEX, and use that instead.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
95acf11a3d ("diff: restrict when prefetching occurs", 2020-04-07) taught
diff to prefetch blobs in a more limited set of situations. These
limited situations include when the output format requires blob data,
and when inexact rename detection is needed.
There is an existing test case that tests inexact rename detection, but
it also uses an output format that requires blob data, resulting in the
inexact-rename-detection-only code not being tested. Update this test to
use the raw output format, which does not require blob data.
Thanks to Derrick Stolee for noticing this lapse in code coverage and
for doing the preliminary analysis [1].
[1] https://lore.kernel.org/git/853759d3-97c3-241f-98e1-990883cd204e@gmail.com/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we will be manually processing packets and we will
need to access the length header. In order to simplify this, extern
packet_length() so that the logic can be reused.
Change the function parameter from `const char *linelen` to
`const char lenbuf_hex[4]`. Even though these two types behave
identically as function parameters, use the array notation to
semantically indicate exactly what this function is expecting as an
argument. Also, rename it from linelen to lenbuf_hex as the former
sounds like it should be an integral type which is misleading.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the switch statement, the difference between the `protocol_v2` and
`protocol_v{1,0}` arms is a preparatory call to die_if_server_options() in
the latter. The fetch_pack() call is identical in both arms. However,
since this fetch_pack() call has so many parameters, it is not
immediately obvious that the call is identical in both cases.
Extract the common fetch_pack() call out of the switch statement so that
code duplication is reduced and the logic is more clear for future
readers. While we're at it, rewrite the switch statement as an if-else
tower for increased clarity.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For a merge with a single strategy, the result of evaluate_result() is
effectively not used and therefore is not needed, so avoid altogether.
On Windows, this optimization can halve the time required to perform a
recursive merge of a single commit with the LLVM repo.
Signed-off-by: Andrew Ng <andrew.ng@sony.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On some platforms likes HP-UX, grep(1) doesn't understand "-a".
Let's switch to perl.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git receive-pack" that accepts requests by "git push" learned to
outsource some of the ref updates to the new "proc-receive" hook.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing a pseudo reference (such as "refs/for/master/topic"), may
create or update one or more references. The real names of the
references will be stored in the report options. Parse report options
to create or update remote-tracking branches properly.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to test update of remote-tracking branches for special refs,
add new "remote.origin.fetch" settings and test cases.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new multi-valued config variable "receive.procReceiveRefs"
for `receive-pack` command, like the follows:
git config --system --add receive.procReceiveRefs refs/for
git config --system --add receive.procReceiveRefs refs/drafts
If the specific prefix strings match the reference names of the commands
which are sent from git client to `receive-pack`, these commands will be
executed by an external hook (named "proc-receive"), instead of the
internal `execute_commands` function.
For example, if it is set to "refs/for", pushing to a reference such as
"refs/for/master" will not create or update reference "refs/for/master",
but may create or update a pull request directly by running the hook
"proc-receive".
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new function `ref_is_matched()` to reuse `ref_is_hidden()`. Will use
this function for `receive-pack` to check commands with specific
prefixes.
Test case t5512 covered this change.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commands are fed to the "post-receive" hook, report options will
be parsed and the real old-oid, new-oid, reference name will feed to
the "post-receive" hook.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add ABNF notation for capability 'report-status-v2' which extends
capability 'report-status' by adding additional option lines.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new introduced "proc-receive" hook may handle a command for a
pseudo-reference with a zero-old as its old-oid, while the hook may
create or update a reference with different name, different new-oid,
and different old-oid (the reference may exist already with a non-zero
old-oid). Current "report-status" protocol cannot report the status for
such reference rewrite.
Add new capability "report-status-v2" and new report protocol which is
not backward compatible for report of git-push.
If a user pushes to a pseudo-reference "refs/for/master/topic", and
"receive-pack" creates two new references "refs/changes/23/123/1" and
"refs/changes/24/124/1", for client without the knowledge of
"report-status-v2", "receive-pack" will only send "ok/ng" directives in
the report, such as:
ok ref/for/master/topic
But for client which has the knowledge of "report-status-v2",
"receive-pack" will use "option" directives to report more attributes
for the reference given by the above "ok/ng" directive.
ok refs/for/master/topic
option refname refs/changes/23/123/1
option new-oid <new-oid>
ok refs/for/master/topic
option refname refs/changes/24/124/1
option new-oid <new-oid>
The client will report two new created references to the end user.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git calls an internal `execute_commands` function to handle commands
sent from client to `git-receive-pack`. Regardless of what references
the user pushes, git creates or updates the corresponding references if
the user has write-permission. A contributor who has no
write-permission, cannot push to the repository directly. So, the
contributor has to write commits to an alternate location, and sends
pull request by emails or by other ways. We call this workflow as a
distributed workflow.
It would be more convenient to work in a centralized workflow like what
Gerrit provided for some cases. For example, a read-only user who
cannot push to a branch directly can run the following `git push`
command to push commits to a pseudo reference (has a prefix "refs/for/",
not "refs/heads/") to create a code review.
git push origin \
HEAD:refs/for/<branch-name>/<session>
The `<branch-name>` in the above example can be as simple as "master",
or a more complicated branch name like "foo/bar". The `<session>` in
the above example command can be the local branch name of the client
side, such as "my/topic".
We cannot implement a centralized workflow elegantly by using
"pre-receive" + "post-receive", because Git will call the internal
function "execute_commands" to create references (even the special
pseudo reference) between these two hooks. Even though we can delete
the temporarily created pseudo reference via the "post-receive" hook,
having a temporary reference is not safe for concurrent pushes.
So, add a filter and a new handler to support this kind of workflow.
The filter will check the prefix of the reference name, and if the
command has a special reference name, the filter will turn a specific
field (`run_proc_receive`) on for the command. Commands with this filed
turned on will be executed by a new handler (an hook named
"proc-receive") instead of the internal `execute_commands` function.
We can use this "proc-receive" command to create pull requests or send
emails for code review.
Suggested by Junio, this "proc-receive" hook reads the commands,
push-options (optional), and send result using a protocol in pkt-line
format. In the following example, The letter "S" stands for
"receive-pack" and letter "H" stands for the hook.
# Version and features negotiation.
S: PKT-LINE(version=1\0push-options atomic...)
S: flush-pkt
H: PKT-LINE(version=1\0push-options...)
H: flush-pkt
# Send commands from server to the hook.
S: PKT-LINE(<old-oid> <new-oid> <ref>)
S: ... ...
S: flush-pkt
# Send push-options only if the 'push-options' feature is enabled.
S: PKT-LINE(push-option)
S: ... ...
S: flush-pkt
# Receive result from the hook.
# OK, run this command successfully.
H: PKT-LINE(ok <ref>)
# NO, I reject it.
H: PKT-LINE(ng <ref> <reason>)
# Fall through, let 'receive-pack' to execute it.
H: PKT-LINE(ok <ref>)
H: PKT-LINE(option fall-through)
# OK, but has an alternate reference. The alternate reference name
# and other status can be given in options
H: PKT-LINE(ok <ref>)
H: PKT-LINE(option refname <refname>)
H: PKT-LINE(option old-oid <old-oid>)
H: PKT-LINE(option new-oid <new-oid>)
H: PKT-LINE(option forced-update)
H: ... ...
H: flush-pkt
After receiving a command, the hook will execute the command, and may
create/update different reference. For example, a command for a pseudo
reference "refs/for/master/topic" may create/update different reference
such as "refs/pull/123/head". The alternate reference name and other
status are given in option lines.
The list of commands returned from "proc-receive" will replace the
relevant commands that are sent from user to "receive-pack", and
"receive-pack" will continue to run the "execute_commands" function and
other routines. Finally, the result of the execution of these commands
will be reported to end user.
The reporting function from "receive-pack" to "send-pack" will be
extended in latter commit just like what the "proc-receive" hook reports
to "receive-pack".
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Topic "proc-receive-hook" will change the workflow and output of
git-push. Add some basic test cases in t5411 before introducing the new
topic.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing a new reference (not a head or tag), report it as a new
reference instead of a new branch.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Where we explain the 'reapply' command, we don't properly wrap it in
single quote marks like we do with the other commands: We omit the
closing mark ("'reapply") and this ends up being rendered literally as
"'reapply". Add the missing "'".
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use list continuation to avoid the second and third paragraphs
rendering with a different indentation from the first one where we
describe the "url" attribute.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first list item follows immediately on the paragraph where we
introduce the list. This makes the "*" render literally as part of one
huge paragraph. (With AsciiDoc, everything is fine after that, but with
Asciidoctor, we get some minor follow-on errors.) Add an empty line --
with a list continuation ("+") -- to make the first list item render ok.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's easy to mix up the possessive "its" and "it's" ("it is"). Correct
an instance of this.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The blank line before the lone "+" means it isn't detected as a list
continuation, but instead renders literally, at least with AsciiDoc.
Drop the empty line and, while at it, add a closing period to the
preceding paragraph.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
7187c7bbb8 (t4210: skip i18n tests that don't work on FreeBSD, 2019-11-27)
adds a REG_ILLSEQ prerequisite, and to do that copies the common branch in
test-lib and expands it to include it in a special case for FreeBSD.
Instead; test for it using a previously added extension to test-tool and
use that, together with a function that identifies when regcomp/regexec
will be called with broken patterns to avoid any test that would otherwise
rely on undefined behaviour.
The description of the first test which wasn't accurate has been corrected,
and the test rearranged for clarity, including a helper function that avoids
overly long lines.
Only the affected engines will have their tests suppressed, also including
"fixed" if the PCRE optimization that uses LIBPCRE2 since b65abcafc7
(grep: use PCRE v2 for optimized fixed-string search, 2019-07-01) is not
available.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
7187c7bbb8 (t4210: skip i18n tests that don't work on FreeBSD, 2019-11-27)
adds a REG_ILLSEQ prerequisite to avoid failures from the tests added in
4e2443b181 (log tests: test regex backends in "--encode=<enc>" tests,
2019-06-28), but hardcodes it to be only enabled in FreeBSD.
Instead of hardcoding the affected platform, teach the test-regex helper,
how to validate a pattern and report back, so it can be used to detect the
same issue in other affected systems (like DragonFlyBSD or macOS).
While at it, refactor the tool so it can report back the source of the
errors it founds, and can be invoked also in a --silent mode, when needed,
for backward compatibility. A missing flag has been added and the code
reformatted, as well as updates to the way the parameters are handled, for
consistency.
To minimize changes, it is assumed the regcomp error is of the right type
since we control the only caller, and is also assumed to affect both basic
and extended syntax (only basic is tested, but both behave the same in all
three affected platforms since they use the same function).
Based-on-patch-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's use fields from this struct in
receive_needs(), instead of local variables with the same name
and purpose.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to create_pack_file(),
so that this function, and the function it calls, can use all
the fields of the struct.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's remove the 'stateless_rpc' static
variable, as we can now use the field of 'struct upload_pack_data'
with the same name instead.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to check_non_tip(), so
that this function and the functions it calls, can use all the
fields of the struct in followup commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_ref(), so that
this function, and the functions it calls, can use all the
fields of the struct in followup commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, we are passing around that struct to many
functions, so let's also pass 'struct string_list symref' around
at the same time by moving it from a local variable in
upload_pack() into a field of 'struct upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's use the 'struct packet_writer writer'
field from 'struct upload_pack_data' in receive_needs(),
instead of a local 'struct packet_writer writer' variable.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass 'struct upload_pack_data' to
receive_needs(), so that this function and the functions it
calls can use all the fields of that struct in followup commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass 'struct upload_pack_data' to
get_common_commits(), so that this function and the functions
it calls can use all the fields of that struct in followup
commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's use 'struct upload_pack_data' in
upload_pack().
This will make it possible in followup commits to remove a lot
of static variables and local variables that have the same name
and purpose as fields in 'struct upload_pack_data'. This will
also make upload_pack() work in a more similar way as
upload_pack_v2().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move 'struct upload_pack_data' and the
related upload_pack_data_init() and upload_pack_data_clear()
functions towards the beginning of the file, so that this struct
and its related functions can then be used by upload_pack() in a
followup commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the want_obj and have_obj object
arrays into 'struct upload_pack_data'.
These object arrays are used by both upload_pack() and
upload_pack_v2(), for example when these functions call
create_pack_file(). We are going to use
'struct upload_pack_data' in upload_pack() in a followup
commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's remove 'struct object_array wants' from
'struct upload_pack_data', as it appears to be unused.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strftime(3) man page is outside of the Git suite. Refererence it as
we do other external man pages and avoid creating a broken link when
generating the HTML documentation.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 7c5c9b9c57 (commit-graph: error out on invalid commit oids in
'write --stdin-commits', 2019-08-05), the commit-graph builtin dies on
receiving non-commit OIDs as input to '--stdin-commits'.
This behavior can be cumbersome to work around in, say, the case of
piping 'git for-each-ref' to 'git commit-graph write --stdin-commits' if
the caller does not want to cull out non-commits themselves. In this
situation, it would be ideal if 'git commit-graph write' wrote the graph
containing the inputs that did pertain to commits, and silently ignored
the remainder of the input.
Some options have been proposed to the effect of '--[no-]check-oids'
which would allow callers to have the commit-graph builtin do just that.
After some discussion, it is difficult to imagine a caller who wouldn't
want to pass '--no-check-oids', suggesting that we should get rid of the
behavior of complaining about non-commit inputs altogether.
If callers do wish to retain this behavior, they can easily work around
this change by doing the following:
git for-each-ref --format='%(objectname) %(objecttype) %(*objecttype)' |
awk '
!/commit/ { print "not-a-commit:"$1 }
/commit/ { print $1 }
' |
git commit-graph write --stdin-commits
To make it so that valid OIDs that refer to non-existent objects are
indeed an error after loosening the error handling, perform an extra
lookup to make sure that object indeed exists before sending it to the
commit-graph internals.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the subsequent commit, we will introduce a dependency on
'graph_read_expect' from t5318.7. Preemptively move it below
'graph_read_expect()'s definition so that the test can call it.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous handful of commits, both 'git commit-graph write
--reachable' and '--stdin-commits' learned to peel tags down to the
commits which they refer to before passing them into the commit-graph
internals.
This makes the call to 'lookup_commit_reference_gently()' inside of
'fill_oids_from_commits()' a noop, since all OIDs are commits by that
point.
As such, remove the call entirely, as well as the progress meter, which
has been split and moved out to the callers in the aforementioned
earlier commits.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When given a list of commits, the commit-graph machinery calls
'lookup_commit_reference_gently()' on each element in the set and treats
the resulting set of OIDs as the base over which to close for
reachability.
In an earlier collection of commits, the 'git commit-graph write
--reachable' case made the inner-most call to
'lookup_commit_reference_gently()' by peeling references before they
were passed over to the commit-graph internals.
Do the analog for 'git commit-graph write --stdin-commits' by calling
'lookup_commit_reference_gently()' outside of the commit-graph
machinery, making the inner-most call a noop.
Since this may incur additional processing time, surround
'read_one_commit' with a progress meter to provide output to the caller.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With either '--stdin-commits' or '--stdin-packs', the commit-graph
builtin will read line-delimited input, and interpret it either as a
series of commit OIDs, or pack names.
In a subsequent commit, we will begin handling '--stdin-commits'
differently by processing each line as it comes in, instead of in one
shot at the end. To make adequate room for this additional logic, split
the '--stdin-commits' case from '--stdin-packs' by only storing the
input when '--stdin-packs' is given.
In the case of '--stdin-commits', feed each line to a new
'read_one_commit' helper, which (for now) will merely call
'parse_oid_hex'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the codebase, labels are aligned to the leftmost column. Remove the
space-indentation from `free_specs:` to conform to this.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a9f1f1f9f8 ("commit-slab.h: code split", 2018-05-19) split
commit-slab.h into commit-slab-decl.h and commit-slab-impl.h header
files, commit-slab-decl.h were left to use "COMMIT_SLAB_HDR_H",
while commit-slab-impl.h gained its own macro, "COMMIT_SLAB_IMPL_H".
As these two files use different include guards, there is nothing
broken, but let's update commit-slab-decl.h to match the convention
to name the include guard after the filename.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From e76eec3554 (ci: allow per-branch config for GitHub Actions,
2020-05-07), we started to allow contributors decide which branch
they want to build with GitHub Actions
by checking for a file named "ci/config/allow-ref".
In order to assist those contributors,
we provided a sample in "ci/config/allow-refs.sample",
and instructed them to drop the ".sample",
then commit that file to their repository.
We've misspelt the filename in that change.
Let's fix the spelling.
While we're at it, also instruct our contributors introduce that new
file to Git before commit, in case of they've never told Git before.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On hppa these tests crash because the allocated stack space is too
small, even after it was doubled in b9a190789 (and the data size
doubled to match) to make it work on powerpc. For this arch just
skip these tests, which is enough to make the whole suite pass.
Fixes: https://bugs.debian.org/757402
Based-on-patch-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Greg Price <gnprice@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 676eb0c1ce0d380478eb16bdc5a3f2a7bc01c1d2;
as we will be reverting the change to show these extra output
tokens under bash, the pattern would not match anything.
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 303775a25f0b4ac5d6ad2e96eb4404c24209cad8;
instead of trying to salvage the tap-breaking change, let's
revert the whole thing for now.
A user of progress API calls start_progress() conditionally and
depends on the display_progress() and stop_progress() functions to
become no-op when start_progress() hasn't been called.
As we added a call to trace2_region_enter() to start_progress(), the
calls to other trace2 API calls from the progress API functions must
make sure that these trace2 calls are skipped when start_progress()
hasn't been called on the progress struct. Specifically, do not
call trace2_region_leave() from stop_progress() when we haven't
called start_progress(), which would have called the matching
trace2_region_enter().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When this developer tested how the git-sdk-64-minimal artifact could be
served to all the GitHub workflow runs that need it, Azure Blobs looked
like a pretty good choice: it is reliable, fast and we already use it in
Git for Windows to serve components like OpenSSL, cURL, etc
It came as an unpleasant surprise just _how many_ times this artifact
was downloaded. It exploded the bandwidth to a point where the free tier
would no longer be enough, threatening to block other, essential Git for
Windows services.
Let's switch back to using the Build Artifacts of our trusty Azure
Pipeline for the time being.
To avoid unnecessary hammering of the Azure Pipeline artifacts, we use
the GitHub Action `actions/upload-artifact` in the `windows-build` job
and the GitHub Action `actions/download-artifact` in the `windows-test`
and `vs-test` jobs (the latter now depends on `windows-build` for that
reason, too).
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit b0a5a12a60 ("unpack-trees: allow check_updates() to work on a
different index", 2020-03-27) allowed check_updates() to work on a
different index, but it called get_progress() which was hardcoded to
work on o->result much like check_updates() had been. Update it to also
accept an index parameter and have check_updates() pass that parameter
along so that both are working on the same index.
Noticed-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach codepaths that show progress meter to also use the
start_progress() and the stop_progress() calls as a "region" to be
traced.
* es/trace-log-progress:
trace2: log progress time and throughput
Code cleanup and typofixes
* ds/bloom-cleanup:
completion: offer '--(no-)patch' among 'git log' options
bloom: use num_changes not nr for limit detection
bloom: de-duplicate directory entries
Documentation: changed-path Bloom filters use byte words
bloom: parse commit before computing filters
test-bloom: fix usage typo
bloom: fix whitespace around tab length
"git fsck" ensures that the paths recorded in tree objects are
sorted and without duplicates, but it failed to notice a case where
a blob is followed by entries that sort before a tree with the same
name. This has been corrected.
* rs/fsck-duplicate-names-in-trees:
fsck: report non-consecutive duplicate names in trees
"git p4" learned to recover from a (broken) state where a directory
and a file are recorded at the same path in the Perforce repository
the same way as their clients do.
* ao/p4-d-f-conflict-recover:
git-p4: recover from inconsistent perforce history
"rebase -i" segfaulted when rearranging a sequence that has a
fix-up that applies another fix-up (which may or may not be a
fix-up of yet another step).
* js/rebase-autosquash-double-fixup-fix:
rebase --autosquash: fix a potential segfault
"git bisect replay" had trouble with input files when they used
CRLF line ending, which has been corrected.
* cw/bisect-replay-with-dos:
bisect: allow CRLF line endings in "git bisect replay" input
ccd469450a (date.c: switch to reentrant {gm,local}time_r, 2019-11-28)
removes the only gmtime() call we had and moves to gmtime_r() which
doesn't have the same portability problems.
Remove the compat gmtime code since it is no longer needed, and confirm
by successfull running t4212 in FreeBSD 9.3 amd64 (the oldest I could
get a hold off).
Further work might be needed to ensure 32bit time_t systems (like FreeBSD
i386) will handle correctly the overflows tested in t4212, but that is
orthogonal to this change, and it doesn't change the current behaviour
as neither gmtime() or gmtime_r() will ever return NULL on those systems
because time_t is unsigned.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With a rename/rename(1to2) conflict, we attempt to do a three-way merge
of the file contents, so that the correct contents can be placed in the
working tree at both paths. If the file is a binary, however, no
content merging is possible and we should just use the original version
of the file at each of the paths.
Reported-by: Chunlin Zhang <zhangchunlin@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While iterating references (to discover the set of commits to write to
the commit-graph with 'git commit-graph write --reachable'),
'add_ref_to_set' can save 'fill_oids_from_commits()' some time by
peeling the references beforehand.
Move peeling out of 'fill_oids_from_commits()' and into
'add_ref_to_set()' to use 'peel_ref()' instead of 'deref_tag()'. Doing
so allows the commit-graph machinery to use the peeled value from
'$GIT_DIR/packed-refs' instead of having to load and parse tags.
While we're at it, discard non-commit objects reachable from ref tips.
This would be done automatically by 'fill_oids_from_commits()', but such
functionality will be removed in a subsequent patch after the call to
'lookup_commit_reference_gently' is dropped (at which point a non-commit
object in the commits oidset will become an error).
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git commit-graph write --reachable' is invoked, the commit-graph
machinery calls 'for_each_ref()' to discover the set of reachable
commits.
Right now the 'add_ref_to_set' callback is not doing anything other than
adding an OID to the set of known-reachable OIDs. In a subsequent
commit, 'add_ref_to_set' will presumptively peel references. This
operation should be fast for repositories with an up-to-date
'$GIT_DIR/packed-refs', but may be slow in the general case.
So that it doesn't appear that 'git commit-graph write' is idling with
'--reachable' in the slow case, add a progress meter to provide some
output in the meantime.
In general, we don't expect a progress meter to appear at all, since
peeling references with a 'packed-refs' file is quick. If it's slow and
we do show a progress meter, the subsequent 'fill_oids_from_commits()'
will be fast, since all of the calls to
'lookup_commit_reference_gently()' will be no-ops.
Both progress meters are delayed, so it is unlikely that more than one
will appear. In either case, this intermediate state will go away in a
handful of patches, at which point there will be at most one progress
meter.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Serving a "git fetch" client over "git://" and "ssh://" protocols
using the on-wire protocol version 2 was buggy on the server end
when the client needs to make a follow-up request to
e.g. auto-follow tags.
* cc/upload-pack-v2-fetch-fix:
upload-pack: clear filter_options for each v2 fetch command
The code to skip unmerged paths in the index when sparse checkout
is in use would have made out-of-bound access of the in-core index
when the last path was unmerged, which has been corrected.
* ds/sparse-updates-oob-access-fix:
unpack-trees: avoid array out-of-bounds error
Instead of always building all branches at GitHub via Actions,
users can specify which branches to build.
* jk/ci-only-on-selected-branches:
ci: allow per-branch config for GitHub Actions
Teach "am", "commit", "merge" and "rebase", when they are run with
the "--quiet" option, to pass "--quiet" down to "gc --auto".
* jc/auto-gc-quiet:
auto-gc: pass --quiet down from am, commit, merge and rebase
auto-gc: extract a reusable helper from "git fetch"
Minor in-code comments and documentation updates around credential
API.
* cb/credential-doc-fixes:
credential: document protocol updates
credential: update gitcredentials documentation
credential: correct order of parameters for credential_match
credential: update description for credential_from_url_gently
The object walk with object filter "--filter=tree:0" can now take
advantage of the pack bitmap when available.
* tb/bitmap-walk-with-tree-zero-filter:
pack-bitmap: pass object filter to fill-in traversal
pack-bitmap.c: support 'tree:0' filtering
pack-bitmap.c: make object filtering functions generic
list-objects-filter: treat NULL filter_options as "disabled"
git-init(1)'s messages is subjected to i18n.
They should be tested by test_i18n* family.
Fix them.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's possible a user may complain about the way that Git interacts with
their interactive shell, e.g. autocompletion or shell prompt. In that
case, it's useful for us to know which shell they're using
interactively.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It may be useful to know which shell Git was built to try to point to,
in the event that shell-based Git commands are failing. $SHELL_PATH is
set during the build and used to launch the manpage viewer, as well as
by git-compat-util.h, and it's used during tests. 'git version
--build-options' is encouraged for use in bug reports, so it makes sense
to include this information there.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than teaching only one operation, like 'git fetch', how to write
down throughput to traces, we can learn about a wide range of user
operations that may seem slow by adding tooling to the progress library
itself. Operations which display progress are likely to be slow-running
and the kind of thing we want to monitor for performance anyways. By
showing object counts and data transfer size, we should be able to
make some derived measurements to ensure operations are scaling the way
we expect.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using git p4 submit with the --prepare-p4-only option, the program
should prepare a single p4 changelist and notify the user that more
commits are pending and then stop processing.
A bug has been introduced by the p4-changelist hook feature that
causes the program to continue to try and process all pending
changelists at the same time.
The function applyCommit returns True when applying the commit
was successful and the program should continue. However, when the
optional flag --prepare-p4-only is set, the program should stop
after the first application.
Change the logic in the run method for P4Submit to check for the
flag --prepare-p4-only after successfully completing the applyCommit
method.
Be aware - this change will fix the existing test error in t9807.23
for --prepare-p4-only. However there is insufficent coverage for
this flag. If more than 1 commit is pending submission to P4, the
method will properly prepare the P4 changelist, however it will
still exit the application with an exitcode of 1.
The current documentation does not define what the exit code should be
in this condition.
(See: https://git-scm.com/docs/git-p4#Documentation/git-p4.txt---prepare-p4-only)
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Control+BackSpace: Delete word to the left of the cursor.
- Control+Delete : Delete word to the right of the cursor.
Originally introduced by BRIEF and Turbo Vision between 1985 and 1992,
they were adopted by most CUA-Compliant UIs, including those of: OS/2,
Windows, Mac OS, Qt, GTK, Open/Libre Office, Gecko, and GNU Emacs.
In both cases Tk already implements the functionality bound to other key
combination, so we use that.
Graphical examples:
Deleting to the left:
v------ pointer
X_WORD____X
^-----^------ selection
Deleting to the right:
v--------- pointer
X_WORD_X
^--^------ selection
Signed-off-by: Ismael Luceno <ismael.luceno@tttech-auto.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Explanations about the configuration variables for git-grep are
duplicated in "Documentation/git-grep.txt" and
"Documentation/config/grep.txt", which can make maintenance difficult.
The first also contains a definition not present in the latter
(grep.fullName). To avoid problems like this, let's unify the
information in the second file and include it in the first.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever GIT_CURL_VERBOSE is set, teach Git to behave as if
GIT_TRACE_CURL=1 and GIT_TRACE_CURL_NO_DATA=1 is set, instead of setting
CURLOPT_VERBOSE.
This is to prevent inadvertent revelation of sensitive data. In
particular, GIT_CURL_VERBOSE redacts neither the "Authorization" header
nor any cookies specified by GIT_REDACT_COOKIES.
Unifying the tracing mechanism also has the future benefit that any
improvements to the tracing mechanism will benefit both users of
GIT_CURL_VERBOSE and GIT_TRACE_CURL, and we do not need to remember to
implement any improvement twice.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Verify that when GIT_TRACE_CURL is set, Git prints out "Authorization:
Basic <redacted>" instead of the base64-encoded authorization details.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous changes to the line-log machinery focused on making the
first result appear faster. This was achieved by no longer walking the
entire commit history before returning the early results. There is still
another way to improve the performance: walk most commits much faster.
Let's use the changed-path Bloom filters to reduce time spent computing
diffs.
Since the line-log computation requires opening blobs and checking the
content-diff, there is still a lot of necessary computation that cannot
be replaced with changed-path Bloom filters. The part that we can reduce
is most effective when checking the history of a file that is deep in
several directories and those directories are modified frequently. In
this case, the computation to check if a commit is TREESAME to its first
parent takes a large fraction of the time. That is ripe for improvement
with changed-path Bloom filters.
We must ensure that prepare_to_use_bloom_filters() is called in
revision.c so that the bloom_filter_settings are loaded into the struct
rev_info from the commit-graph. Of course, some cases are still
forbidden, but in the line-log case the pathspec is provided in a
different way than normal.
Since multiple paths and segments could be requested, we compute the
struct bloom_key data dynamically during the commit walk. This could
likely be improved, but adds code complexity that is not valuable at
this time.
There are two cases to care about: merge commits and "ordinary" commits.
Merge commits have multiple parents, but if we are TREESAME to our first
parent in every range, then pass the blame for all ranges to the first
parent. Ordinary commits have the same condition, but each is done
slightly differently in the process_ranges_[merge|ordinary]_commit()
methods. By checking if the changed-path Bloom filter can guarantee
TREESAME, we can avoid that tree-diff cost. If the filter says "probably
changed", then we need to run the tree-diff and then the blob-diff if
there was a real edit.
The Linux kernel repository is a good testing ground for the performance
improvements claimed here. There are two different cases to test. The
first is the "entire history" case, where we output the entire history
to /dev/null to see how long it would take to compute the full line-log
history. The second is the "first result" case, where we find how long
it takes to show the first value, which is an indicator of how quickly a
user would see responses when waiting at a terminal.
To test, I selected the paths that were changed most frequently in the
top 10,000 commits using this command (stolen from StackOverflow [1]):
git log --pretty=format: --name-only -n 10000 | sort | \
uniq -c | sort -rg | head -10
which results in
121 MAINTAINERS
63 fs/namei.c
60 arch/x86/kvm/cpuid.c
59 fs/io_uring.c
58 arch/x86/kvm/vmx/vmx.c
51 arch/x86/kvm/x86.c
45 arch/x86/kvm/svm.c
42 fs/btrfs/disk-io.c
42 Documentation/scsi/index.rst
(along with a bogus first result). It appears that the path
arch/x86/kvm/svm.c was renamed, so we ignore that entry. This leaves the
following results for the real command time:
| | Entire History | First Result |
| Path | Before | After | Before | After |
|------------------------------|--------|--------|--------|--------|
| MAINTAINERS | 4.26 s | 3.87 s | 0.41 s | 0.39 s |
| fs/namei.c | 1.99 s | 0.99 s | 0.42 s | 0.21 s |
| arch/x86/kvm/cpuid.c | 5.28 s | 1.12 s | 0.16 s | 0.09 s |
| fs/io_uring.c | 4.34 s | 0.99 s | 0.94 s | 0.27 s |
| arch/x86/kvm/vmx/vmx.c | 5.01 s | 1.34 s | 0.21 s | 0.12 s |
| arch/x86/kvm/x86.c | 2.24 s | 1.18 s | 0.21 s | 0.14 s |
| fs/btrfs/disk-io.c | 1.82 s | 1.01 s | 0.06 s | 0.05 s |
| Documentation/scsi/index.rst | 3.30 s | 0.89 s | 1.46 s | 0.03 s |
It is worth noting that the least speedup comes for the MAINTAINERS file
which is
* edited frequently,
* low in the directory heirarchy, and
* quite a large file.
All of those points lead to spending more time doing the blob diff and
less time doing the tree diff. Still, we see some improvement in that
case and significant improvement in other cases. A 2-4x speedup is
likely the more typical case as opposed to the small 5% change for that
file.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous patch made it possible to perform line-level filtering
during history traversal instead of in an expensive preprocessing
step, but it still requires some simpler preprocessing steps, notably
topo-ordering. However, nowadays we have commit-graphs storing
generation numbers, which make it possible to incrementally traverse
the history in topological order, without the preparatory limit_list()
and sort_in_topological_order() steps; see b45424181e (revision.c:
generation-based topo-order algorithm, 2018-11-01).
This patch combines the two, so we can do both the topo-ordering and
the line-level filtering during history traversal, eliminating even
those simpler preprocessing steps, and thus further reducing the delay
before showing the first commit modifying the given line range.
The 'revs->limited' flag plays the central role in this, because, due
to limitations of the current implementation, the generation
number-based topo-ordering is only enabled when this flag remains
unset. Line-level log, however, always sets this flag in
setup_revisions() ever since the feature was introduced in 12da1d1f6f
(Implement line-history search (git log -L), 2013-03-28). The reason
for setting 'limited' is unclear, though, because the line-level log
itself doesn't directly depend on it, and it doesn't affect how the
limit_list() function limits the revision range. However, there is an
indirect dependency: the line-level log requires topo-ordering, and
the "traditional" sort_in_topological_order() requires an already
limited commit list since e6c3505b44 (Make sure we generate the whole
commit list before trying to sort it topologically, 2005-07-06). The
new, generation numbers-based topo-ordering doesn't require a limited
commit list anymore.
So don't set 'revs->limited' for line-level log, unless it is really
necessary, namely:
- The user explicitly requested parent rewriting, because that is
still done in the line_log_filter() preprocessing step (see
previous patch), which requires sort_in_topological_order() and in
turn limit_list() as well.
- A commit-graph file is not available or it doesn't yet contain
generation numbers. In these cases we had to fall back on
sort_in_topological_order() and in turn limit_list(). The
existing condition with generation_numbers_enabled() has already
ensured that the 'limited' flag is set in these cases; this patch
just makes sure that the line-level log sets 'revs->topo_order'
before that condition.
While the reduced delay before showing the first commit is measurable
in git.git, it takes a bigger repository to make it clearly noticable.
In both cases below the line ranges were chosen so that they were
modified rather close to the starting revisions, so the effect of this
change is most noticable.
# git.git
$ time git --no-pager log -L:read_alternate_refs:sha1-file.c -1 v2.23.0
Before:
real 0m0.107s
user 0m0.091s
sys 0m0.013s
After:
real 0m0.058s
user 0m0.050s
sys 0m0.005s
# linux.git
$ time git --no-pager log \
-L:build_restore_work_registers:arch/mips/mm/tlbex.c -1 v5.2
Before:
real 0m1.129s
user 0m1.061s
sys 0m0.069s
After:
real 0m0.096s
user 0m0.087s
sys 0m0.009s
Additional testing by Derrick Stolee: Since this patch improves
the performance for the first result, I repeated the experiment
from the previous patch on the Linux kernel repository, reporting
real time here:
Command: git log -L 100,200:MAINTAINERS -n 1 >/dev/null
Before: 0.71 s
After: 0.05 s
Now, we have dropped the full topo-order of all ~910,000 commits
before reporting the first result. The remaining performance
improvements then are:
1. Update the parent-rewriting logic to be incremental similar to
how "git log --graph" behaves.
2. Use changed-path Bloom filters to reduce the time spend in the
tree-diff to see if the path(s) changed.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As diff_tree_oid() computes a diff, it will terminate early if the
total number of changed paths is strictly larger than max_changes.
This includes the directories that changed, not just the file paths.
However, only the file paths are reflected in the resulting diff
queue's "nr" value.
Use the "num_changes" from diffopt to check if the diff terminated
early. This is incredibly important, as it can result in incorrect
filters! For example, the first commit in the Linux kernel repo
reports only 471 changes, but since these are nested inside several
directories they expand to 513 "real" changes, and in fact the
total list of changes is not reported. Thus, the computed filter
for this commit is incorrect.
Demonstrate the subtle difference by using one fewer file change
in the 'get bloom filter for commit with 513 changes' test. Before,
this edited 513 files inside "bigDir" which hit this inequality.
However, dropping the file count by one demonstrates how the
previous inequality was incorrect but the new one is correct.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current line-level log implementation performs a preprocessing
step in prepare_revision_walk(), during which the line_log_filter()
function filters and rewrites history to keep only commits modifying
the given line range. This preprocessing affects both responsiveness
and correctness:
- Git doesn't produce any output during this preprocessing step.
Checking whether a commit modified the given line range is
somewhat expensive, so depending on the size of the given revision
range this preprocessing can result in a significant delay before
the first commit is shown.
- Limiting the number of displayed commits (e.g. 'git log -3 -L...')
doesn't limit the amount of work during preprocessing, because
that limit is applied during history traversal. Alas, by that
point this expensive preprocessing step has already churned
through the whole revision range to find all commits modifying the
revision range, even though only a few of them need to be shown.
- It rewrites parents, with no way to turn it off. Without the user
explicitly requesting parent rewriting any parent object ID shown
should be that of the immediate parent, just like in case of a
pathspec-limited history traversal without parent rewriting.
However, after that preprocessing step rewrote history, the
subsequent "regular" history traversal (i.e. get_revision() in a
loop) only sees commits modifying the given line range.
Consequently, it can only show the object ID of the last ancestor
that modified the given line range (which might happen to be the
immediate parent, but many-many times it isn't).
This patch addresses both the correctness and, at least for the common
case, the responsiveness issues by integrating line-level log
filtering into the regular revision walking machinery:
- Make process_ranges_arbitrary_commit(), the static function in
'line-log.c' deciding whether a commit modifies the given line
range, public by removing the static keyword and adding the
'line_log_' prefix, so it can be called from other parts of the
revision walking machinery.
- If the user didn't explicitly ask for parent rewriting (which, I
believe, is the most common case):
- Call this now-public function during regular history traversal,
namely from get_commit_action() to ignore any commits not
modifying the given line range.
Note that while this check is relatively expensive, it must be
performed before other, much cheaper conditions, because the
tracked line range must be adjusted even when the commit will
end up being ignored by other conditions.
- Skip the line_log_filter() call, i.e. the expensive
preprocessing step, in prepare_revision_walk(), because, thanks
to the above points, the revision walking machinery is now able
to filter out commits not modifying the given line range while
traversing history.
This way the regular history traversal sees the unmodified
history, and is therefore able to print the object ids of the
immediate parents of the listed commits. The eliminated
preprocessing step can greatly reduce the delay before the first
commit is shown, see the numbers below.
- However, if the user did explicitly ask for parent rewriting via
'--parents' or a similar option, then stick with the current
implementation for now, i.e. perform that expensive filtering and
history rewriting in the preprocessing step just like we did
before, leaving the initial delay as long as it was.
I tried to integrate line-level log filtering with parent rewriting
into the regular history traversal, but, unfortunately, several
subtleties resisted... :) Maybe someday we'll figure out how to do
that, but until then at least the simple and common (i.e. without
parent rewriting) 'git log -L:func:file' commands can benefit from the
reduced delay.
This change makes the failing 'parent oids without parent rewriting'
test in 't4211-line-log.sh' succeed.
The reduced delay is most noticable when there's a commit modifying
the line range near the tip of a large-ish revision range:
# no parent rewriting requested, no commit-graph present
$ time git --no-pager log -L:read_alternate_refs:sha1-file.c -1 v2.23.0
Before:
real 0m9.570s
user 0m9.494s
sys 0m0.076s
After:
real 0m0.718s
user 0m0.674s
sys 0m0.044s
A significant part of the remaining delay is spent reading and parsing
commit objects in limit_list(). With the help of the commit-graph we
can eliminate most of that reading and parsing overhead, so here are
the timing results of the same command as above, but this time using
the commit-graph:
Before:
real 0m8.874s
user 0m8.816s
sys 0m0.057s
After:
real 0m0.107s
user 0m0.091s
sys 0m0.013s
The next patch will further reduce the remaining delay.
To be clear: this patch doesn't actually optimize the line-level log,
but merely moves most of the work from the preprocessing step to the
history traversal, so the commits modifying the line range can be
shown as soon as they are processed, and the traversal can be
terminated as soon as the given number of commits are shown.
Consequently, listing the full history of a line range, potentially
all the way to the root commit, will take the same time as before (but
at least the user might start reading the output earlier).
Furthermore, if the most recent commit modifying the line range is far
away from the starting revision, then that initial delay will still be
significant.
Additional testing by Derrick Stolee: In the Linux kernel repository,
the MAINTAINERS file was changed ~3,500 times across the ~915,000
commits. In addition to that edit frequency, the file itself is quite
large (~18,700 lines). This means that a significant portion of the
computation is taken up by computing the patch-diff of the file. This
patch improves the real time it takes to output the first result quite
a bit:
Command: git log -L 100,200:MAINTAINERS -n 1 >/dev/null
Before: 3.88 s
After: 0.71 s
If we drop the "-n 1" in the command, then there is no change in
end-to-end process time. This is because the command still needs to
walk the entire commit history, which negates the point of this
patch. This is expected.
As a note for future reference, the ~4.3 seconds in the old code
spends ~2.6 seconds computing the patch-diffs, and the rest of the
time is spent walking commits and computing diffs for which paths
changed at each commit. The changed-path Bloom filters could improve
the end-to-end computation time (i.e. no "-n 1" in the command).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When computing a changed-path Bloom filter, we need to take the
files that changed from the diff computation and extract the parent
directories. That way, a directory pathspec such as "Documentation"
could match commits that change "Documentation/git.txt".
However, the current code does a poor job of this process. The paths
are added to a hashmap, but we do not check if an entry already
exists with that path. This can create many duplicate entries and
cause the filter to have a much larger length than it should. This
means that the filter is more sparse than intended, which helps the
false positive rate, but wastes a lot of space.
Properly use hashmap_get() before hashmap_add(). Also be sure to
include a comparison function so these can be matched correctly.
This has an effect on a test in t0095-bloom.sh. This makes sense,
there are ten changes inside "smallDir" so the total number of
paths in the filter should be 11. This would result in 11 * 10 bits
required, and with 8 bits per byte, this results in 14 bytes.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
None of the tests in 't4211-line-log.sh' really check which parent
object IDs are shown in the output, either implicitly as part of
"Merge: ..." lines [1] or explicitly via the '%p' or '%P' format
specifiers in a custom pretty format.
Add two tests to 't4211-line-log.sh' to check which parent object IDs
are shown, one without and one with explicitly requested parent
rewriting, IOW without and with the '--parents' option.
The test without '--parents' is marked as failing, because without
that option parent rewriting should not be performed, and thus the
parent object ID should be that of the immediate parent, just like in
case of a pathspec-limited history traversal without parent rewriting.
The current line-level log implementation, however, performs parent
rewriting unconditionally and without a possibility to turn it off,
and, consequently, it shows the object ID of the most recent ancestor
that modified the given line range.
In both of these new tests we only really care about the object IDs of
the listed commits and their parents, but not the diffs of the line
ranges; the diffs have already been thoroughly checked in the previous
tests.
[1] While one of the tests ('-M -L ':f:b.c' parallel-change') does
list a merge commit, both of its parents happen to modify the
given line range and are listed as well, so the implications of
parent rewriting remained hidden and untested.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In Documentation/technical/commit-graph-format.txt, the definition
of the BIDX chunk specifies the length is a number of 8-byte words.
During development we discovered that using 8-byte words in the
Murmur3 hash algorithm causes issues with big-endian versus little-
endian machines. Thus, the hash algorithm was adapted to work on a
byte-by-byte basis. However, this caused a change in the definition
of a "word" in bloom.h. Now, a "word" is a single byte, which allows
filters to be as small as two bytes. These length-two filters are
demonstrated in t0095-bloom.sh, and a larger filter of length 25 is
demonstrated as well.
The original point of using 8-byte words was for alignment reasons.
It also presented opportunities for extremely sparse Bloom filters
when there were a small number of changes at a commit, creating a
very low false-positive rate. However, modifying the format at this
point is unlikely to be a valuable exercise. Also, this use of
single-byte granularity does present opportunities to save space.
It is unclear if 8-byte alignment of the filters would present any
meaningful performance benefits.
Modify the format document to reflect reality.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the unused fields 'status', 'arg_alloc', 'arg_nr' and 'args'
from 'struct line_log_data'. They were already part of the struct
when it was introduced in commit 12da1d1f6 (Implement line-history
search (git log -L), 2013-03-28), but as far as I can tell none of
them have ever been actually used.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When computing changed-path Bloom filters for a commit, we need to
know if the commit has a parent or not. If the commit is not parsed,
then its parent pointer will be NULL.
As far as I can tell, the only opportunity to reach this code
without parsing the commit is inside "test-tool bloom
get_filter_for_commit" but it is best to be safe.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tree entries are sorted in path order, meaning that directory names get
a slash ('/') appended implicitly. Git fsck checks if trees contains
consecutive duplicates, but due to that ordering there can be
non-consecutive duplicates as well if one of them is a directory and the
other one isn't. Such a tree cannot be fully checked out.
Find these duplicates by recording candidate file names on a stack and
check candidate directory names against that stack to find matches.
Suggested-by: Brandon Williams <bwilliamseng@gmail.com>
Original-test-by: Brandon Williams <bwilliamseng@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Reviewed-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perforce allows you commit files and directories with the same name,
so you could have files //depot/foo and //depot/foo/bar both checked
in. A p4 sync of a repository in this state fails. Deleting one of
the files recovers the repository.
When this happens we want git-p4 to recover in the same way as
perforce.
Note that Perforce has this change in their 2017.1 version:
Bugs fixed in 2017.1
#1489051 (Job #2170) **
Submitting a file with the same name as an existing depot
directory path (or vice versa) will now be rejected.
so people hopefully will not creating damaged Perforce repos
anymore, but "git p4" needs to be able to interact with already
corrupt ones.
Signed-off-by: Andrew Oakley <andrew@adoakley.name>
Reviewed-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When selecting a batch of pack-files to repack in the "git
multi-pack-index repack" command, Git should respect the
repack.packKeptObjects config option. When false, this option says that
the pack-files with an associated ".keep" file should not be repacked.
This config value is "false" by default.
There are two cases for selecting a batch of objects. The first is the
case where the input batch-size is zero, which specifies "repack
everything". The second is with a non-zero batch size, which selects
pack-files using a greedy selection criteria. Both of these cases are
updated and tested.
Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the "repack" subcommand of "git multi-pack-index" command
creates new packfile(s), it does not call the "git repack"
command but instead directly calls the "git pack-objects"
command, and the configuration variables meant for the "git
repack" command, like "repack.usedaeltabaseoffset", are ignored.
Check the configuration variables used by "git repack" ourselves
in "git multi-index-pack" and pass the corresponding options to
underlying "git pack-objects".
Note that `repack.writeBitmaps` configuration is ignored, as the
pack bitmap facility is useful only with a single packfile.
Signed-off-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rearranging the todo list so that the fixups/squashes are reordered
just after the commits they intend to fix up, we use two arrays to
maintain that list: `next` and `tail`.
The idea is that `next[i]`, if set to a non-negative value, contains the
index of the item that should be rearranged just after the `i`th item.
To avoid having to walk the entire `next` chain when appending another
fixup/squash, we also store the end of the `next` chain in `tail[i]`.
The logic we currently use to update these array items is based on the
assumption that given a fixup/squash item at index `i`, we just found
the index `i2` indicating the first item in that fixup chain.
However, as reported by Paul Ganssle, that need not be true: the special
form `fixup! <commit-hash>` is allowed to point to _another_ fixup
commit in the middle of the fixup chain.
Example:
* 0192a To fixup
* 02f12 fixup! To fixup
* 03763 fixup! To fixup
* 04ecb fixup! 02f12
Note how the fourth commit targets the second commit, which is already a
fixup that targets the first commit.
Previously, we would update `next` and `tail` under our assumption that
every `fixup!` commit would find the start of the `fixup!`/`squash!`
chain. This would lead to a segmentation fault because we would actually
end up with a `next[i]` pointing to a `fixup!` but the corresponding
`tail[i]` pointing nowhere, which would the lead to a segmentation
fault.
Let's fix this by _inserting_, rather than _appending_, the item. In
other words, if we make a given line successor of another line, we do
not simply forget any previously set successor of the latter, but make
it a successor of the former.
In the above example, at the point when we insert 04ecb just after
02f12, 03763 would already be recorded as a successor of 04ecb, and we
now "squeeze in" 04ecb.
To complete the idea, we now no longer assume that `next[i]` pointing to
a line means that `last[i]` points to a line, too. Instead, we extend
the concept of `last` to cover also partial `fixup!`/`squash!` chains,
i.e. chains starting in the middle of a larger such chain.
In the above example, after processing all lines, `last[0]`
(corresponding to 0192a) would point to 03763, which indeed is the end
of the overall `fixup!` chain, and `last[1]` (corresponding to 02f12)
would point to 04ecb (which is the last `fixup!` targeting 02f12, but it
has 03763 as successor, i.e. it is not the end of overall `fixup!`
chain).
Reported-by: Paul Ganssle <paul@ganssle.io>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because of the request/response model of protocol v2, the
upload_pack_v2() function is sometimes called twice in the same
process, while 'struct list_objects_filter_options filter_options'
was declared as static at the beginning of 'upload-pack.c'.
This made the check in list_objects_filter_die_if_populated(), which
is called by process_args(), fail the second time upload_pack_v2() is
called, as filter_options had already been populated the first time.
To fix that, filter_options is not static any more. It's now owned
directly by upload_pack(). It's now also part of 'struct
upload_pack_data', so that it's owned indirectly by upload_pack_v2().
In the long term, the goal is to also have upload_pack() use
'struct upload_pack_data', so adding filter_options to this struct
makes more sense than to have it owned directly by upload_pack_v2().
This fixes the first of the 2 bugs documented by d0badf8797
(partial-clone: demonstrate bugs in partial fetch, 2020-02-21).
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The loop in warn_conflicted_path() that checks for the count of
entries with the same path uses "i+count" for the array
entry. However, the loop only verifies that the value of count is
below the array size. Fix this by adding i to the condition.
I hit this condition during a test of the in-tree sparse-checkout
feature, so it is exercised by the end of the series.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
[jc: readability fix]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We advertise that the bisect log can be corrected in your editor
before being fed to "git bisect replay", but some editors may
turn the line endings to CRLF.
Update the parser of the input lines so that the CR at the end of
the line gets ignored.
Were anyone to intentionally be using terms/revs with embedded CRs,
replaying such bisects will no longer work with this change. I suspect
that this is incredibly rare.
Signed-off-by: Christopher Warrington <chwarr@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert submodule subcommand 'set-url' to a builtin. Port 'set-url' to
'submodule--helper.c' and call the latter via 'git-submodule.sh'.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Occasionally a failure a user is seeing may be related to a specific
hook which is being run, perhaps without the user realizing. While the
contents of hooks can be sensitive - containing user data or process
information specific to the user's organization - simply knowing that a
hook is being run at a certain stage can help us to understand whether
something is going wrong.
Without a definitive list of hook names within the code, we compile our
own list from the documentation. This is likely prone to bitrot, but
designing a single source of truth for acceptable hooks is too much
overhead for this small change to the bugreport tool.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* We need a `final_new_line` to make our source code as text file, per
POSIX and C specification.
* `bloom_filters` should be limited to interal linkage only
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document protocol changes after CVE-2020-11008, including the removal of
references to the override of attributes which is no longer recommended
after CVE-2020-5260 and that might be removed in the future.
While at it do some improvements for clarity and consistency.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify the expected effect of all attributes and how the helpers
are expected to handle them and the context where they operate.
While at it, space the descriptions for clarity, and add a paragraph
mentioning the early termination in the list processing of helpers,
to complement the one about the special "quit" attribute.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Depending on the workflows of individual developers, it can either be
convenient or annoying that our GitHub Actions CI jobs are run on every
branch. As an example of annoying: if you carry many half-finished
work-in-progress branches and rebase them frequently against master,
you'd get tons of failure reports that aren't interesting (not to
mention the wasted CPU).
This commit adds a new job which checks a special branch within the
repository for CI config, and then runs a shell script it finds there to
decide whether to skip the rest of the tests. The default will continue
to run tests for all refs if that branch or script is missing.
There have been a few alternatives discussed:
One option is to carry information in the commit itself about whether it
should be tested, either in the tree itself (changing the workflow YAML
file) or in the commit message (a "[skip ci]" flag or similar). But
these are frustrating and error-prone to use:
- you have to manually apply them to each branch that you want to mark
- it's easy for them to leak into other workflows, like emailing patches
We could likewise try to get some information from the branch name. But
that leads to debates about whether the default should be "off" or "on",
and overriding still ends up somewhat awkward. If we default to "on",
you have to remember to name your branches appropriately to skip CI. And
if "off", you end up having to contort your branch names or duplicate
your pushes with an extra refspec.
By comparison, this commit's solution lets you specify your config once
and forget about it, and all of the data is off in its own ref, where it
can be changed by individual forks without touching the main tree.
There were a few design decisions that came out of on-list discussion.
I'll summarize here:
- we could use GitHub's API to retrieve the config ref, rather than a
real checkout (and then just operate on it via some javascript). We
still have to spin up a VM and contact GitHub over the network from
it either way, so it ends up not being much faster. I opted to go
with shell to keep things similar to our other tools (and really
could implement allow-refs in any language you want). This also makes
it easy to test your script locally, and to modify it within the
context of a normal git.git tree.
- we could keep the well-known refname out of refs/heads/ to avoid
cluttering the branch namespace. But that makes it awkward to
manipulate. By contrast, you can just "git checkout ci-config" to
make changes.
- we could assume the ci-config ref has nothing in it except config
(i.e., a branch unrelated to the rest of git.git). But dealing with
orphan branches is awkward. Instead, we'll do our best to efficiently
check out only the ci/config directory using a shallow partial clone,
which allows your ci-config branch to be just a normal branch, with
your config changes on top.
- we could provide a simpler interface, like a static list of ref
patterns. But we can't get out of spinning up a whole VM anyway, so
we might as well use that feature to make the config as flexible as
possible. If we add more config, we should be able to reuse our
partial-clone to set more outputs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These commands take the --quiet option for their own operation, but
they forget to pass the option down when they invoke "git gc --auto"
internally.
Teach them to do so using the run_auto_gc() helper we added in the
previous step.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back in 1991006c (fetch: convert argv_gc_auto to struct argv_array,
2014-08-16), we taught "git fetch --quiet" to pass the "--quiet"
option down to "gc --auto". This issue, however, is not limited to
"fetch":
$ git grep -e 'gc.*--auto' \*.c
finds hits in "am", "commit", "merge", and "rebase" and these
commands do not pass "--quiet" down to "gc --auto" when they
themselves are told to be quiet.
As a preparatory step, let's introduce a helper function
run_auto_gc(), that the caller can pass a boolean "quiet",
and redo the fix to "git fetch" using the helper.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In two tests introduced by 4fa3f00abb ("fetch-pack: in protocol v2,
in_vain only after ACK", 2020-04-28) and 2f0a093dd6 ("fetch-pack: in
protocol v2, reset in_vain upon ACK", 2020-04-28), the count of objects
downloaded is checked by grepping for a specific message in the packet
trace. However, this is flaky as that specific message may be delivered
over 2 or more packet lines.
Instead, grep over stderr, just like the "fetch creating new shallow
root" test in the same file.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an issue in 'Common Issues' section which addresses the confusion
between performing a 'fetch' and a 'pull'.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add issue in 'Common Issues' section which addresses the problem of
Git tracking files/paths mentioned in '.gitignore'.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In subsequent patches, we are going to update a progress meter when
'add_ref_to_set()' is called, and need a convenient way to pass a
'struct progress *' in from the caller.
Introduce 'refs_cb_data' as a catch-all for parameters that
'add_ref_to_set' may need, and wrap the existing single parameter in
that struct.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the beginning in 118250728e (credential: apply helper config,
2011-12-10), the declaration for that function used a different order
than the implementation.
All callers use the same order than the implementation, so update
the declaration in credential.h to match.
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
c44088ecc4 (credential: treat URL without scheme as invalid, 2020-04-18)
changes the implementation for this function to return -1 if protocol is
missing.
Update blurb to match implementation.
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes a bitmap traversal still has to walk some commits manually,
because those commits aren't included in the bitmap packfile (e.g., due
to a push or commit since the last full repack). If we're given an
object filter, we don't pass it down to this traversal. It's not
necessary for correctness because the bitmap code has its own filters to
post-process the bitmap result (which it must, to filter out the objects
that _are_ mentioned in the bitmapped packfile).
And with blob filters, there was no performance reason to pass along
those filters, either. The fill-in traversal could omit them from the
result, but it wouldn't save us any time to do so, since we'd still have
to walk each tree entry to see if it's a blob or not.
But now that we support tree filters, there's opportunity for savings. A
tree:depth=0 filter means we can avoid accessing trees entirely, since
we know we won't them (or any of the subtrees or blobs they point to).
The new test in p5310 shows this off (the "partial bitmap" state is one
where HEAD~100 and its ancestors are all in a bitmapped pack, but
HEAD~100..HEAD are not). Here are the results (run against linux.git):
Test HEAD^ HEAD
-------------------------------------------------------------------------------------------------
[...]
5310.16: rev-list with tree filter (partial bitmap) 0.19(0.17+0.02) 0.03(0.02+0.01) -84.2%
The absolute number of savings isn't _huge_, but keep in mind that we
only omitted 100 first-parent links (in the version of linux.git here,
that's 894 actual commits). In a more pathological case, we might have a
much larger proportion of non-bitmapped commits. I didn't bother
creating such a case in the perf script because the setup is expensive,
and this is plenty to show the savings as a percentage.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous patch, we made it easy to define other filters that
exclude all objects of a certain type. Use that in order to implement
bitmap-level filtering for the '--filter=tree:<n>' filter when 'n' is
equal to 0.
The general case is not helped by bitmaps, since for values of 'n > 0',
the object filtering machinery requires a full-blown tree traversal in
order to determine the depth of a given tree. Caching this is
non-obvious, too, since the same tree object can have a different depth
depending on the context (e.g., a tree was moved up in the directory
hierarchy between two commits).
But, the 'n = 0' case can be helped, and this patch does so. Running
p5310.11 in this tree and on master with the kernel, we can see that
this case is helped substantially:
Test master this tree
--------------------------------------------------------------------------------
5310.11: rev-list count with tree:0 10.68(10.39+0.27) 0.06(0.04+0.01) -99.4%
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 4f3bd5606a (pack-bitmap: implement BLOB_NONE filtering, 2020-02-14),
filtering support for bitmaps was added for the 'LOFC_BLOB_NONE' filter.
In the future, we would like to add support for filters that behave as
if they exclude a certain type of object, for e.g., the tree depth
filter with depth 0.
To prepare for this, make some of the functions used for filtering more
generic, such as 'find_tip_blobs' and 'filter_bitmap_blob_none' so that
they can work over arbitrary object types.
To that end, create 'find_tip_objects' and
'filter_bitmap_exclude_type', and redefine the aforementioned functions
in terms of those.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most callers, we have an actual list_objects_filter_options struct,
and if no filtering is desired its "choice" element will be
LOFC_DISABLED. However, some code may have only a pointer to such a
struct which may be NULL (because _their_ callers didn't care about
filtering, either). Rather than forcing them to handle this explicitly
like:
if (filter_options)
traverse_commit_list_filtered(filter_options, revs,
show_commit, show_object,
show_data, NULL);
else
traverse_commit_list(revs, show_commit, show_object,
show_data);
let's just treat a NULL filter_options the same as LOFC_DISABLED. We
only need a small change, since that option struct is converted into a
real filter only in the "init" function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix alignment issues that were likely introduced due to an editor
using tab lengths of 4 instead of 8.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In previous patches, the functions 'commit_shallow_file' and
'rollback_shallow_file' were introduced to reset the shallowness
validity checks on a repository after potentially modifying
'.git/shallow'.
These functions can be made safer by wrapping the 'struct lockfile *' in
a new type, 'shallow_lock', so that they cannot be called with a raw
lock (and potentially misused by other code that happens to possess a
lockfile, but has nothing to do with shallowness).
This patch introduces that type as a thin wrapper around 'struct
lockfile', and updates the two aforementioned functions and their
callers to use it.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'commit_shallow_file()' and 'rollback_shallow_file()' were
introduced, they did not have a documenting comment, when they could
have benefited from one.
Add a brief note about what these functions do, and make a special note
that they reset stat-validity checks.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many functions in commit.h that are more related to shallow
repositories than they are to any sort of generic commit machinery.
Likely this began when there were only a few shallow-related functions,
and commit.h seemed a reasonable enough place to put them.
But, now there are a good number of shallow-related functions, and
placing them all in 'commit.h' doesn't make sense.
This patch extracts a 'shallow.h', which takes all of the declarations
from 'commit.h' for functions which already exist in 'shallow.c'. We
will bring the remaining shallow-related functions defined in 'commit.c'
in a subsequent patch.
For now, move only the ones that already are implemented in 'shallow.c',
and update the necessary includes.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch, some functions will be moved from 'commit.c' to have
prototypes in a new 'shallow.h' and their implementations in
'shallow.c'.
Three functions in 'commit.c' use 'commit_graft_pos()' (they are
'register_commit_graft()', 'lookup_commit_graft()', and
'unregister_shallow()'). The first two of these will stay in 'commit.c',
but the latter will move to 'shallow.c', and thus needs
'commit_graft_pos' to be non-static.
Prepare for that by making 'commit_graft_pos' non-static so that it can
be called from both 'commit.c' and 'shallow.c'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both test_submodule_switch_recursing_with_args() and
test_submodule_forced_switch_recursing_with_args() call the internal
function test_submodule_recursing_with_args_common() with the final
argument of `--recurse-submodules`. Consolidate this duplication by
appending the argument in test_submodule_recursing_with_args_common().
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the shell scripts in this codebase, the usual style is to include a
space between the function name and the (). Add these missing spaces to
conform to the usual style of the code.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Looking at the value of %(push:remoteref) only handles the case when an
explicit push refspec is passed. But it does not handle the fallback
cases of looking at the configuration value of `push.default`.
In particular, doing something like
git config push.default current
git for-each-ref --format='%(push)'
git for-each-ref --format='%(push:remoteref)'
prints a useful tracking ref for the first for-each-ref, but an empty
string for the second.
Since the intention of %(push:remoteref), from 9700fae5ee (for-each-ref:
let upstream/push report the remote ref name) is to get exactly which
branch `git push` will push to, even in the fallback cases, fix this.
To get the meaning of %(push:remoteref), `ref-filter.c` calls
`remote_ref_for_branch`. We simply add a new static helper function,
`branch_get_push_remoteref` that follows the logic of
`branch_get_push_1`, and call it from `remote_ref_for_branch`.
We also update t/6300-for-each-ref.sh to handle all `push.default`
strategies. This involves testing `push.default=simple` twice, once
where there is a matching upstream branch and once when there is none.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reimplement the `bisect_state()` shell functions in C and also add a
subcommand `--bisect-state` to `git-bisect--helper` to call them from
git-bisect.sh .
Using `--bisect-state` subcommand is a temporary measure to port shell
function to C so as to use the existing test suite. As more functions
are ported, this subcommand will be retired and will be called by some
other methods.
`bisect_head()` is only called from `bisect_state()`, thus it is not
required to introduce another subcommand.
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the subcommand to `git bisect--helper` and call it from
git-bisect.sh.
With the conversion of `bisect_auto_next()` from shell to C in a
previous commit, `bisect_start()` can now be fully ported to C.
So let's complete the `--bisect-start` subcommand of
`git bisect--helper` so that it fully implements `bisect_start()`,
and let's use this subcommand in `git-bisect.sh` instead of
`bisect_start()`.
This removes the signal handling we had in `bisect_start()` as we
don't really need it. While at it, "trap" is changed to "handle".
As "trap" is a reference to the shell "trap" builtin, which isn't
used anymore.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reimplement the `bisect_next()` and the `bisect_auto_next()` shell functions
in C and add the subcommands to `git bisect--helper` to call them from
git-bisect.sh .
bisect_auto_next() function returns an enum bisect_error type as whole
`git bisect` can exit with an error code when bisect_next() does.
Using `--bisect-next` and `--bisect-auto-next` subcommands is a
temporary measure to port shell function to C so as to use the existing
test suite. As more functions are ported, `--bisect-auto-next`
subcommand will be retired and will be called by some other methods.
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reimplement the `bisect_autostart()` shell function in C and add the
C implementation from `bisect_next()` which was previously left
uncovered. Also add a subcommand `--bisect-autostart` to
`git bisect--helper` be called from `bisect_state()` from
git-bisect.sh .
Using `--bisect-autostart` subcommand is a temporary measure to port
shell function to C so as to use the existing test suite. As more
functions are ported, this subcommand will be retired and
bisect_autostart() will be called directly by `bisect_state()`.
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's refactor code adding a new `write_in_file()` function
that opens a file for writing a message and closes it.
This helper will be used in later steps and makes the code
simpler and easier to understand.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Following 'enum bisect_error' vocabulary, return variable 'res' is
always non-positive.
Let's use '-res' instead of 'abs(res)' to make the code clearer.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a `cmd_*()` function, return `error()` cannot be used
because that translates to `-1` and `cmd_*()` functions need
to return exit codes.
Let's fix switch default return.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For some asynchronous operations, we build a chain of callbacks to
execute when the operation is done. These callbacks are held in $after,
and a new callback can be added by appending to $after. Once the
operation is done, $after is executed as a script.
But if we don't append a semi-colon after the procedure calls, they will
appear to Tcl as arguments to the previous procedure's arguments. So,
for example, if $after is "foo", and we just append "bar", then $after
becomes "foo bar", and bar will be treated as an argument to foo. If foo
does not accept any optional arguments, it would result in Tcl throwing
an error. If instead we do append a semi-colon, $after will look like
"foo;bar;", and these will be treated as two separate procedure calls.
Before d9c6469 (git-gui: update status bar to track operations,
2019-12-01), this problem was masked because ui_ready/ui_status did
accept an optional argument. In d9c6469, ui_ready stopped accepting an
optional argument, and this error started showing up.
Another instance of this problem is when a call to ui_status without a
trailing semicolon. ui_status never accepted an optional argument to
begin with, but the issue never managed to surface.
So, fix these errors by making sure we always append a semi-colon after
procedure calls when multiple callbacks are involved in $after.
Helped-by: Pratyush Yadav <me@yadavpratyush.com>
Signed-off-by: Ansgar Röber <ansgar.roeber@rwth-aachen.de>
/* use date as a heuristic when generations are equal */
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.