A new refs backend "reftable" to replace the traditional
combination of packed-refs files and one-file-per-ref loose refs
has been implemented and integrated for improved performance and
atomicity.
* hn/reftable:
SQUASH??? whitespace breakage fix
Add "test-tool dump-reftable" command.
Add reftable testing infrastructure
vcxproj: adjust for the reftable changes
Add GIT_DEBUG_REFS debugging mechanism
Hookup unittests for the reftable library.
Reftable support for git-core
Add reftable library
Add .gitattributes for the reftable/ directory
Iterate over the "refs/" namespace in for_each_[raw]ref
Move REF_LOG_ONLY to refs-internal.h
Treat REVERT_HEAD as a pseudo ref
Treat CHERRY_PICK_HEAD as a pseudo ref
Treat BISECT_HEAD as a pseudo ref
Make refs_ref_exists public
Write pseudorefs through ref backends.
checkout: add '\n' to reflog message
lib-t6000.sh: write tag using git-update-ref
The name of the primary branch in existing repositories, and the
default name used for the first branch in newly created
repositories, is made configurable, so that we can eventually wean
ourselves off of the hardcoded 'master'.
* js/default-branch-name:
Document how the default branch name can be overridden
fmt-merge-msg: learn about the possibly-configured default branch name
clone: learn about the possibly-configured default branch name
submodule: use the (possibly overridden) default branch name
testsvn: respect `core.defaultBranchName`
send-pack/transport-helper: respect `core.defaultBranchName`
remote: respect `core.defaultBranchName`
init: allow overriding the default branch name for new repositories
The "hooks defined in config" topic.
Ready???
* es/config-hooks:
hook: add --porcelain to list command
hook: add list command
hook: scaffolding for git-hook subcommand
doc: propose hooks managed by the config
The wrapper to call into zlib followed our long tradition to use
"unsigned long" for sizes of regions in memory, which have been
updated to use "size_t".
* mk/use-size-t-in-zlib:
zlib.c: use size_t for size
Rewrite of the remainder of "git bisect" script in C continues.
* mr/bisect-in-c-2:
bisect--helper: retire `--bisect-autostart` subcommand
bisect--helper: retire `--write-terms` subcommand
bisect--helper: retire `--check-expected-revs` subcommand
bisect--helper: reimplement `bisect_state` & `bisect_head` shell functions in C
bisect--helper: retire `--next-all` subcommand
bisect--helper: retire `--bisect-clean-state` subcommand
bisect--helper: finish porting `bisect_start()` to C
bisect--helper: reimplement `bisect_next` and `bisect_auto_next` shell functions in C
bisect--helper: reimplement `bisect_autostart` shell function in C
bisect--helper: introduce new `write_in_file()` function
bisect--helper: use '-res' in 'cmd_bisect__helper' return
bisect--helper: fix `cmd_*()` function switch default return
"git receive-pack" that accepts requests by "git push" learned to
outsource most of the ref updates to the new "proc-receive" hook.
* jx/proc-receive-hook:
doc: add documentation for the proc-receive hook
transport: parse report options for tracking refs
t5411: test updates of remote-tracking branches
receive-pack: new config receive.procReceiveRefs
refs.c: refactor to reuse ref_is_hidden()
receive-pack: feed report options to post-receive
doc: add document for capability report-status-v2
New capability "report-status-v2" for git-push
receive-pack: add new proc-receive hook
t5411: add basic test cases for proc-receive hook
transport: not report a non-head push as a branch
"git grep" has been tweaked to be limited to the sparse checkout
paths.
Review needed on 4/6; otherwise looking sane.
cf. <CABPp-BGdEyEeajYZj_rdxp=MyEQdszuyjVTax=hhYj3fOtRQUQ@mail.gmail.com>
* mt/grep-sparse-checkout:
config: add setting to ignore sparsity patterns in some cmds
grep: honor sparse checkout patterns
config: correctly read worktree configs in submodules
t/helper/test-config: facilitate addition of new cli options
t/helper/test-config: return exit codes consistently
doc: grep: unify info on configuration variables
The "%(push:remoteref)" placeholder in the "--format=" argument of
"git format-patch" (and friends) only showed what got explicitly
configured, not what ref at the receiving end would be updated when
"git push" was used, as it ignored the default behaviour (e.g. update
the same ref as the source).
* dr/push-remoteref-fix:
remote.c: fix handling of %(push:remoteref)
"git rebase -i" learns a bit more options.
Not quite.
cf. <nycvar.QRO.7.76.6.2005290437350.56@tvgsbejvaqbjf.bet>
* pw/rebase-i-more-options:
rebase: add --reset-author-date
rebase -i: support --ignore-date
sequencer: rename amend_author to author_to_free
rebase -i: support --committer-date-is-author-date
rebase -i: add --ignore-whitespace flag
The changed-path Bloom filter is improved using ideas from an
independent implementation.
* sg/commit-graph-cleanups:
commit-graph: persist existence of changed-paths
commit-graph: change test to die on parse, not load
bloom: enforce a minimum size of 8 bytes
commit-graph: check all leading directories in changed path Bloom filters
commit-graph: check chunk sizes after writing
commit-graph: simplify chunk writes into loop
commit-graph: unify the signatures of all write_graph_chunk_*() functions
commit-graph: place bloom_settings in context
commit-graph: simplify write_commit_graph_file() #2
commit-graph: simplify write_commit_graph_file() #1
commit-graph: simplify parse_commit_graph() #2
commit-graph: simplify parse_commit_graph() #1
commit-graph: clean up #includes
diff.h: drop diff_tree_oid() & friends' return value
commit-slab: add a function to deep free entries on the slab
commit-graph-format.txt: all multi-byte numbers are in network byte order
commit-graph: fix parsing the Chunk Lookup table
tree-walk.c: don't match submodule entries for 'submod/anything'
The effort to avoid using test_must_fail on non-git command continues.
Getting very close.
cf. <20200618181533.GA633383@coredump.intra.peff.net>
* dl/test-must-fail-fixes-5:
lib-submodule-update: use callbacks in test_submodule_switch_common()
lib-submodule-update: prepend "git" to $command
lib-submodule-update: consolidate --recurse-submodules
lib-submodule-update: add space after function name
SHA-256 migration work continues.
* bc/sha-256-part-2: (44 commits)
remote-testgit: adapt for object-format
bundle: detect hash algorithm when reading refs
t5300: pass --object-format to git index-pack
t5704: send object-format capability with SHA-256
t5703: use object-format serve option
t5702: offer an object-format capability in the test
t/helper: initialize the repository for test-sha1-array
remote-curl: avoid truncating refs with ls-remote
t1050: pass algorithm to index-pack when outside repo
builtin/index-pack: add option to specify hash algorithm
remote-curl: detect algorithm for dumb HTTP by size
builtin/ls-remote: initialize repository based on fetch
t5500: make hash independent
serve: advertise object-format capability for protocol v2
connect: parse v2 refs with correct hash algorithm
connect: pass full packet reader when parsing v2 refs
Documentation/technical: document object-format for protocol v2
t1302: expect repo format version 1 for SHA-256
builtin/show-index: provide options to determine hash algo
t5302: modernize test formatting
...
A misdesigned strbuf_write_fd() function has been retired.
* rs/retire-strbuf-write-fd:
strbuf: remove unreferenced strbuf_write_fd method.
bugreport.c: replace strbuf_write_fd with write_in_full
The way refnames are anonymized has been updated and a way to help
debugging using the anonymized output hsa been added.
* jk/fast-export-anonym:
fast-export: allow dumping the path mapping
fast-export: anonymize "master" refname
fast-export: allow dumping the refname mapping
A few fields in "struct commit" that do not have to always be
present have been moved to commit slabs.
* ak/commit-graph-to-slab:
commit-graph: minimize commit_graph_data_slab access
commit: move members graph_pos, generation to a slab
commit-graph: introduce commit_graph_data_slab
object: drop parsed_object_pool->commit_count
"git diff-files" has been taught to say paths that are marked as
intent-to-add are new files, not modified from an empty blob.
* sk/diff-files-show-i-t-a-as-new:
diff-files: treat "i-t-a" files as "not-in-index"
"git status" learned to report the status of sparse checkout.
* en/sparse-status:
git-prompt: include sparsity state as well
wt-status: show sparse checkout status as well
CMake support to build with MSVC for Windows bypassing the Makefile.
Almost there.
* ss/cmake-build:
ci: modification of main.yml to use cmake for vs-build job
cmake: support for building git on windows with msvc and clang.
cmake: support for building git on windows with mingw
cmake: support for testing git when building out of the source tree
cmake: support for testing git with ctest
cmake: installation support for git
cmake: generate the shell/perl/python scripts and templates, translations
Introduce CMake support for configuring Git
Allow runtime upgrade of the repository format version, which needs
to be done carefully.
There is a rather unpleasant backward compatibility worry with the
last step of this series, but it is the right thing to do in the
longer term.
* xl/upgrade-repo-format:
check_repository_format_gently(): refuse extensions for old repositories
sparse-checkout: upgrade repository to version 1 when enabling extension
fetch: allow adding a filter after initial clone
repository: add a helper function to perform repository format upgrade
When using an algorithm other than SHA-1, we need the remote helper to
advertise support for the object-format extension and provide
information back to us so that we can properly parse refs and return
data. Ensure that the test remote helper understands these extensions.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much like with the dumb HTTP transport, there isn't a way to explicitly
specify the hash algorithm when dealing with a bundle, so detect the
algorithm based on the length of the object IDs in the prerequisites and
ref advertisements.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git index-pack by default reads the repository to determine the object
format. However, when outside of a repository, it's necessary to specify
the hash algorithm in use so that the pack can be properly indexed. Add
an --object-format argument when invoking git index-pack outside of a
repository.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we speak protocol v2 in this test, we must pass the object-format
header if the algorithm is not SHA-1. Otherwise, git upload-pack fails
because the hash algorithm doesn't match and not because we've failed to
speak the protocol correctly. Pass the header so that our assertions
test what we're really interested in.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're using an algorithm other than SHA-1, we need to specify the
algorithm in use so we don't get a failure with an "unknown format"
message. Add a wrapper function that specifies this header if required.
Skip specifying this header for SHA-1 to test that it works both with an
without this header.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to make this test work with SHA-256, offer an object-format
capability so that both sides use the same algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-sha1-array uses the_hash_algo under the hood. Since t0064 wants to
use the value that is correct for the hash algorithm that we're testing,
make sure the test helper initializes the repository to set
the_hash_algo correctly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Normally, the remote-curl transport helper is aware of the hash
algorithm we're using because we're in a repo with the appropriate hash
algorithm set. However, when using git ls-remote outside of a
repository, we won't have initialized the hash algorithm properly, so
use hash_to_hex_algop to print the ref corresponding to the algorithm
we've detected.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When outside a repository, git index-pack is unable to guess the hash
algorithm in use for a pack, since packs don't contain any information
on the algorithm in use. Pass an option to index-pack to help it out in
this test.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git index-pack is usually run in a repository, but need not be. Since
packs don't contains information on the algorithm in use, instead
relying on context, add an option to index-pack to tell it which one
we're using in case someone runs it outside of a repository. Since
using --stdin necessarily implies a repository, don't allow specifying
an object format if it's provided to prevent users from passing an
option that won't work. Add documentation for this option.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading the info/refs file for a repository, we have no explicit
way to detect which hash algorithm is in use because the file doesn't
provide one. Detect the hash algorithm in use by the size of the first
object ID.
If we have an empty repository, we don't know what the hash algorithm is
on the remote side, so default to whatever the local side has
configured. Without doing this, we cannot clone an empty repository
since we don't know its hash algorithm. Test this case appropriately,
since we currently have no tests for cloning an empty repository with
the dumb HTTP protocol.
We anonymize the URL like elsewhere in the function in case the user has
decided to include a secret in the URL.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf_write_fd was only used in bugreport.c. Since that file now uses
write_in_full, this method is no longer needed. In addition, strbuf_write_fd
did not guard against exceeding MAX_IO_SIZE for the platform, nor
provided error handling in the event of a failure if only partial data
was written to the file descriptor. Since already write_in_full has this
capability and is in general use, it should be used instead. The change
impacts strbuf.c and strbuf.h.
Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strbuf_write_fd method did not provide checks for buffers larger
than MAX_IO_SIZE. Replacing with write_in_full ensures the entire
buffer will always be written to disk or report an error and die.
Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a `GIT_TEST_*` environment variable and a `core.` config
setting (with the former taking precendence over the latter) to allow
overriding what name Git uses by default as main branch of new
repositories.
Now that all kinds of Git operations have learned to respect those,
let's document them.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When working with an anonymized repo, it can be useful to be able to
refer to particular paths. E.g., reproducing a bug with "git rev-list --
foo.c" in the original repo would need to replace "foo.c" with its
anonymized counterpart to produce the same effect.
We recently taught fast-export to dump the refname mapping. Let's do the
same thing for paths, which can reuse most of the same infrastructure.
Note that the output format isn't unambiguous here (because paths could
contain spaces). That's OK because this is meant to be examined by a
human.
We could also just introduce a "dump mapping" file that shows every
mapping we make. But it would be a bit more awkward to work with, as the
user would have to sort through more data to find the parts they're
interested in (and there are likely to be many more paths than refnames,
making it annoying for people who just want to dump the refnames).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running "fast-export --anonymize" will leave "refs/heads/master"
untouched in the output, for two reasons:
- it helped to have some known reference point between the original
and anonymized repository
- since it's historically the default branch name, it doesn't leak any
information
Now that we can ask fast-export to dump the anonymized ref mapping, we
have a much better tool for the first one (because it works for _any_
ref, not just master).
For the second, the notion of "default branch name" is likely to become
configurable soon, at which point the name _does_ leak information.
Let's drop this special case in preparation.
Note that we have to adjust the test a bit, since it relied on using the
name "master" in the anonymized repos. But this gives us a good
opportunity to further test the new dumping feature.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After you anonymize a repository, it can be hard to find which commits
correspond between the original and the result, and thus hard to
reproduce commands that triggered bugs in the original.
Let's make it possible to dump the mapping separate from the output
stream. This can be used by a bug reporter to modify their reproduction
recipe without revealing the original names (see the example in the
documentation).
The implementation is slightly non-obvious. There's no point in the
program where we know the complete set of refs we're going to anonymize.
Nor do we have a complete set of anonymized refs after finishing (we
have a set of anonymized ref path components, but no knowledge of how
those are assembled into complete refs). So we lazily write to the dump
file as we anonymize each name, and keep a list of ones that we've
output in order to avoid duplicates.
Some possible alternatives:
- we could just output the mapping of anonymized components (e.g.,
that "foo" became "ref123"). That works OK when you have short
refnames (e.g., "refs/heads/foo" becomes "refs/heads/ref123"), but
longer names would require the user to look up each component to
assemble the result. For example, "refs/remotes/origin/jk/foo" might
become "refs/remotes/refs37/refs56/refs102".
- instead of dumping the mapping, the same problem could be solved by
allowing the user to leave some refs alone. So if you want to
reproduce "git rev-list branch~17..HEAD" in the anonymized repo, we
could allow something like:
git tag anon-one branch
git tag anon-two HEAD
git fast-export --anonymize --all \
--no-anonymize-ref=anon-one \
--no-anonymize-ref=anon-two \
>stream
and then presumably "git rev-list anon-one~17..anon-two" would
behave the same in the re-imported repository. This is more
convenient in some ways, but it does require modifying the
original repository. And the concept doesn't easily extend to
other fields (e.g., pathnames, which will be addressed in a
subsequent patch).
- we could dump before/after commit hashes; combined with rev-parse,
that could convert these cases (as well as ones using raw hashes).
But we don't actually know the anonymized commit hashes; we're just
generating a stream that will produce them in the anonymized repo.
- likewise, we probably could insert object names or other markers
into commit messages, blob contents, etc, in order to let a user
with the original repo figure out which parts correspond. But using
this gets complicated (I have to find my commits in the result with
"git log --all --grep" or similar). It also makes it less clear that
the anonymized repo didn't leak any information (because we are
relying on object ids being unguessable).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_pull() builds a commit_list to pass a single potential ancestor to
is_descendant_of(). The latter leaves the list intact. Release the
allocated memory after the call.
Leaking in cmd_*() isn't a big deal, but sets a bad example for other
users of is_descendant_of().
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ref_newer() builds a commit_list to pass a single potential ancestor to
is_descendant_of(). The latter leaves the list intact. Release the
allocated memory after the call.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level reference transactions used to update references are
currently completely opaque to the user. While certainly desirable in
most usecases, there are some which might want to hook into the
transaction to observe all queued reference updates as well as observing
the abortion or commit of a prepared transaction.
One such usecase would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.
The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.
Given the usecase described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, allowing to implement strong consistency for
reference updates via a single mechanism.
In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduce two new performance tests for git-update-refs(1). Run against
an empty repository, it produces the following results:
Test origin/master HEAD
--------------------------------------------------------------------
1400.2: update-ref 2.70(2.10+0.71) 2.71(2.10+0.73) +0.4%
1400.3: update-ref --stdin 0.21(0.09+0.11) 0.21(0.07+0.14) +0.0%
The performance test p1400.2 creates, updates and deletes a branch a
thousand times, thus averaging runtime of git-update-refs over 3000
invocations. p1400.3 instead calls `git-update-refs --stdin` three times
and queues a thousand creations, updates and deletes respectively.
As expected, p1400.3 consistently shows no noticeable impact, as for
each batch of updates there's a single call to access(3P) for the
negative hook lookup. On the other hand, for p1400.2, one can see an
impact caused by this patchset. But doing five runs of the performance
tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead
ranged from -1.5% to +1.1%. These inconsistent performance numbers can
be explained by the overhead of spawning 3000 processes. This shows that
the overhead of assembling the hook path and executing access(3P) once
to check if it's there is mostly outweighed by the operating system's
overhead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git branches have been qualified as topic branches, integration branches,
development branches, feature branches, release branches and so on.
Git has a branch that is the master *for* development, but it is not
the master *of* any "slave branch": Git does not have slave branches,
and has never had, except for a single testcase that claims otherwise. :)
Independent of any future change to the naming of the "master" branch,
removing this sole appearance of the term is a strict improvement: it
avoids divisive language, and talking about "feature branch" clarifies
which developer workflow the test is trying to emulate.
Reported-by: Till Maas <tmaas@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "fetch/clone" protocol has been updated to allow the server to
instruct the clients to grab pre-packaged packfile(s) in addition
to the packed object data coming over the wire.
* jt/cdn-offload:
upload-pack: fix a sparse '0 as NULL pointer' warning
upload-pack: send part of packfile response as uri
fetch-pack: support more than one pack lockfile
upload-pack: refactor reading of pack-objects out
Documentation: add Packfile URIs design doc
Documentation: order protocol v2 sections
http-fetch: support fetching packfiles by URL
http-fetch: refactor into function
http: refactor finish_http_pack_request()
http: use --stdin when indexing dumb HTTP pack
Rewrite of parts of the scripted "git submodule" Porcelain command
continues; this time it is "git submodule set-branch" subcommand's
turn.
* ss/submodule-set-branch-in-c:
submodule: port subcommand 'set-branch' from shell to C
"git merge-base --is-ancestor" is taught to take advantage of the
commit graph.
* ds/merge-base-is-ancestor-optim:
commit-reach: use fast logic in repo_in_merge_base
commit-reach: create repo_is_descendant_of()
Code clean-up around "git branch" with a minor bugfix.
* dl/branch-cleanup:
branch: don't mix --edit-description
t3200: test for specific errors
t3200: rename "expected" to "expect"
The `diff-files' command and related commands which call `cmd_diff_files()',
consider the "intent-to-add" files as a part of the index when comparing the
work-tree against it. This was previously addressed in [1] and [2] by turning
the option `--ita-invisible-in-index' (introduced in [3]) on by default.
For `diff-files' (and `add -p' as a consequence) to show the i-t-a files as
as new, `ita_invisible_in_index' will be enabled by default here as well.
[1] 0231ae71d3 (diff: turn --ita-invisible-in-index on by default, 2018-05-26)
[2] 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist in
index", 2016-10-24)
[3] b42b451919 (diff: add --ita-[in]visible-in-index, 2016-10-24)
Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A comment in cmd_diff() states that if one tree-ish and no blobs are
provided, (the "N=1, M=0" case), it will provide a diff between the tree
and the cache. This is incorrect because a diff happens between the
tree-ish and the working tree. Remove the `--cached` in the comment so
that the correct behavior is shown. Add a new section describing the
"N=1, M=0, --cached" behavior.
Next, describe the "N=0, M=0, --cached" case, similar to the above since
it is undocumented.
Finally, fix some spacing issues. Add spaces between each section for
consistency and readability. Also, change tabs within the comment into
spaces.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current git prompt includes a lot of possible state information from
cherry-picks, merges, bisects, and various flavors of rebases. Add
sparsity as another state flavor (though one which can be present
simultaneously with any of rebase/cherry-pick/merge/bisect). This extra
state is shown with an extra
|SPARSE
substring before the other states, providing a prompt that looks like:
(branchname|SPARSE|REBASE 6/10)
The reason for showing the "|SPARSE" substring before other states is to
emphasize those other states. Sparsity is probably not going to change
much within a repository, while temporary operations will. So we want
the state changes related to temporary operations to be listed last, to
make them appear closer to where the user types and make them more
likely to be noticed.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of the early feedback of folks trying out sparse-checkouts at
$dayjob is that sparse checkouts can sometimes be disorienting; users
can forget that they had a sparse-checkout and then wonder where files
went. Add some output to 'git status' in the form of a simple line that
states:
You are in a sparse checkout with 35% of files present.
where, obviously, the exact figure changes depending on what percentage
of files from the index do not have the SKIP_WORKTREE bit set.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we run a test helper function in test_submodule_switch_common(), we
sometimes specify a whole helper function as the $command. When we do
this, in some test cases, we just mark the whole function with
`test_must_fail`. However, it's possible that the helper function might
fail earlier or later than expected due to an introduced bug. If this
happens, then the test case will still report as passing but it should
really be marked as failing since it didn't actually display the
intended behaviour.
Instead of invoking $command as one monolithic helper function, break it
up into three parts:
1. $command which is always a git command.
2. $before which is a callback function that runs just prior to
$command.
3. $after which is a callback function that runs just after
$command.
If the command requires a filename argument, specify it as `\$arg` since
that variable will be set and the whole $command string will be eval'd.
Unfortunately, there is no way to get rid of the eval as some of the
commands that are passed (such as the `git pull` tests) require that no
additional arguments are passed so we must have some mechanism for the
caller to specify whether or not it wants the filename argument.
The $before and $after callback functions will be passed the filename as
the first arg. These callback functions are optional and, if missing,
will be replaced with `true`. Also, in the case where we have a
`test_must_fail` test, $after will not be executed, similar to how the
helper functions currently behave when the git command fails and exits
the &&-chain.
Finally, as an added bonus, `test_must_fail` will only run on $command
which is guaranteed to be a git command.
An alternate design was considered where $OVERWRITING_FAIL is set from
test_submodule_switch_common() and exposed to the helper function. This
approach was considered too difficult to understand due to the fact that
using a signalling magic environment variable might be too indirect.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up in the codepath that serves "git fetch" continues.
* cc/upload-pack-data-3:
upload-pack: refactor common code into do_got_oid()
upload-pack: move oldest_have to upload_pack_data
upload-pack: pass upload_pack_data to got_oid()
upload-pack: pass upload_pack_data to ok_to_give_up()
upload-pack: pass upload_pack_data to send_acks()
upload-pack: pass upload_pack_data to process_haves()
upload-pack: change allow_unadvertised_object_request to an enum
upload-pack: move allow_unadvertised_object_request to upload_pack_data
upload-pack: move extra_edge_obj to upload_pack_data
upload-pack: move shallow_nr to upload_pack_data
upload-pack: pass upload_pack_data to send_unshallow()
upload-pack: pass upload_pack_data to deepen_by_rev_list()
upload-pack: pass upload_pack_data to deepen()
upload-pack: pass upload_pack_data to send_shallow_list()
"git diff" used to take arguments in random and nonsense range
notation, e.g. "git diff A..B C", "git diff A..B C...D", etc.,
which has been cleaned up.
* ct/diff-with-merge-base-clarification:
Documentation: usage for diff combined commits
git diff: improve range handling
t/t3430: avoid undefined git diff behavior
"git clean" code clean-up that resulted in a fix of recent
performance regression.
* en/clean-cleanups:
clean: optimize and document cases where we recurse into subdirectories
clean: consolidate handling of ignored parameters
dir, clean: avoid disallowed behavior
dir: fix a few confusing comments
The effect of sparse checkout settings on submodules is documented.
* en/sparse-with-submodule-doc:
git-sparse-checkout: clarify interactions with submodules
The same worktree directory must be registered only once, but
"git worktree move" allowed this invariant to be violated, which
has been corrected.
* es/worktree-duplicate-paths:
worktree: make "move" refuse to move atop missing registered worktree
worktree: generalize candidate worktree path validation
worktree: prune linked worktree referencing main worktree path
worktree: prune duplicate entries referencing same worktree path
worktree: make high-level pruning re-usable
worktree: give "should be pruned?" function more meaningful name
worktree: factor out repeated string literal
The interface to redact sensitive information in the trace output
has been simplified.
* jt/redact-all-cookies:
http: redact all cookies, teach GIT_TRACE_REDACT=0
Further code clean-up.
* cc/upload-pack-data-2:
upload-pack: move pack_objects_hook to upload_pack_data
upload-pack: move allow_sideband_all to upload_pack_data
upload-pack: move allow_ref_in_want to upload_pack_data
upload-pack: move allow_filter to upload_pack_data
upload-pack: move keepalive to upload_pack_data
upload-pack: pass upload_pack_data to upload_pack_config()
upload-pack: change multi_ack to an enum
upload-pack: move multi_ack to upload_pack_data
upload-pack: move filter_capability_requested to upload_pack_data
upload-pack: move use_sideband to upload_pack_data
upload-pack: move static vars to upload_pack_data
upload-pack: annotate upload_pack_data fields
upload-pack: actually use some upload_pack_data bitfields
Use of negative pathspec, while collecting paths including
untracked ones in the working tree, was broken.
* en/do-match-pathspec-fix:
dir: fix treatment of negated pathspecs
The behaviour of "sparse-checkout" in the state "git clone
--no-checkout" left was changed accidentally in 2.27, which has
been corrected.
* en/sparse-checkout:
sparse-checkout: avoid staging deletions of all files
The reflog entries for "git clone" and "git fetch" did not
anonymize the URL they operated on.
* js/reflog-anonymize-for-clone-and-fetch:
clone/fetch: anonymize URLs in the reflog
Reduce memory usage during "diff --quiet" in a worktree with too
many stat-unmatched paths.
* jk/diff-memuse-optim-with-stat-unmatch:
diff: discard blob data from stat-unmatched pairs
In an earlier patch, multiple struct acccesses to `graph_pos` and
`generation` were auto-converted to multiple method calls.
Since the values are fixed and commit-slab access costly, we would be
better off with storing the values as a local variable and reusing it.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We remove members `graph_pos` and `generation` from the struct commit.
The default assignments in init_commit_node() are no longer valid,
which is fine as the slab helpers return appropriate default values and
the assignments are removed.
We will replace existing use of commit->generation and commit->graph_pos
by commit_graph_data_slab helpers using
`contrib/coccinelle/commit.cocci'.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The struct commit is used in many contexts. However, members
`generation` and `graph_pos` are only used for commit-graph related
operations and otherwise waste memory.
This wastage would have been more pronounced as we transition to
generation number v2, which uses 64-bit generation number instead of
current 32-bits.
As they are often accessed together, let's introduce struct
commit_graph_data and move them to a commit_graph_data slab.
While the overall test suite runs just as fast as master,
(series: 26m48s, master: 27m34s, faster by 2.87%), certain commands
like `git merge-base --is-ancestor` were slowed by 40% as discovered
by Szeder Gábor [1]. After minimizing commit-slab access, the slow down
persists but is closer to 20%.
Derrick Stolee believes the slow down is attributable to the underlying
algorithm rather than the slowness of commit-slab access [2] and we will
follow-up in a later series.
[1]: https://lore.kernel.org/git/20200607195347.GA8232@szeder.dev/
[2]: https://lore.kernel.org/git/13db757a-9412-7f1e-805c-8a028c4ab2b1@gmail.com/
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
14ba97f8 (alloc: allow arbitrary repositories for alloc functions,
2018-05-15) introduced parsed_object_pool->commit_count to keep count of
commits per repository and was used to assign commit->index.
However, commit-slab code requires commit->index values to be unique
and a global count would be correct, rather than a per-repo count.
Let's introduce a static counter variable, `parsed_commits_count` to
keep track of parsed commits so far.
As commit_count has no use anymore, let's also drop it from the struct.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This command dumps individual tables or a stack of of tables.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Add GIT_TEST_REFTABLE environment var to control default ref storage
* Add test_prerequisite REFTABLE.
* Skip some tests that are incompatible:
* t3210-pack-refs.sh - does not apply
* t9903-bash-prompt - The bash mode reads .git/HEAD directly
* t1450-fsck.sh - manipulates .git/ directly to create invalid state
Major test failures:
* t1400-update-ref.sh - Reads from .git/{refs,logs} directly
* t1404-update-ref-errors.sh - Manipulates .git/refs/ directly
* t1405 - inspecs .git/ directly.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows Git to be compiled via Visual Studio again after integrating
the `hn/reftable` branch.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When set in the environment, GIT_DEBUG_REFS makes git print operations and
results as they flow through the ref storage backend. This helps debug
discrepancies between different ref backends.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unittests are under reftable/*_test.c, so all of the reftable code stays in
one directory. They are called from t/helpers/test-reftable.c in t0031-reftable.sh
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For background, see the previous commit introducing the library.
This introduces the refs/reftable-backend.c containing reftable powered ref
storage backend.
It can be activated by passing --ref-storage=reftable to "git init".
TODO:
* Fix worktree commands
* Spots marked XXX
Example use: see t/t0031-reftable.sh
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reftable is a format for storing the ref database. Its rationale and
specification is in the preceding commit.
This imports the upstream library as one big commit. For understanding
the code, it is suggested to read the code in the following order:
* The specification under Documentation/technical/reftable.txt
* reftable.h - the public API
* record.{c,h} - reading and writing records
* block.{c,h} - reading and writing blocks.
* writer.{c,h} - writing a complete reftable file.
* merged.{c,h} and pq.{c,h} - reading a stack of reftables
* stack.{c,h} - writing and compacting stacks of reftable on the
filesystem.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This happens implicitly in the files/packed ref backend; making it
explicit simplifies adding alternate ref storage backends, such as
reftable.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
REF_LOG_ONLY is used in the transaction preparation: if a symref is involved in
a transaction, the referent of the symref should be updated, and the symref
itself should only be updated in the reflog. Other ref backends will need to
duplicate this logic.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check for existence and delete CHERRY_PICK_HEAD through pseudo ref functions.
This will help cherry-pick work with alternate ref storage backends.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both the git-bisect.sh as bisect--helper inspected the file system directly.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pseudorefs store transient data in in the repository. Examples are HEAD,
CHERRY_PICK_HEAD, etc.
These refs have always been read through the ref backends, but they were written
in a one-off routine that wrote an object ID or symref directly into
.git/<pseudo_ref_name>.
This causes problems when introducing a new ref storage backend. To remedy this,
extend the ref backend implementation with a write_pseudoref_fn and
update_pseudoref_fn.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reftable precisely reproduces the given message. This leads to differences,
because the files backend implicitly adds a trailing '\n' to all messages.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changed-path Bloom filters were released in v2.27.0, but have a
significant drawback. A user can opt-in to writing the changed-path
filters using the "--changed-paths" option to "git commit-graph write"
but the next write will drop the filters unless that option is
specified.
This becomes even more important when considering the interaction with
gc.writeCommitGraph (on by default) or fetch.writeCommitGraph (part of
features.experimental). These config options trigger commit-graph writes
that the user did not signal, and hence there is no --changed-paths
option available.
Allow a user that opts-in to the changed-path filters to persist the
property of "my commit-graph has changed-path filters" automatically. A
user can drop filters using the --no-changed-paths option.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
43d3561 (commit-graph write: don't die if the existing graph is corrupt,
2019-03-25) introduced the GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD environment
variable. This was created to verify that commit-graph was not loaded
when writing a new non-incremental commit-graph.
An upcoming change wants to load a commit-graph in some valuable cases,
but we want to maintain that we don't trust the commit-graph data when
writing our new file. Instead of dying on load, instead die if we ever
try to parse a commit from the commit-graph. This functionally verifies
the same intended behavior, but allows a more advanced feature in the
next change.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original design of changed-path Bloom filters included an 8-byte
block size for filter lengths. This was changed mid-way through the
submission process, and now the length stored in the commit-graph has
one-byte granularity.
This can cause some issues for very small filters. The analysis for
false positive rates assume large filters, so rounding errors become
less important at that scale. When there are only a few paths changed,
a filter that has size only a few bytes could have very different
behavior. In fact, this is evidenced in the Git repository due to the
code organization and careful patch creation that leads to many commits
with very small filters. These small filters frequently have
false-positive rates in the 8-10% range or higher.
The previous change improved the false-positive rate using multiple
Bloom keys when the path has multiple directory components. However,
that does not help at all for files at root. It is typical to have
several commits that change only the README at root, and those commits
would be likely to have these artificially high false-positive rates.
Correct this issue by creating a minimum filters size of 8 bytes. This
requires the very small commits (with fewer than six changes, including
non-root directories) to have a larger filter. In principle, this
violates the bits_per_entry value of struct bloom_filter_settings.
However, it does not actually create a functional problem.
As for compatibility, this only affects new versions writing filters for
commits that do not yet have a filter. Old version will write the
smaller filters and this version will persist and properly read that
data. Now, the new files will be generated slightly larger.
Bytes before Bytes after Difference
--------------------------------------------------
git 4,021,078 4,275,311 +6.32%
linux 72,212,101 73,909,286 +2.35%
tensorflow 7,596,359 7,691,646 +1.25%
This has a measurable improvement in the false-positive rate and the
end-to-end run time for these repos. The table below compares the average
false-positive rate and runtime of
git rev-list HEAD -- "$path"
before and after this change for 5000+ randomly* selected paths from
each repository:
Average false Average Average
positive rate runtime runtime
before after before after difference
------------------------------------------------------------------
git 0.786% 0.227% 0.0387s 0.0289s -25.5%
linux 0.0296% 0.0174% 0.0766s 0.0706s -7.8%
tensorflow 0.6977% 0.0268% 0.0420s 0.0384s -8.5%
*Path selection was done with the following pipeline:
git ls-tree -r --name-only HEAD | sort -R | head -n 5000
These relatively-small increases in file size appear to be a fair price
to pay for these performance improvements.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The file 'dir/subdir/file' can only be modified if its leading
directories 'dir' and 'dir/subdir' are modified as well.
So when checking modified path Bloom filters looking for commits
modifying a path with multiple path components, then check not only
the full path in the Bloom filters, but all its leading directories as
well. Take care to check these paths in "deepest first" order,
because it's the full path that is least likely to be modified, and
the Bloom filter queries can short circuit sooner.
This can significantly reduce the average false positive rate, by
about an order of magnitude or three(!), and can further speed up
pathspec-limited revision walks. The table below compares the average
false positive rate and runtime of
git rev-list HEAD -- "$path"
before and after this change for 5000+ randomly* selected paths from
each repository:
Average false Average Average
positive rate runtime runtime
before after before after difference
------------------------------------------------------------------
git 3.220% 0.7853% 0.0558s 0.0387s -30.6%
linux 2.453% 0.0296% 0.1046s 0.0766s -26.8%
tensorflow 2.536% 0.6977% 0.0594s 0.0420s -29.2%
*Path selection was done with the following pipeline:
git ls-tree -r --name-only HEAD | sort -R | head -n 5000
The improvements in runtime are much smaller than the improvements in
average false positive rate, as we are clearly reaching diminishing
returns here. However, all these timings depend on that accessing
tree objects is reasonably fast (warm caches). If we had a partial
clone and the tree objects had to be fetched from a promisor remote,
e.g.:
$ git clone --filter=tree:0 --bare file://.../webkit.git webkit.notrees.git
$ git -C webkit.git -c core.modifiedPathBloomFilters=1 \
commit-graph write --reachable
$ cp webkit.git/objects/info/commit-graph webkit.notrees.git/objects/info/
$ git -C webkit.notrees.git -c core.modifiedPathBloomFilters=1 \
rev-list HEAD -- "$path"
then checking all leading path component can reduce the runtime from
over an hour to a few seconds (and this is with the clone and the
promisor on the same machine).
This adjusts the tracing values in t4216-log-bloom.sh, which provides a
concrete way to notice the improvement.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In my experience while experimenting with new commit-graph chunks,
early versions of the corresponding new write_commit_graph_my_chunk()
functions are, sadly but not surprisingly, often buggy, and write more
or less data than they are supposed to, especially if the chunk size
is not directly proportional to the number of commits. This then
causes all kinds of issues when reading such a bogus commit-graph
file, raising the question of whether the writing or the reading part
happens to be buggy this time.
Let's catch such issues early, already when writing the commit-graph
file, and check that each write_graph_chunk_*() function wrote the
amount of data that it was expected to, and what has been encoded in
the Chunk Lookup table. Now that all commit-graph chunks are written
in a loop we can do this check in a single place for all chunks, and
any chunks added in the future will get checked as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In write_commit_graph_file() we now have one block of code filling the
array of 'struct chunk_info' with the IDs and sizes of chunks to be
written, and an other block of code calling the functions responsible
for writing individual chunks. In case of optional chunks like Extra
Edge List an Base Graphs List there is also a condition checking
whether that chunk is necessary/desired, and that same condition is
repeated in both blocks of code. Other, newer chunks have similar
optional conditions.
Eliminate these repeated conditions by storing the function pointers
responsible for writing individual chunks in the 'struct chunk_info'
array as well, and calling them in a loop to write the commit-graph
file. This will open up the possibility for a bit of foolproofing in
the following patch.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the write_graph_chunk_*() helper functions to have the same
signature:
- Return an int error code from all these functions.
write_graph_chunk_base() already has an int error code, now the
others will have one, too, but since they don't indicate any
error, they will always return 0.
- Drop the hash size parameter of write_graph_chunk_oids() and
write_graph_chunk_data(); its value can be read directly from
'the_hash_algo' inside these functions as well.
This opens up the possibility for further cleanups and foolproofing in
the following two patches.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Place an instance of struct bloom_settings into the struct
write_commit_graph_context. This allows simplifying the function
prototype of write_graph_chunk_bloom_data(). This will allow us
to combine the function prototypes and use function pointers to
simplify write_commit_graph_file().
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repo_is_descendant_of() method is aware of the existence of the
commit-graph file. It checks for generation_numbers_enabled() before
deciding on using can_all_from_reach() or repo_in_merge_bases()
depending on the situation. The reason here is that can_all_from_reach()
uses a depth-first search that is limited by the minimum generation
number of the target commits, and that algorithm can be very slow when
generation numbers are not present. The alternative uses
paint_down_to_common() which will walk the entire merge-base boundary,
which is typically slower.
This method is used by commands like "git tag --contains" and "git
branch --contains" for very fast results when a commit-graph file
exists. Unfortunately, it is _not_ used in commands like "git merge-base
--is-ancestor" which is doing an even simpler request.
This issue was raised recently [1] with respect to a change to how
generation numbers are stored, but was also reported much earlier [2]
before commit-reach.c existed to simplify these reachability queries.
[1] https://lore.kernel.org/git/20200607195347.GA8232@szeder.dev/
[2] https://lore.kernel.org/git/87608bawoa.fsf@evledraar.gmail.com/
The root cause is that builtin/merge-base.c has a method
handle_is_ancestor() that calls in_merge_bases(), an older version of
repo_in_merge_bases(). It would be better if we have every caller to
in_merge_bases() use the logic in can_all_from_reach() when possible.
This is where things get a little tricky: repo_is_descendant_of() calls
repo_in_merge_bases() in the non-generation numbers enabled case! If we
simply update repo_in_merge_bases() to call repo_is_descendant_of()
instead of repo_in_merge_bases_many(), then we will get a recursive call
loop. Thankfully, this is caught by the test suite in the default mode
(i.e. GIT_TEST_COMMIT_GRAPH=0).
The trick, then, is to make the non-generation number case for
repo_is_descendant_of() call repo_in_merge_bases_many() directly,
skipping the non-_many version. This allows us to take advantage of this
faster code path, when possible.
The easiest way to measure the performance impact is to test the
following command on the Linux kernel repository:
git merge-base --is-ancestor <A> <B>
| A | B | Time Before | Time After |
|------|------|-------------|------------|
| v3.0 | v5.7 | 0.459s | 0.028s |
| v4.0 | v5.7 | 0.267s | 0.021s |
| v5.0 | v5.7 | 0.074s | 0.013s |
Note that each of these samples return success. The old code performed
the same operation when <A> and <B> are swapped. However,
can_all_from_reach() will return immediately if the generation numbers
show that <A> has larger generation number than <B>. Thus, the time for
the swapped case is universally 0.004s in each case.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The next change will make repo_in_merge_bases() depend on the logic in
is_descendant_of(), but we need to make the method independent of
the_repository first.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git branch` accepts `--edit-description` in conjunction with other
arguments. However, `--edit-description` is its own mode, similar to
`--set-upstream-to`, which is also made mutually exclusive with other
modes. Prevent `--edit-description` from being mixed with other modes.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the "--set-upstream-to" and "--unset-upstream" tests, specific error
conditions are being tested. However, there is no way of ensuring that a
test case is failing because of some specific error.
Check stderr of failing commands to ensure that they are failing in the
expected way.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clean up style of test by changing some filenames from "expected" to
"expect", which follows typical test convention.
Also, change a space-indent into a tab-indent.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6b1db43109 ("clean: teach clean -d to preserve ignored paths",
2017-05-23) added the following code block (among others) to git-clean:
if (remove_directories)
dir.flags |= DIR_SHOW_IGNORED_TOO | DIR_KEEP_UNTRACKED_CONTENTS;
The reason for these flags is well documented in the commit message, but
isn't obvious just from looking at the code. Add some explanations to
the code to make it clearer.
Further, it appears git-2.26 did not correctly handle this combination
of flags from git-clean. With both these flags and without
DIR_SHOW_IGNORED_TOO_MODE_MATCHING set, git is supposed to recurse into
all untracked AND ignored directories. git-2.26.0 clearly was not doing
that. I don't know the full reasons for that or whether git < 2.27.0
had additional unknown bugs because of that misbehavior, because I don't
feel it's worth digging into. As per the huge changes and craziness
documented in commit 8d92fb2927 ("dir: replace exponential algorithm
with a linear one", 2020-04-01), the old algorithm was a mess and was
thrown out. What I can say is that git-2.27.0 correctly recurses into
untracked AND ignored directories with that combination.
However, in clean's case we don't need to recurse into ignored
directories; that is just a waste of time. Thus, when git-2.27.0
started correctly handling those flags, we got a performance regression
report. Rather than relying on other bugs in fill_directory()'s former
logic to provide the behavior of skipping ignored directories, make use
of the DIR_SHOW_IGNORED_TOO_MODE_MATCHING value specifically added in
commit eec0f7f2b7 ("status: add option to show ignored files
differently", 2017-10-30) for this purpose.
Reported-by: Brian Malehorn <bmalehorn@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I spent a long time trying to figure out how and whether the code worked
with different values of ignore, ignore_only, and remove_directories.
After lots of time setting up lots of testcases, sifting through lots of
print statements, and walking through the debugger, I finally realized
that one piece of code related to how it was all setup was found in
clean.c rather than dir.c. Make a change that would have made it easier
for me to do the extra testing by putting this handling in one spot.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.h documented quite clearly that DIR_SHOW_IGNORED and
DIR_SHOW_IGNORED_TOO are mutually exclusive, with a big comment to this
effect by the definition of both enum values. However, a command like
git clean -fx $DIR
would set both values for dir.flags. I _think_ it happened to work
because:
* As dir.h points out, DIR_KEEP_UNTRACKED_CONTENTS only takes effect
if DIR_SHOW_IGNORED_TOO is set.
* As coded, I believe DIR_SHOW_IGNORED would just happen to take
precedence over DIR_SHOW_IGNORED_TOO in the code as currently
constructed.
Which is a long way of saying "we just got lucky".
Fix clean.c to avoid setting these mutually exclusive values at the same
time, and add a check to dir.c that will throw a BUG() to prevent anyone
else from making this mistake.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ignoring the sparse-checkout feature momentarily, if one has a submodule and
creates local branches within it with unpushed changes and maybe adds some
untracked files to it, then we would want to avoid accidentally removing such
a submodule. So, for example with git.git, if you run
git checkout v2.13.0
then the sha1collisiondetection/ submodule is NOT removed even though it
did not exist as a submodule until v2.14.0. Similarly, if you only had
v2.13.0 checked out previously and ran
git checkout v2.14.0
the sha1collisiondetection/ submodule would NOT be automatically
initialized despite being part of v2.14.0. In both cases, git requires
submodules to be initialized or deinitialized separately. Further, we
also have special handling for submodules in other commands such as
clean, which requires two --force flags to delete untracked submodules,
and some commands have a --recurse-submodules flag.
sparse-checkout is very similar to checkout, as evidenced by the similar
name -- it adds and removes files from the working copy. However, for
the same avoid-data-loss reasons we do not want to remove a submodule
from the working copy with checkout, we do not want to do it with
sparse-checkout either. So submodules need to be separately initialized
or deinitialized; changing sparse-checkout rules should not
automatically trigger the removal or vivification of submodules.
I believe the previous wording in git-sparse-checkout.txt about
submodules was only about this particular issue. Unfortunately, the
previous wording could be interpreted to imply that submodules should be
considered active regardless of sparsity patterns. Update the wording
to avoid making such an implication. It may be helpful to consider two
example situations where the differences in wording become important:
In the future, we want users to be able to run commands like
git clone --sparse=moduleA --recurse-submodules $REPO_URL
and have sparsity paths automatically set up and have submodules *within
the sparsity paths* be automatically initialized. We do not want all
submodules in any path to be automatically initialized with that
command.
Similarly, we want to be able to do things like
git -c sparse.restrictCmds grep --recurse-submodules $REV $PATTERN
and search through $REV for $PATTERN within the recorded sparsity
patterns. We want it to recurse into submodules within those sparsity
patterns, but do not want to recurse into directories that do not match
the sparsity patterns in search of a possible submodule.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Preliminary clean-ups around refs API, plus file format
specification documentation for the reftable backend.
* hn/refs-cleanup:
reftable: define version 2 of the spec to accomodate SHA256
reftable: clarify how empty tables should be written
reftable: file format documentation
refs: improve documentation for ref iterator
t: use update-ref and show-ref to reading/writing refs
refs.h: clarify reflog iteration order
Teach .github/workflows/main.yml to use CMake for VS builds.
Modified the vs-test step to match windows-test step. This speeds
up the vs-test. Calling git-cmd from powershell and then calling git-bash
to perform the tests slows things down(factor of about 6). So git-bash
is directly called from powershell to perform the tests using prove.
NOTE: Since GitHub keeps the same directory for each job
(with respect to path) absolute paths are used in the bin-wrapper
scripts.
GitHub has switched to CMake 3.17.1 which changed the behaviour of
FindCURL module. An extra definition (-DCURL_NO_CURL_CMAKE=ON) has been
added to revert to the old behaviour.
In the configuration phase CMake looks for the required libraries for
building git (eg zlib,libiconv). So we extract the libraries before we
configure.
To check for ICONV_OMITS_BOM libiconv.dll needs to be in the working
directory of script or path. So we copy the dlls before we configure.
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds support for Visual Studio and Clang builds
The minimum required version of CMake is upgraded to 3.15 because
this version offers proper support for Clang builds on Windows.
Libintl is not searched for when building with Visual Studio or Clang
because there is no binary compatible version available yet.
NOTE: In the link options invalidcontinue.obj has to be included.
The reason for this is because by default, Windows calls abort()'s
instead of setting errno=EINVAL when invalid arguments are passed to
standard functions.
This commit explains it in detail:
4b623d80f7
On Windows the default generator is Visual Studio,so for Visual Studio
builds do this:
cmake `relative-path-to-srcdir`
NOTE: Visual Studio generator is a multi config generator, which means
that Debug and Release builds can be done on the same build directory.
For Clang builds do this:
On bash
CC=clang cmake `relative-path-to-srcdir` -G Ninja
-DCMAKE_BUILD_TYPE=[Debug or Release]
On cmd
set CC=Clang
cmake `relative-path-to-srcdir` -G Ninja
-DCMAKE_BUILD_TYPE=[Debug or Release]
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch facilitates building git on Windows with CMake using MinGW
NOTE: The funtions unsetenv and hstrerror are not checked in Windows
builds.
Reasons
NO_UNSETENV is not compatible with Windows builds.
lines 262-264 compat/mingw.h
compat/mingw.h(line 25) provides a definition of hstrerror which
conflicts with the definition provided in
git-compat-util.h(lines 733-736).
To use CMake on Windows with MinGW do this:
cmake `relative-path-to-srcdir` -G "MinGW Makefiles"
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch allows git to be tested when performin out of source builds.
This involves changing GIT_BUILD_DIR in t/test-lib.sh to point to the
build directory. Also some miscellaneous copies from the source directory
to the build directory.
The copies are:
t/chainlint.sed needed by a bunch of test scripts
po/is.po needed by t0204-gettext-rencode-sanity
mergetools/tkdiff needed by t7800-difftool
contrib/completion/git-prompt.sh needed by t9903-bash-prompt
contrib/completion/git-completion.bash needed by t9902-completion
contrib/svn-fe/svnrdump_sim.py needed by t9020-remote-svn
NOTE: t/test-lib.sh is only modified when tests are run not during
the build or configure.
The trash directory is still srcdir/t
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch provides an alternate way to test git using ctest.
CTest ships with CMake, so there is no additional dependency being
introduced.
To perform the tests with ctest do this after building:
ctest -j[number of jobs]
NOTE: -j is optional, the default number of jobs is 1
Each of the jobs does this:
cd t/ && sh t[something].sh
The reason for using CTest is that it logs the output of the tests
in a neat way, which can be helpful during diagnosis of failures.
After the tests have run ctest generates three log files located in
`build-directory`/Testing/Temporary/
These log files are:
CTestCostData.txt:
This file contains the time taken to complete each test.
LastTestsFailed.log:
This log file contains the names of the tests that have failed in the
run.
LastTest.log:
This log file contains the log of all the tests that have run.
A snippet of the file is given below.
10/901 Testing: D:/my/git-master/t/t0009-prio-queue.sh
10/901 Test: D:/my/git-master/t/t0009-prio-queue.sh
Command: "sh.exe" "D:/my/git-master/t/t0009-prio-queue.sh"
Directory: D:/my/git-master/t
"D:/my/git-master/t/t0009-prio-queue.sh"
Output:
----------------------------------------------------------
ok 1 - basic ordering
ok 2 - mixed put and get
ok 3 - notice empty queue
ok 4 - stack order
passed all 4 test(s)
1..4
<end of output>
Test time = 1.11 sec
NOTE: Testing only works when building in source for now.
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Install the built binaries and scripts using CMake
This is very similar to `make install`.
By default the destination directory(DESTDIR) is /usr/local/ on Linux
To set a custom installation path do this:
cmake `relative-path-to-srcdir`
-DCMAKE_INSTALL_PREFIX=`preferred-install-path`
Then run `make install`
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the placeholder substitution to generate scripted
Porcelain commands, e.g. git-request-pull out of
git-request-pull.sh
Generate shell/perl/python scripts and template using CMake instead of
using sed like the build procedure in the Makefile does.
The text translations are only build if `msgfmt` is found in your path.
NOTE: The scripts and templates are generated during configuration.
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the moment, the recommended way to configure Git's builds is to
simply run `make`. If that does not work, the recommended strategy is to
look at the top of the `Makefile` to see whether any "Makefile knob" has
to be turned on/off, e.g. `make NO_OPENSSL=YesPlease`.
Alternatively, Git also has an `autoconf` setup which allows configuring
builds via `./configure [<option>...]`.
Both of these options are fine if the developer works on Unix or Linux.
But on Windows, we have to jump through hoops to configure a build
(read: we force the user to install a full Git for Windows SDK, which
occupies around two gigabytes (!) on disk and downloads about three
quarters of a gigabyte worth of Git objects).
The build infrastructure for Git is written around being able to run
make, which is not supported natively on Windows.
To help Windows developers a CMake build script is introduced here.
With a working support CMake, developers on Windows need only install
CMake, configure their build, load the generated Visual Studio solution
and immediately start modifying the code and build their own version of
Git. Likewise, developers on other platforms can use the convenient GUI
tools provided by CMake to configure their build.
So let's start building CMake support for Git.
This is only the first step, and to make it easier to review, it only
allows for configuring builds on the platform that is easiest to
configure for: Linux.
The CMake script checks whether the headers are present(eg. libgen.h),
whether the functions are present(eg. memmem), whether the funtions work
properly (eg. snprintf) and generate the required compile definitions
for the platform. The script also searches for the required libraries,
if it fails to find the required libraries the respective executables
won't be built.(eg. If libcurl is not found then git-remote-http won't
be built). This will help building Git easier.
With a CMake script an out of source build of git is possible resulting
in a clean source tree.
Note: this patch asks for the minimum version v3.14 of CMake (which is
not all that old as of time of writing) because that is the first
version to offer a platform-independent way to generate hardlinks as
part of the build. This is needed to generate all those hardlinks for
the built-in commands of Git.
Signed-off-by: Sibi Siddharthan <sibisiddharthan.github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When sparse checkout is enabled, some users expect the output of certain
commands (such as grep, diff, and log) to be also restricted within the
sparsity patterns. This would allow them to effectively work only on the
subset of files in which they are interested; and allow some commands to
possibly perform better, by not considering uninteresting paths. For
this reason, we taught grep to honor the sparsity patterns, in the
previous patch. But, on the other hand, allowing grep and the other
commands mentioned to optionally ignore the patterns also make for some
interesting use cases. E.g. using grep to search for a function
documentation that resides outside the sparse checkout.
In any case, there is no current way for users to configure the behavior
they want for these commands. Aiming to provide this flexibility, let's
introduce the sparse.restrictCmds setting (and the analogous
--[no]-restrict-to-sparse-paths global option). The default value is
true. For now, grep is the only one affected by this setting, but the
goal is to have support for more commands, in the future.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the main uses for a sparse checkout is to allow users to focus on
the subset of files in a repository in which they are interested. But
git-grep currently ignores the sparsity patterns and reports all matches
found outside this subset, which kind of goes in the opposite direction.
There are some use cases for ignoring the sparsity patterns and the next
commit will add an option to obtain this behavior, but here we start by
making grep honor the sparsity boundaries in every case where this is
relevant:
- git grep in worktree
- git grep --cached
- git grep $REVISION
For the worktree and cached cases, we iterate over paths without the
SKIP_WORKTREE bit set, and limit our searches to these paths. For the
$REVISION case, we limit the paths we search to those that match the
sparsity patterns. (We do not check the SKIP_WORKTREE bit for the
$REVISION case, because $REVISION may contain paths that do not exist in
HEAD and thus for which we have no SKIP_WORKTREE bit to consult. The
sparsity patterns tell us how the SKIP_WORKTREE bit would be set if we
were to check out $REVISION, so we consult those. Also, we don't use the
sparsity patterns with the worktree or cached cases, both because we
have a bit we can check directly and more efficiently, and because
unmerged entries from a merge or a rebase could cause more files to
temporarily be present than the sparsity patterns would normally
select.)
Note that there is a special case here: `git grep $TREE`. In this case,
we cannot know whether $TREE corresponds to the root of the repository
or some sub-tree, and thus there is no way for us to know which sparsity
patterns, if any, apply. So the $TREE case will not use sparsity
patterns or any SKIP_WORKTREE bits and will instead always search all
files within the $TREE.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the steps in do_git_config_sequence() is to load the
worktree-specific config file. Although the function receives a git_dir
string, it relies on git_pathdup(), which uses the_repository->git_dir,
to make the path to the file. Furthermore, it also checks that
extensions.worktreeConfig is set through the
repository_format_worktree_config variable, which refers to
the_repository only. Thus, when a submodule has worktree-specific
settings, a command executed in the superproject that recurses into the
submodule won't find the said settings.
This will be especially important in the next patch: git-grep will learn
to honor sparse checkouts and, when running with --recurse-submodules,
the submodule's sparse checkout settings must be loaded. As these
settings are stored in the config.worktree file, they would be ignored
without this patch. So let's fix this by reading the right
config.worktree file and extensions.worktreeConfig setting, based on the
git_dir and commondir paths given to do_git_config_sequence(). Also
add a test to avoid any regressions.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
test-config parses its arguments in an if-else chain, with one arm for
each available subcommand. Every arm expects (and checks) that argv
corresponds to something like "config <subcommand> [<subcommand args>]".
This means that whenever we want to change the syntax to accommodate a
new argument before <subcommand> (as we will do in the next patch), we
also need to increment the indexes accessing argv everywhere in the
if-else chain. This makes patches adding new options much noisier than
they need to be, besides being error-prone. So let's skip the "config"
argument in argv and argc to take the extra complexity out of such
patches (as the following one).
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test-config helper may exit with a variety of at least four
different codes, to reflect the status of the requested operations.
These codes are sometimes checked in the tests, but not all of the codes
are returned consistently by the helper: 1 will usually refer to a
"value not found", but usage errors can also return 1 or 128. Moreover,
128 is also expected on errors within the configset functions. These
inconsistent uses of the exit codes can lead to false positives in the
tests. Although all tests which expect errors and check the helper's
exit code currently also check the output, it's still better to
standardize the exit codes and avoid future problems in new tests.
While we are here, let's also check that we have the expected argc for
configset_get_value and configset_get_value_multi, before trying to use
argv.
Note: this change is implemented with the unification of the exit
labels. This might seem unnecessary, for now, but it will benefit the
next patch, which will increase the cleanup section.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since all invocations of test_submodule_forced_switch() are git
commands, automatically prepend "git" before invoking
test_submodule_switch_common().
Similarly, many invocations of test_submodule_switch() are also git
commands so automatically prepend "git" before invoking
test_submodule_switch_common() as well.
Finally, for invocations of test_submodule_switch() that invoke a custom
function, rename the old function to test_submodule_switch_func().
This is necessary because in a future commit, we will be adding some
logic that needs to distinguish between an invocation of a plain git
comamnd and an invocation of a test helper function.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document the usage for producing combined commits with "git diff".
This includes updating the synopsis section.
While here, add the three-dot notation to the synopsis.
Make "git diff -h" print the same usage summary as the manual
page synopsis, minus the "A..B" form, which is now discouraged.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git diff is given a symmetric difference A...B, it chooses
some merge base from the two specified commits (as documented).
This fails, however, if there is *no* merge base: instead, you
see the differences between A and B, which is certainly not what
is expected.
Moreover, if additional revisions are specified on the command
line ("git diff A...B C"), the results get a bit weird:
* If there is a symmetric difference merge base, this is used
as the left side of the diff. The last final ref is used as
the right side.
* If there is no merge base, the symmetric status is completely
lost. We will produce a combined diff instead.
Similar weirdness occurs if you use, e.g., "git diff C A...B D".
Likewise, using multiple two-dot ranges, or tossing extra
revision specifiers into the command line with two-dot ranges,
or mixing two and three dot ranges, all produce nonsense.
To avoid all this, add a routine to catch the range cases and
verify that that the arguments make sense. As a side effect,
produce a warning showing *which* merge base is being used when
there are multiple choices; die if there is no merge base.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As 'upload-pack.c' is now using 'struct upload_pack_data'
thoroughly, let's refactor some common code into a new
do_got_oid() function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'oldest_have' static variable
into this struct.
It is used by both protocol v0 and protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to got_oid(), so that
this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to ok_to_give_up(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_acks(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to process_haves(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move a static variable
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's change allow_unadvertised_object_request,
which is now part of 'upload_pack_data', from an 'unsigned int'
to an enum.
This will make it clear which values this variable can take.
While at it let's change this variable name to 'allow_uor' to
make it shorter.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_unadvertised_object_request'
static variable into this struct.
It is used by code common to protocol v0 and protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'extra_edge_obj' static variable
into this struct.
It is used by code common to protocol v0 and protocol v2.
While at it let's properly initialize and clear 'extra_edge_obj'
in the appropriate 'upload_pack_data' initialization and
clearing functions.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'shallow_nr' static variable
into this struct.
It is used by code common to protocol v0 and protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_unshallow(), so
that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to deepen_by_rev_list(),
so that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to deepen(), so that
this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_shallow_list(),
so that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach upload-pack to send part of its packfile response as URIs.
An administrator may configure a repository with one or more
"uploadpack.blobpackfileuri" lines, each line containing an OID, a pack
hash, and a URI. A client may configure fetch.uriprotocols to be a
comma-separated list of protocols that it is willing to use to fetch
additional packfiles - this list will be sent to the server. Whenever an
object with one of those OIDs would appear in the packfile transmitted
by upload-pack, the server may exclude that object, and instead send the
URI. The client will then download the packs referred to by those URIs
before performing the connectivity check.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever a fetch results in a packfile being downloaded, a .keep file is
generated, so that the packfile can be preserved (from, say, a running
"git repack") until refs are written referring to the contents of the
packfile.
In a subsequent patch, a successful fetch using protocol v2 may result
in more than one .keep file being generated. Therefore, teach
fetch_pack() and the transport mechanism to support multiple .keep
files.
Implementation notes:
- builtin/fetch-pack.c normally does not generate .keep files, and thus
is unaffected by this or future changes. However, it has an
undocumented "--lock-pack" feature, used by remote-curl.c when
implementing the "fetch" remote helper command. In keeping with the
remote helper protocol, only one "lock" line will ever be written;
the rest will result in warnings to stderr. However, in practice,
warnings will never be written because the remote-curl.c "fetch" is
only used for protocol v0/v1 (which will not generate multiple .keep
files). (Protocol v2 uses the "stateless-connect" command, not the
"fetch" command.)
- connected.c has an optimization in that connectivity checks on a ref
need not be done if the target object is in a pack known to be
self-contained and connected. If there are multiple packfiles, this
optimization can no longer be done.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Subsequent patches will change how the output of pack-objects is
processed, so extract that processing into its own function.
Currently, at most 1 character can be buffered (in the "buffered" local
variable). One of those patches will require a larger buffer, so replace
that "buffered" local variable with a buffer array.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current C Git implementation expects Git servers to follow a
specific order of sections when transmitting protocol v2 responses, but
this is not explicit in the documentation. Make the order explicit.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach http-fetch the ability to download packfiles directly, given a
URL, and to verify them.
The http_pack_request suite has been augmented with a function that
takes a URL directly. With this function, the hash is only used to
determine the name of the temporary file.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_main() in http-fetch.c will grow in a future patch, so refactor the
HTTP walking part into its own function.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
finish_http_pack_request() does multiple tasks, including some
housekeeping on a struct packed_git - (1) closing its index, (2)
removing it from a list, and (3) installing it. These concerns are
independent of fetching a pack through HTTP: they are there only because
(1) the calling code opens the pack's index before deciding to fetch it,
(2) the calling code maintains a list of packfiles that can be fetched,
and (3) the calling code fetches it in order to make use of its objects
in the same process.
In preparation for a subsequent commit, which adds a feature that does
not need any of this housekeeping, remove (1), (2), and (3) from
finish_http_pack_request(). (2) and (3) are now done by a helper
function, and (1) is the responsibility of the caller (in this patch,
done closer to the point where the pack index is opened).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git fetches a pack using dumb HTTP, (among other things) it invokes
index-pack on a ".pack.temp" packfile, specifying the filename as an
argument.
A future commit will require the aforementioned invocation of index-pack
to also generate a "keep" file. To use this, we either have to use
index-pack's naming convention (because --keep requires the pack's
filename to end with ".pack") or to pass the pack through stdin. Of the
two, it is simpler to pass the pack through stdin.
Thus, teach http to pass --stdin to index-pack. As a bonus, the code is
now simpler.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When formatting the commit message for merge commits, Git appends "into
<branch-name>" unless the current branch is the default branch.
Now that we can configure what the default branch name should be, we
will want to respect that setting in that scenario rather than using the
compiled-in default branch name.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning a repository without any branches, Git chooses a default
branch name for the as-yet unborn branch.
Now that we can configure what the default branch name should be, we
will want `git clone` to respect that setting.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To allow for overriding the default branch name, we have introduced a
config setting. With this patch, the `git submodule` command learns
about this, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the default branch name can now be configured, the `testsvn`
remote helper needs to be told about it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When mentioning the default branch name in an error message, we want to
go with the preference specified by the user.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When guessing the default branch name of a remote, and there are no refs
to guess from, we want to go with the preference specified by the user
for the fall-back.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a growing number of projects trying to avoid the non-inclusive
name `master` in their repositories. For existing repositories, this
requires manual work. For new repositories, the only way to do that
automatically is by copying all of Git's template directory, then
hard-coding the desired default branch name into the `.git/HEAD` file,
and then configuring `init.templateDir` to point to those copied
template files.
To make this process much less cumbersome, let's introduce support for
`core.defaultBranchName`. That way, users won't need to keep their
copied template files up to date, and won't interfere with default hooks
installed by their administrators.
While at it, also let users set the default branch name via the
environment variable `GIT_TEST_DEFAULT_BRANCH_NAME`, in preparation for
adjusting Git's test suite to a more inclusive default branch name. As
is common in Git, the `GIT_TEST_*` variable takes precedence over the
config setting.
Note: we use the prefix `core.` instead of `init.` because we want to
adjust also `git clone`, `git fmt-merge-msg` and other commands over the
course of the next commits to respect this setting.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Don Goodman-Wilson <don@goodman-wilson.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add" takes special care to avoid creating a new worktree
at a location already registered to an existing worktree even if that
worktree is missing (which can happen, for instance, if the worktree
resides on removable media). "git worktree move", however, is not so
careful when validating the destination location and will happily move
the source worktree atop the location of a missing worktree. This leads
to the anomalous situation of multiple worktrees being associated with
the same path, which is expressly forbidden by design. For example:
$ git clone foo.git
$ cd foo
$ git worktree add ../bar
$ git worktree add ../baz
$ rm -rf ../bar
$ git worktree move ../baz ../bar
$ git worktree list
.../foo beefd00f [master]
.../bar beefd00f [bar]
.../bar beefd00f [baz]
$ git worktree remove ../bar
fatal: validation failed, cannot remove working tree:
'.../bar' does not point back to '.git/worktrees/bar'
Fix this shortcoming by enhancing "git worktree move" to perform the
same additional validation of the destination directory as done by "git
worktree add".
While at it, add a test to verify that "git worktree move" won't move a
worktree atop an existing (non-worktree) path -- a restriction which has
always been in place but was never tested.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add" checks that the specified path is a valid location
for a new worktree by ensuring that the path does not already exist and
is not already registered to another worktree (a path can be registered
but missing, for instance, if it resides on removable media). Since "git
worktree add" is not the only command which should perform such
validation ("git worktree move" ought to also), generalize the the
validation function for use by other callers, as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree prune" detects when multiple entries are associated with
the same path and prunes the duplicates, however, it does not detect
when a linked worktree points at the path of the main worktree.
Although "git worktree add" disallows creating a new worktree with the
same path as the main worktree, such a case can arise outside the
control of Git even without the user mucking with .git/worktree/<id>/
administrative files. For instance:
$ git clone foo.git
$ git -C foo worktree add ../bar
$ rm -rf bar
$ mv foo bar
$ git -C bar worktree list
.../bar deadfeeb [master]
.../bar deadfeeb [bar]
Help the user recover from such corruption by extending "git worktree
prune" to also detect when a linked worktree is associated with the path
of the main worktree.
Reported-by: Jonathan Müller <jonathanmueller.dev@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A fundamental restriction of linked working trees is that there must
only ever be a single worktree associated with a particular path, thus
"git worktree add" explicitly disallows creation of a new worktree at
the same location as an existing registered worktree. Nevertheless,
users can still "shoot themselves in the foot" by mucking with
administrative files in .git/worktree/<id>/. Worse, "git worktree move"
is careless[1] and allows a worktree to be moved atop a registered but
missing worktree (which can happen, for instance, if the worktree is on
removable media). For instance:
$ git clone foo.git
$ cd foo
$ git worktree add ../bar
$ git worktree add ../baz
$ rm -rf ../bar
$ git worktree move ../baz ../bar
$ git worktree list
.../foo beefd00f [master]
.../bar beefd00f [bar]
.../bar beefd00f [baz]
Help users recover from this form of corruption by teaching "git
worktree prune" to detect when multiple worktrees are associated with
the same path.
[1]: A subsequent commit will fix "git worktree move" validation to be
more strict.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level logic for removing a worktree is well encapsulated in
delete_git_dir(). However, high-level details related to pruning a
worktree -- such as dealing with verbosity and dry-run mode -- are not
encapsulated. Factor out this high-level logic into its own function so
it can be re-used as new worktree corruption detectors are added.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Readers of the name prune_worktree() are likely to expect the function
to actually prune a worktree, however, it only answers the question
"should this worktree be pruned?". Give it a name more reflective of its
true purpose to avoid such confusion.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The autosquash-and-exec test used "git diff HEAD^!" to mean
"git diff HEAD^ HEAD". Use these directly instead of relying
on the undefined but actual-current behavior of "HEAD^!".
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Version appends a hash ID to the file header, making it slightly larger.
This commit also changes "SHA-1" into "object ID" in many places.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The format allows for some ambiguity, as a lone footer also starts
with a valid file header. However, the current JGit code will barf on
this. This commit codifies this behavior into the standard.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shawn Pearce explains:
Some repositories contain a lot of references (e.g. android at 866k,
rails at 31k). The reftable format provides:
- Near constant time lookup for any single reference, even when the
repository is cold and not in process or kernel cache.
- Near constant time verification if a SHA-1 is referred to by at least
one reference (for allow-tip-sha1-in-want).
- Efficient lookup of an entire namespace, such as `refs/tags/`.
- Support atomic push `O(size_of_update)` operations.
- Combine reflog storage with ref storage.
This file format spec was originally written in July, 2017 by Shawn
Pearce. Some refinements since then were made by Shawn and by Han-Wen
Nienhuys based on experiences implementing and experimenting with the
format. (All of this was in the context of our work at Google and
Google is happy to contribute the result to the Git project.)
Imported from JGit[1]'s current version (c217d33ff,
"Documentation/technical/reftable: improve repo layout", 2020-02-04)
of Documentation/technical/reftable.md and converted to asciidoc by
running
pandoc -t asciidoc -f markdown reftable.md >reftable.txt
using pandoc 2.2.1. The result required the following additional
minor changes:
- removed the [TOC] directive to add a table of contents, since
asciidoc does not support it
- replaced git-scm.com/docs links with linkgit: directives that link
to other pages within Git's documentation
[1] https://eclipse.googlesource.com/jgit/jgit
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite support for GIT_CURL_VERBOSE in terms of GIT_TRACE_CURL.
Looking good.
* jt/curl-verbose-on-trace-curl:
http, imap-send: stop using CURLOPT_VERBOSE
t5551: test that GIT_TRACE_CURL redacts password
Code clean-up.
* cc/upload-pack-data:
upload-pack: use upload_pack_data fields in receive_needs()
upload-pack: pass upload_pack_data to create_pack_file()
upload-pack: remove static variable 'stateless_rpc'
upload-pack: pass upload_pack_data to check_non_tip()
upload-pack: pass upload_pack_data to send_ref()
upload-pack: move symref to upload_pack_data
upload-pack: use upload_pack_data writer in receive_needs()
upload-pack: pass upload_pack_data to receive_needs()
upload-pack: pass upload_pack_data to get_common_commits()
upload-pack: use 'struct upload_pack_data' in upload_pack()
upload-pack: move 'struct upload_pack_data' around
upload-pack: move {want,have}_obj to upload_pack_data
upload-pack: remove unused 'wants' from upload_pack_data
The code to parse "git bisect start" command line was lax in
validating the arguments.
* cb/bisect-helper-parser-fix:
bisect--helper: avoid segfault with bad syntax in `start --term-*`
On-the-wire protocol v2 easily falls into a deadlock between the
remote-curl helper and the fetch-pack process when the server side
prematurely throws an error and disconnects. The communication has
been updated to make it more robust.
* dl/remote-curl-deadlock-fix:
stateless-connect: send response end packet
pkt-line: define PACKET_READ_RESPONSE_END
remote-curl: error on incomplete packet
pkt-line: extern packet_length()
transport: extract common fetch_pack() call
remote-curl: remove label indentation
remote-curl: fix typo
Code simplification and test coverage enhancement.
* bc/filter-process:
t2060: add a test for switch with --orphan and --discard-changes
builtin/checkout: simplify metadata initialization
The command line completion script (in contrib/) tried to complete
"git stash -p" as if it were "git stash push -p", but it was too
aggressive and also affected "git stash show -p", which has been
corrected.
* vs/complete-stash-show-p-fix:
completion: don't override given stash subcommand with -p
The check in "git fsck" to ensure that the tree objects are sorted
still had corner cases it missed unsorted entries.
* rs/fsck-duplicate-names-in-trees:
fsck: detect more in-tree d/f conflicts
t1450: demonstrate undetected in-tree d/f conflict
t1450: increase test coverage of in-tree d/f detection
fsck: fix a typo in a comment
"git bugreport" learns to report what shell is in use.
* es/bugreport-shell:
bugreport: include user interactive shell
help: add shell-path to --build-options
As FreeBSD is not the only platform whose regexp library reports
a REG_ILLSEQ error when fed invalid UTF-8, add logic to detect that
automatically and skip the affected tests.
* cb/t4210-illseq-auto-detect:
t4210: detect REG_ILLSEQ dynamically and skip affected tests
t/helper: teach test-regex to report pattern errors (like REG_ILLSEQ)
"git log -L..." now takes advantage of the "which paths are touched
by this commit?" info stored in the commit-graph system.
* ds/line-log-on-bloom:
line-log: integrate with changed-path Bloom filters
line-log: try to use generation number-based topo-ordering
line-log: more responsive, incremental 'git log -L'
t4211-line-log: add tests for parent oids
line-log: remove unused fields from 'struct line_log_data'
While the MyFirstContribution guide exists and has received some use and
positive reviews, it is still not as discoverable as it could be. Add a
reference to it from the GitHub pull request template, where many
brand-new contributors may look. Also add a reference to it in
SubmittingPatches, which is the central source of guidance for patch
contribution.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For each worktree removed by "git worktree prune", it reports the reason
for the removal. All reasons share the common prefix "Removing
worktrees/%s:". As new removal reasons are added, this prefix needs to
be duplicated, which is error-prone and potentially cumbersome.
Therefore, factor out the common prefix.
Although this change seems to increase the "sentence lego quotient", it
should be reasonably safe, as the reason for removal is a distinct
clause, not strictly related to the prefix. Moreover, the "worktrees" in
"Removing worktrees/%s:" is a path literal which ought not be localized,
so by factoring it out, we can more easily avoid exposing that path
fragment to translators.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unify the 'chunk_ids' and 'chunk_sizes' arrays into an array of
'struct chunk_info'. This will allow more cleanups in the following
patches.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In write_commit_graph_file() one block of code fills the array of
chunk IDs, another block of code fills the array of chunk offsets,
then the chunk IDs and offsets are written to the Chunk Lookup table,
and finally a third block of code writes the actual chunks. In case
of optional chunks like Extra Edge List and Base Graphs List there is
also a condition checking whether that chunk is necessary/desired, and
that same condition is repeated in all those three blocks of code.
This patch series is about to add more optional chunks, so there would
be even more repeated conditions.
Those chunk offsets are relative to the beginning of the file, so they
inherently depend on the size of the Chunk Lookup table, which in turn
depends on the number of chunks that are to be written to the
commit-graph file. IOW at the time we set the first chunk's ID we
can't yet know its offset, because we don't yet know how many chunks
there are.
Simplify this by initially filling an array of chunk sizes, not
offsets, and calculate the offsets based on the chunk sizes only
later, while we are writing the Chunk Lookup table. This way we can
fill the arrays of chunk IDs and sizes in one go, eliminating one set
of repeated conditions.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Chunk Lookup table stores the chunks' starting offset in the
commit-graph file, not their sizes. Consequently, the size of a chunk
can only be calculated by subtracting its offset from the offset of
the subsequent chunk (or that of the terminating label). This is
currenly implemented in a bit complicated way: as we iterate over the
entries of the Chunk Lookup table, we check the id of each chunk and
store its starting offset, then we check the id of the last seen chunk
and calculate its size using its previously saved offset. At the
moment there is only one chunk for which we calculate its size, but
this patch series will add more, and the repeated chunk id checks are
not that pretty.
Instead let's read ahead the offset of the next chunk on each
iteration, so we can calculate the size of each chunk right away,
right where we store its starting offset.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we iterate over all entries of the Chunk Lookup table we make
sure that we don't attempt to read past the end of the mmap-ed
commit-graph file, and check in each iteration that the chunk ID and
offset we are about to read is still within the mmap-ed memory region.
However, these checks in each iteration are not really necessary,
because the number of chunks in the commit-graph file is already known
before this loop from the just parsed commit-graph header.
So let's check that the commit-graph file is large enough for all
entries in the Chunk Lookup table before we start iterating over those
entries, and drop those per-iteration checks. While at it, take into
account the size of everything that is necessary to have a valid
commit-graph file, i.e. the size of the header, the size of the
mandatory OID Fanout chunk, and the size of the signature in the
trailer as well.
Note that this necessitates the change of the error message as well,
and, consequently, have to update the 'detect incorrect chunk count'
test in 't5318-commit-graph.sh' as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our CodingGuidelines says that it's sufficient to include one of
'git-compat-util.h' and 'cache.h', but both 'commit-graph.c' and
'commit-graph.h' include both. Let's include only 'git-compat-util.h'
to loose a bunch of unnecessary dependencies; but include 'hash.h',
because 'commit-graph.h' does require the definition of 'struct
object_id'.
'commit-graph.h' explicitly includes 'repository.h' and
'string-list.h', but only needs the declaration of a few structs from
them. Drop these includes and forward-declare the necessary structs
instead.
'commit-graph.c' includes 'dir.h', but doesn't actually use anything
from there, so let's drop that #include as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ll_diff_tree_oid() has only ever returned 0 [1], so it's return value
is basically useless. It's only caller diff_tree_oid() has only ever
returned the return value of ll_diff_tree_oid() as-is [2], so its
return value is just as useless. Most of diff_tree_oid()'s callers
simply ignore its return value, except:
- diff_root_tree_oid() is a thin wrapper around diff_tree_oid() and
returns with its return value, but all of diff_root_tree_oid()'s
callers ignore its return value.
- rev_compare_tree() and rev_same_tree_as_empty() do look at the
return value in a condition, but, since the return value is always
0, the former's < 0 condition is never fulfilled, while the
latter's >= 0 condition is always fulfilled.
So let's drop the return value of ll_diff_tree_oid(), diff_tree_oid()
and diff_root_tree_oid(), and drop those conditions from
rev_compare_tree() and rev_same_tree_as_empty() as well.
[1] ll_diff_tree_oid() and its ancestors have been returning only 0
ever since it was introduced as diff_tree() in 9174026cfe (Add
"diff-tree" program to show which files have changed between two
trees., 2005-04-09).
[2] diff_tree_oid() traces back to diff-tree.c:main() in 9174026cfe as
well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
clear_##slabname() frees only the memory allocated for a commit slab
itself, but entries in the commit slab might own additional memory
outside the slab that should be freed as well. We already have (at
least) one such commit slab, and this patch series is about to add one
more.
To free all additional memory owned by entries on the commit slab the
user of such a slab could iterate over all commits it knows about,
peek whether there is a valid entry associated with each commit, and
free the additional memory, if any. Or it could rely on intimate
knowledge about the internals of the commit slab implementation, and
could itself iterate directly through all entries in the slab, and
free the additional memory. Or it could just leak the additional
memory...
Introduce deep_clear_##slabname() to allow releasing memory owned by
commit slab entries by invoking the 'void free_fn(elemtype *ptr)'
function specified as parameter for each entry in the slab.
Use it in get_shallow_commits() in 'shallow.c' to replace an
open-coded iteration over a commit slab's entries.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph format specifies that "All 4-byte numbers are in
network order", but the commit-graph contains 8-byte integers as well
(file offsets in the Chunk Lookup table), and their byte order is
unspecified.
Clarify that all multi-byte integers are in network byte order.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph file format specifies that the chunks may be in any
order. However, if the OID Lookup chunk happens to be the last one in
the file, then any command attempting to access the commit-graph data
will fail with:
fatal: invalid commit position. commit-graph is likely corrupt
In this case the error is wrong, the commit-graph file does conform to
the specification, but the parsing of the Chunk Lookup table is a bit
buggy, and leaves the field holding the number of commits in the
commit-graph zero-initialized.
The number of commits in the commit-graph is determined while parsing
the Chunk Lookup table, by dividing the size of the OID Lookup chunk
with the hash size. However, the Chunk Lookup table doesn't actually
store the size of the chunks, but it stores their starting offset.
Consequently, the size of a chunk can only be calculated by
subtracting the starting offsets of that chunk from the offset of the
subsequent chunk, or in case of the last chunk from the offset
recorded in the terminating label. This is currenly implemented in a
bit complicated way: as we iterate over the entries of the Chunk
Lookup table, we check the ID of each chunk and store its starting
offset, then we check the ID of the last seen chunk and calculate its
size using its previously saved offset if necessary (at the moment
it's only necessary for the OID Lookup chunk). Alas, while parsing
the Chunk Lookup table we only interate through the "real" chunks, but
never look at the terminating label, thus don't even check whether
it's necessary to calulate the size of the last chunk. Consequently,
if the OID Lookup chunk is the last one, then we don't calculate its
size and turn don't run the piece of code determining the number of
commits in the commit graph, leaving the field holding that number
unchanged (i.e. zero-initialized), eventually triggering the sanity
check in load_oid_from_graph().
Fix this by iterating through all entries in the Chunk Lookup table,
including the terminating label.
Note that this is the minimal fix, suitable for the maintenance track.
A better fix would be to simplify how the chunk sizes are calculated,
but that is a more invasive change, less suitable for 'maint', so that
will be done in later patches.
This additional flexibility of scanning more chunks breaks a test for
"git commit-graph verify" so alter that test to mutate the commit-graph
to have an even lower chunk count.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Submodules should be handled the same as regular directories with
respect to the presence of a trailing slash, i.e. commands like:
git diff rev1 rev2 -- $path
git rev-list HEAD -- $path
should produce the same output whether $path is 'submod' or 'submod/'.
This has been fixed in commit 74b4f7f277 (tree-walk.c: ignore trailing
slash on submodule in tree_entry_interesting(), 2014-01-23).
Unfortunately, that commit had the unintended side effect to handle
'submod/anything' the same as 'submod' and 'submod/' as well, e.g.:
$ git log --oneline --name-only -- sha1collisiondetection/whatever
4125f78222 sha1dc: update from upstream
sha1collisiondetection
07a20f569b Makefile: fix unaligned loads in sha1dc with UBSan
sha1collisiondetection
23e37f8e9d sha1dc: update from upstream
sha1collisiondetection
86cfd61e6b sha1dc: optionally use sha1collisiondetection as a submodule
sha1collisiondetection
Fix this by rejecting submodules as partial pathnames when their
trailing slash is followed by anything.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 0b4396f068 (git-p4: make python2.7 the oldest supported version,
2019-12-13), git-p4 was updated to only support 2.7 and newer. Since
Python 2.6 is pretty much ancient history, update CodingGuidelines to
show that 2.7 is the oldest version supported.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 48a8c26c62 (Documentation: avoid poor-man's small caps GIT,
2013-01-21), the documentation was amended to spell Git's name as Git
when talking about the system as a whole. However, t/README was skipped
over when the treatment was applied.
Bring t/README into conformance with the CodingGuidelines by casing
"Git" properly.
While we're at it, fix a small typo. Change "the git internal" to "the
Git internals".
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the provided free_commit_graph() to properly free the commit graph
in fuzz-commit-graph. Otherwise, the fuzzer itself leaks memory when the
struct contains pointers to allocated memory.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In trace output (when GIT_TRACE_CURL is true), redact the values of all
HTTP cookies by default. Now that auth headers (since the implementation
of GIT_TRACE_CURL in 74c682d3c6 ("http.c: implement the GIT_TRACE_CURL
environment variable", 2016-05-24)) and cookie values (since this
commit) are redacted by default in these traces, also allow the user to
inhibit these redactions through an environment variable.
Since values of all cookies are now redacted by default,
GIT_REDACT_COOKIES (which previously allowed users to select individual
cookies to redact) now has no effect.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
do_match_pathspec() started life as match_pathspec_depth_1() and for
correctness was only supposed to be called from match_pathspec_depth().
match_pathspec_depth() was later renamed to match_pathspec(), so the
invariant we expect today is that do_match_pathspec() has no direct
callers outside of match_pathspec().
Unfortunately, this intention was lost with the renames of the two
functions, and additional calls to do_match_pathspec() were added in
commits 75a6315f74 ("ls-files: add pathspec matching for submodules",
2016-10-07) and 89a1f4aaf7 ("dir: if our pathspec might match files
under a dir, recurse into it", 2019-09-17). Of course,
do_match_pathspec() had an important advantge over match_pathspec() --
match_pathspec() would hardcode flags to one of two values, and these
new callers needed to pass some other value for flags. Also, although
calling do_match_pathspec() directly was incorrect, there likely wasn't
any difference in the observable end output, because the bug just meant
that fill_diretory() would recurse into unneeded directories. Since
subsequent does-this-path-match checks on individual paths under the
directory would cause those extra paths to be filtered out, the only
difference from using the wrong function was unnecessary computation.
The second of those bad calls to do_match_pathspec() was involved -- via
either direct movement or via copying+editing -- into a number of later
refactors. See commits 777b420347 ("dir: synchronize
treat_leading_path() and read_directory_recursive()", 2019-12-19),
8d92fb2927 ("dir: replace exponential algorithm with a linear one",
2020-04-01), and 95c11ecc73 ("Fix error-prone fill_directory() API; make
it only return matches", 2020-04-01). The last of those introduced the
usage of do_match_pathspec() on an individual file, and thus resulted in
individual paths being returned that shouldn't be.
The problem with calling do_match_pathspec() instead of match_pathspec()
is that any negated patterns such as ':!unwanted_path` will be ignored.
Add a new match_pathspec_with_flags() function to fulfill the needs of
specifying special flags while still correctly checking negated
patterns, add a big comment above do_match_pathspec() to prevent others
from misusing it, and correct current callers of do_match_pathspec() to
instead use either match_pathspec() or match_pathspec_with_flags().
One final note is that DO_MATCH_LEADING_PATHSPEC needs special
consideration when working with DO_MATCH_EXCLUDE. The point of
DO_MATCH_LEADING_PATHSPEC is that if we have a pathspec like
*/Makefile
and we are checking a directory path like
src/module/component
that we want to consider it a match so that we recurse into the
directory because it _might_ have a file named Makefile somewhere below.
However, when we are using an exclusion pattern, i.e. we have a pathspec
like
:(exclude)*/Makefile
we do NOT want to say that a directory path like
src/module/component
is a (negative) match. While there *might* be a file named 'Makefile'
somewhere below that directory, there could also be other files and we
cannot pre-emptively rule all the files under that directory out; we
need to recurse and then check individual files. Adjust the
DO_MATCH_LEADING_PATHSPEC logic to only get activated for positive
pathspecs.
Reported-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, extensions were recognized regardless of repository format
version. If the user sets an undefined "extensions" value on a
repository of version 0 and that value is used by a future git version,
they might get an undesired result.
Because all extensions now also upgrade repository versions, tightening
the check would help avoid this for future extensions.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'extensions' configuration variable gets special meaning in the new
repository version, so when enabling the extension we should upgrade the
repository to version 1.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Retroactively adding a filter can be useful for existing shallow clones as
they allow users to see earlier change histories without downloading all
git objects in a regular --unshallow fetch.
Without this patch, users can make a clone partial by editing the
repository configuration to convert the remote into a promisor, like:
git config core.repositoryFormatVersion 1
git config extensions.partialClone origin
git fetch --unshallow --filter=blob:none origin
Since the hard part of making this work is already in place and such
edits can be error-prone, teach Git to perform the required configuration
change automatically instead.
Note that this change does not modify the existing git behavior which
recognizes setting extensions.partialClone without changing
repositoryFormatVersion.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In version 1 of repository format, "extensions" gained special meaning
and it is safer to avoid upgrading when there are pre-existing
extensions.
Make list-objects-filter to use the helper function instead of setting
repository version directly as a prerequisite of exposing the upgrade
capability.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sparse-checkout's purpose is to update the working tree to have it
reflect a subset of the tracked files. As such, it shouldn't be
switching branches, making commits, downloading or uploading data, or
staging or unstaging changes. Other than updating the worktree, the
only thing sparse-checkout should touch is the SKIP_WORKTREE bit of the
index. In particular, this sets up a nice invariant: running
sparse-checkout will never change the status of any file in `git status`
(reflecting the fact that we only set the SKIP_WORKTREE bit if the file
is safe to delete, i.e. if the file is unmodified).
Traditionally, we did a _really_ bad job with this goal. The
predecessor to sparse-checkout involved manual editing of
.git/info/sparse-checkout and running `git read-tree -mu HEAD`. That
command would stage and unstage changes and overwrite dirty changes in
the working tree.
The initial implementation of the sparse-checkout command was no better;
it simply invoked `git read-tree -mu HEAD` as a subprocess and had the
same caveats, though this issue came up repeatedly in review comments
and workarounds for the problems were put in place before the feature
was merged[1, 2, 3, 4, 5, 6; especially see 4 & 6].
[1] https://lore.kernel.org/git/CABPp-BFT9A5n=_bx5LsjCvbogqwSjiwgr5amcjgbU1iAk4KLJg@mail.gmail.com/
[2] https://lore.kernel.org/git/CABPp-BEmwSwg4tgJg6nVG8a3Hpn_g-=ZjApZF4EiJO+qVgu4uw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFV7TA0qwZCQpHCqx9N+JifyRyuBQ-pZ_oGfe-NOgyh7A@mail.gmail.com/
[4] https://lore.kernel.org/git/CABPp-BHYCCD+Vx5fq35jH82eHc1-P53Lz_aGNpHJNcx9kg2K-A@mail.gmail.com/
[5] https://lore.kernel.org/git/CABPp-BF+JWYZfDqp2Tn4AEKVp4b0YMA=Mbz4Nz62D-gGgiduYQ@mail.gmail.com/
[6] https://lore.kernel.org/git/20191121163706.GV23183@szeder.dev/
However, these workarounds, in addition to disabling the feature in a
number of important cases, also missed one special case. I'll get back
to it later.
In the 2.27.0 cycle, the disabling of the feature was lifted by finally
replacing the internal equivalent of `git read-tree -mu HEAD` with
something that did what we wanted: the new update_sparsity() function in
unpack-trees.c that only ever updates SKIP_WORKTREE bits in the index
and updates the working tree to match. This new function handles all
the cases that were problematic for the old implementation, except that
it breaks the same special case that avoided the workarounds of the old
implementation, but broke it in a different way.
So...that brings us to the special case: a git clone performed with
--no-checkout. As per the meaning of the flag, --no-checkout does not
check out any branch, with the implication that you aren't on one and
need to switch to one after the clone. Implementationally, HEAD is
still set (so in some sense you are partially on a branch), but
* the index is "unborn" (non-existent)
* there are no files in the working tree (other than .git/)
* the next time git switch (or git checkout) is run it will run
unpack_trees with `initial_checkout` flag set to true.
It is not until you run, e.g. `git switch <somebranch>` that the index
will be written and files in the working tree populated.
With this special --no-checkout case, the traditional `read-tree -mu
HEAD` behavior would have done the equivalent of acting like checkout --
switch to the default branch (HEAD), write out an index that matches
HEAD, and update the working tree to match. This special case slipped
through the avoid-making-changes checks in the original sparse-checkout
command and thus continued there.
After update_sparsity() was introduced and used (see commit f56f31af03
("sparse-checkout: use new update_sparsity() function", 2020-03-27)),
the behavior for the --no-checkout case changed: Due to git's
auto-vivification of an empty in-memory index (see do_read_index() and
note that `must_exist` is false), and due to sparse-checkout's
update_working_directory() code to always write out the index after it
was done, we got a new bug. That made it so that sparse-checkout would
switch the repository from a clone with an "unborn" index (i.e. still
needing an initial_checkout), to one that had a recorded index with no
entries. Thus, instead of all the files appearing deleted in `git
status` being known to git as a special artifact of not yet being on a
branch, our recording of an empty index made it suddenly look to git as
though it was definitely on a branch with ALL files staged for deletion!
A subsequent checkout or switch then had to contend with the fact that
it wasn't on an initial_checkout but had a bunch of staged deletions.
Make sure that sparse-checkout changes nothing in the index other than
the SKIP_WORKTREE bit; in particular, when the index is unborn we do not
have any branch checked out so there is no sparsification or
de-sparsification work to do. Simply return from
update_working_directory() early.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 897d68e7af (Makefile: use curl-config --cflags, 2020-03-26), we
taught the build process to use `curl-config --cflags` to make sure that
it can find cURL's headers.
In the MSVC build, this is completely bogus because we're running in a
Git for Windows SDK whose `curl-config` supports the _GCC_ build.
Let's just ignore each and every `-I<path>` option where `<path>` points
to GCC/Clang specific headers.
Reported by Jeff Hostetler in
https://github.com/microsoft/git/issues/275.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even if we strongly discourage putting credentials into the URLs passed
via the command-line, there _is_ support for that, and users _do_ do
that.
Let's scrub them before writing them to the reflog.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'pack_objects_hook' static
variable into this struct.
It is used by code common to protocol v0 and protocol v2.
While at it let's also free() it in upload_pack_data_clear().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_sideband_all' static
variable into this struct.
It is used only by protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_ref_in_want' static
variable into this struct.
It is used only by protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'allow_filter' static variable
into this struct.
It is used by both protocol v0 and protocol v2 code.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'keepalive' static variable
into this struct.
It is used by code common to protocol v0 and protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to upload_pack_config(),
so that this function can use all the fields of the struct.
This will be used in followup commits to move static variables
that are set in upload_pack_config() into 'upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's take this opportunity to change the
'multi_ack' variable, which is now part of 'upload_pack_data',
to an enum.
This will make it clear which values this variable can take.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the multi_ack static variable into
this struct.
It is only used by protocol v0 code since protocol v2 assumes
certain baseline capabilities, but rolling it into
upload_pack_data and just letting v2 code ignore it as it does
now is more coherent and cleaner.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the filter_capability_requested
static variable into this struct.
It is only used by protocol v0 code since protocol v2 assumes
certain baseline capabilities, but rolling it into
upload_pack_data and just letting v2 code ignore it as it does
now is more coherent and cleaner.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'use_sideband' static variable
into this struct.
This variable is used by both v0 and v2 protocols.
While at it, let's update the comment near the variable
definition.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the 'no_done', 'daemon_mode' and
'timeout' variables into this struct.
They are only used by protocol v0 code since protocol v2 assumes
certain baseline capabilities, but rolling them into
upload_pack_data and just letting v2 code ignore them as it does
now is more coherent and cleaner.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's annotate fields from this struct to let
people know which ones are used only for protocol v0 and which
ones only for protocol v2.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's actually start using some bitfields of
that struct. These bitfields were introduced in 3145ea957d
("upload-pack: introduce fetch server command", 2018-03-15), but
were never used.
We could instead have just removed the following bitfields
from the struct:
unsigned use_thin_pack : 1;
unsigned use_ofs_delta : 1;
unsigned no_progress : 1;
unsigned use_include_tag : 1;
but using them makes it possible to remove a number of static
variables with the same name and purpose from 'upload-pack.c'.
This is a behavior change, as we accidentally used to let values
in those bitfields propagate from one v2 "fetch" command to
another for ssh/git/file connections (but not for http). That's
fixing a bug, but one nobody is likely to see, because it would
imply the client sending different capabilities for each request.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following lines were not covered in a recent line-coverage test
against Git:
builtin/commit-graph.c
5b6653e5 244) progress = start_delayed_progress(
5b6653e5 268) stop_progress(&progress);
These statements are executed when both '--stdin-commits' and
'--progress' are passed. Introduce a trio of tests that exercise various
combinations of these options to ensure that these lines are covered.
More importantly, this is exercising a (somewhat) previously-ignored
feature of '--stdin-commits', which is that it respects '--progress'.
Prior to 5b6653e523 (builtin/commit-graph.c: dereference tags in
builtin, 2020-05-13), dereferencing input from '--stdin-commits' was
done inside of commit-graph.c.
Now that an additional progress meter may be generated from outside of
commit-graph.c, add a corresponding test to make sure that it also
respects '--[no]-progress'.
The other location that generates progress meter output (from d335ce8f24
(commit-graph.c: show progress of finding reachable commits,
2020-05-13)) is already covered by any test that passes '--reachable'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A handful of tests in t5318 use 'test_line_count = 0 ...' to make sure
that some command does not write any output. While correct, it is more
idiomatic to use 'test_must_be_empty' instead. Switch the former
invocations to use the latter instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some repositories in the wild have commits that record nonsense
committer timezone (e.g. rails.git); "git fast-import" learned an
option to pass these nonsense timestamps intact to allow recreating
existing repositories as-is.
* en/fast-import-looser-date:
fast-import: add new --date-format=raw-permissive format
The error message from "git checkout -b foo -t bar baz" was
confusing.
* rs/checkout-b-track-error:
checkout: improve error messages for -b with extra argument
checkout: add tests for -b and --track
We've adopted a convention that any on-stack structure can be
initialized to have zero values in all fields with "= { 0 }", even
when the first field happens to be a pointer, but sparse complained
that a null pointer should be spelled NULL for a long time. Start
using -Wno-universal-initializer option to squelch it.
* lo/sparse-universal-zero-init:
sparse: allow '{ 0 }' to be used without warnings
"feature.experimental" configuration variable is to let volunteers
easily opt into a set of newer features, which use of the v2
transport protocol is now a part of.
* jn/experimental-opts-into-proto-v2:
config: let feature.experimental imply protocol.version=2
The "--prepare-p4-only" option is supposed to stop after replaying
one changeset, but kept going (by mistake?)
* bk/p4-prepare-p4-only-fix:
git-p4.py: fix --prepare-p4-only error with multiple commits
When performing a tree-level diff against the working tree, we may find
that our index stat information is dirty, so we queue a filepair to be
examined later. If the actual content hasn't changed, we call this a
stat-unmatch; the stat information was out of date, but there's no
actual diff. Normally diffcore_std() would detect and remove these
identical filepairs via diffcore_skip_stat_unmatch(). However, when
"--quiet" is used, we want to stop the diff as soon as we see any
changes, so we check for stat-unmatches immediately in diff_change().
That check may require us to actually load the file contents into the
pair of diff_filespecs. If we find that the pair isn't a stat-unmatch,
then no big deal; we'd likely load the contents later anyway to generate
a patch, do rename detection, etc, so we want to hold on to it. But if
it is a stat-unmatch, then we have no more use for that data; the whole
point is that we're going discard the pair. However, we never free the
allocated diff_filespec data.
In most cases, keeping that data isn't a problem. We don't expect a lot
of stat-unmatch entries, and since we're using --quiet, we'd quit as
soon as we saw such a real change anyway. However, there are extreme
cases where it makes a big difference:
1. We'd generally mmap() the working tree half of the pair. And since
the OS may limit the total number of maps, we can run afoul of this
in large repositories. E.g.:
$ cd linux
$ git ls-files | wc -l
67959
$ sysctl vm.max_map_count
vm.max_map_count = 65530
$ git ls-files | xargs touch ;# everything is stat-dirty!
$ git diff --quiet
fatal: mmap failed: Cannot allocate memory
It should be unusual to have so many files stat-dirty, but it's
possible if you've just run a script like "sed -i" or similar.
After this patch, the above correctly exits with code 0.
2. Even if you don't hit mmap limits, the index half of the pair will
have been pulled from the object database into heap memory. Again
in a clone of linux.git, running:
$ git ls-files | head -n 10000 | xargs touch
$ git diff --quiet
peaks at 145MB heap before this patch, and 94MB after.
This patch solves the problem by freeing any diff_filespec data we
picked up during the "--quiet" stat-unmatch check in diff_changes.
Nobody is going to need that data later, so there's no point holding on
to it. There are a few things to note:
- we could skip queueing the pair entirely, which could in theory save
a little work. But there's not much to save, as we need a
diff_filepair to feed to diff_filespec_check_stat_unmatch() anyway.
And since we cache the result of the stat-unmatch checks, a later
call to diffcore_skip_stat_unmatch() call will quickly skip over
them. The diffcore code also counts up the number of stat-unmatched
pairs as it removes them. It's doubtful any callers would care about
that in combination with --quiet, but we'd have to reimplement the
logic here to be on the safe side. So it's not really worth the
trouble.
- I didn't write a test, because we always produce the correct output
unless we run up against system mmap limits, which are both
unportable and expensive to test against. Measuring peak heap
would be interesting, but our perf suite isn't yet capable of that.
- note that diff without "--quiet" does not suffer from the same
problem. In diffcore_skip_stat_unmatch(), we detect the stat-unmatch
entries and drop them immediately, so we're not carrying their data
around.
- you _can_ still trigger the mmap limit problem if you truly have
that many files with actual changes. But it's rather unlikely. The
stat-unmatch check avoids loading the file contents if the sizes
don't match, so you'd need a pretty trivial change in every single
file. Likewise, inexact rename detection might load the data for
many files all at once. But you'd need not just 64k changes, but
that many deletions and additions. The most likely candidate is
perhaps break-detection, which would load the data for all pairs and
keep it around for the content-level diff. But again, you'd need 64k
actually changed files in the first place.
So it's still possible to trigger this case, but it seems like "I
accidentally made all my files stat-dirty" is the most likely case
in the real world.
Reported-by: Jan Christoph Uhde <Jan@UhdeJc.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are multiple repositories in the wild with random, invalid
timezones. Most notably is a commit from rails.git with a timezone of
"+051800"[1]. A few searches will find other repos with that same
invalid timezone as well. Further, Peff reports that GitHub relaxed
their fsck checks in August 2011 to accept any timezone value[2], and
there have been multiple reports to filter-repo about fast-import
crashing while trying to import their existing repositories since they
had timezone values such as "-7349423" and "-43455309"[3].
The existing check on timezone values inside fast-import may prove
useful for people who are crafting fast-import input by hand or with a
new script. For them, the check may help them avoid accidentally
recording invalid dates. (Note that this check is rather simplistic and
there are still several forms of invalid dates that fast-import does not
check for: dates in the future, timezone values with minutes that are
not divisible by 15, and timezone values with minutes that are 60 or
greater.) While this simple check may have some value for those users,
other users or tools will want to import existing repositories as-is.
Provide a --date-format=raw-permissive format that will not error out on
these otherwise invalid timezones so that such existing repositories can
be imported.
[1] 4cf94979c9
[2] https://lore.kernel.org/git/20200521195513.GA1542632@coredump.intra.peff.net/
[3] https://github.com/newren/git-filter-repo/issues/88
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of github.com:ruester/git-po-de:
l10n: de.po: Fix typo in the German translation of octopus
l10n: de.po: Update German translation for Git 2.27.0
f1e3df3169 (t: increase test coverage of signature verification output,
2020-03-04) adds GPG dependent tests to t4202 and t6200 that were found
problematic with at least OpenBSD 6.7.
Using an escaped '|' for alternations works only in some implementations
of grep (e.g. GNU and busybox).
It is not part of POSIX[1] and not supported by some BSD, macOS, and
possibly other POSIX compatible implementations.
Use `grep -E`, and write it using extended regular expression.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03
Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --orphan option is used to create a local branch which is detached
from the current history. In git switch, it always resets to the empty
tree, and thus the only completion we can provide is a branch name.
Follow the same rules for -c/-C (and -b/-B) when completing the argument
to --orphan.
In the case of git switch, after we complete the argument, there is
nothing more we can complete for git switch, so do not even try. Nothing
else would be valid.
In the case of git checkout, --orphan takes a start point which it uses
to determine the checked out tree, even though it created orphaned
history.
Update the previously added test cases as they are now passing.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous commit added several test cases highlighting the subpar
completion logic for -c/-C and -b/-B when completing git switch and git
checkout.
In order to distinguish completing the argument vs the start-point for
this option, we now use the wordlist to determine the previous full word
on the command line.
If it's -c or -C (-b/-B for checkout), then we know that we are
completing the argument for the branch name.
Given that a user who already knows the branch name they want to
complete will simply not use completion, it makes sense to complete the
small subset of local branches when completing the argument for -c/-C.
In all other cases, if -c/-C are on the command line but are not the
most recent option, then we must be completing a start-point, and should
allow completing against all references.
Update the -c/-C and -b/-B tests to indicate they now pass.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Current completion for the --track option of git switch and git checkout
is sub par. In addition to the DWIM logic of a bare branch name, --track
has DWIM logic to convert specified remote/branch names into a local
branch tracking that remote. For example
$git switch --track origin/master
This will create a local branch name master, that tracks the master
branch of the origin remote.
In fact, git switch --track on its own will not accept other forms of
references. These must instead be specified manually via the -c/-C/-b/-B
options.
Introduce __git_remote_heads() and the "remote-heads" mode for
__git_complete_refs. Use this when the --track option is provided while
completing in _git_switch and _git_checkout. Just as in the --detach
case, we never enable DWIM mode for --track, because it doesn't make
sense.
It should be noted that completion support is still a bit sub par when
it comes to handling -c/-C and --orphan. This will be resolved in
a future change.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like git switch, we should not complete DWIM remote branch names
if --detach has been specified. To avoid this, refactor _git_checkout in
a similar way to _git_switch.
Note that we don't simply clear dwim_opt when we find -d or --detach, as
we will be adding other modes and checks, making this flow easier to
follow.
Update the previously failing tests to show that the breakage has been
resolved.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new --mode option to __git_complete_refs, which allows changing
the behavior to call __git_heads instead of __git_refs.
By passing --mode=heads, __git_complete_refs will only output local
branches. This enables using "--mode=heads --dwim" to enable listing
local branches and the remote unique branch names for DWIM.
Refactor completion support to use the new mode option, rather than
calling __git_heads directly. This has the advantage that we can now
correctly allow local branches along with suitable DWIM refs, rather
than only allowing DWIM when we complete all references.
Choose what mode it uses when calling __git_complete_refs. If -d or
--detach have been provided, then simply complete all refs, but
*without* the DWIM option as these DWIM names won't work properly in
--detach mode.
Otherwise, call __git_complete_refs with the default dwim_opt value and
use the new "heads" mode.
In this way, the basic support for completing just "git switch <TAB>"
will result in only local branches and remote unique names for DWIM.
The basic no-options tests for git switch, as well as several of the
-c/-C tests now pass, so remove the known breakage tags.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new helper, __git_find_last_on_cmdline is introduced, similar to the
already existing __git_find_on_cmdline, but which operates in reverse,
finding the *last* matching word of the provided wordlist.
Use this in a new __git_checkout_default_dwim_mode() function that will
determine when to enable listing of DWIM remote branches.
The __git_find_last_on_cmdline() function is used to determine which
--guess or --no-guess is in effect. If either one is provided, then we
unconditionally enable or disable the DWIM mode based on the last
provided option.
If neither --guess nor --no-guess is provided, then we check for
--no-track, and finally for GIT_COMPLETION_CHECKOUT_NO_GUESS=1.
This function is then used in _git_switch and _git_checkout to improve
the handling for when we enable listing of these DWIM remote branches.
This new logic is more robust, as we will correctly identify superseded
options, and ensure that both _git_switch and _git_checkout enable DWIM
in similar ways.
We can now update a few tests to indicate they pass. A few of the tests
previously added to highlight issues with the old DWIM logic still fail.
This is because of a separate issue related to the default completion
behavior of git switch, which will be addressed in a future change.
Additionally, due to this change, a few tests for the -b/-B handling of
git checkout now fail. This is a minor regression, and will be fixed by
a following change that improves the overall handling of -b/-B. Mark
these tests as known breakages for now.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
__git_complete_refs is the main function used for completing references.
It is primarily used as a wrapper around __git_refs, and is easier to
extend since its arguments are option-like.
One major downside of __git_complete_refs and __git_refs currently, is
the lack of ability to complete only a subset of refs such as branches
(refs/heads) or tags (refs/tags).
Normally, a caller might just decide to use __git_heads() or
__git_tags(). However, in the case of git-switch, it is useful to
complete both branches *and* DWIM remote branch names.
Due to the complexity and implementation of __git_refs, it is not easy
to extend it to support listing only a subset of references.
Instead, we can extend __git_complete_refs to do this. For this to be
done, we must first ensure that "--dwim" support is not tied to calling
__git_refs.
Instead of passing $dwim into __git_refs, we can implement
a __gitcomp_direct_append function which can append to COMPREPLY after
a call to __gitcomp_direct.
If --dwim is passed to __git_complete_refs, use __gitcomp_direct_append
to add the output of __git_dwim_remote_heads to the completion list.
In this way, --dwim support is now independent of calling __git_refs.
A future change will add an additional option to control what set of
references __git_complete_refs will output.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
__git_refs() has the ability to report unique remote names for
supporting completion of remote branch names for the DWIMery of git
checkout and git switch.
For git checkout, this is fine, because it always supports completing
all local references.
However, git switch by default only supports either switching branches
or using this DWIMery to create a local branch tracking the remote
branch.
Future work to cleanup and improve completion support for git switch
will be aided if the remote branch names can be completed separately
from __git_refs.
Extract this logic to a function __git_dwim_remote_heads(), and use it
in __git_refs.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The __git_complete_refs uses the "--track" option to specify when to
enable listing of unique remote branches which are used by the DWIM
logic of git checkout and git switch.
Using the term '--track' here is confusing because the git commands
themselves have '--track' as an argument. Additionally, the completion
logic for _git_switch also checks for --track. Keeping the meaning of
track_opt and --track for __git_complete_refs straight from the --track
git switch and git checkout option is difficult when reading this code.
Use the option '--dwim' instead, indicating this is about enabling or
disabling logic related to DWIM mode. Also rename the local variable
track_opt to dwim_opt to further reduce the confusion when reading the
completion code for _git_switch.
Because it is plausible for users to have developed their own
completions which rely on __git_complete_ref, keep --track as a synonym
for --dwim, even though we no longer use it in any of the core git
completion logic. Add a comment explaining why it remains as an
alternative spelling for --dwim.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to -c/-C, --orphan takes an argument which is the branch name to
use. We ought to complete this branch name using similar rules as to how
we complete new branch names for -c/-C and -b/-B. Namely, limit the
total number of options provided by completing to the local branches.
Additionally, git switch --orphan does not take any start point and will
always create using the empty-tree. Thus, after the branch name is
completed, git switch --orphan should not complete any references.
Add test cases showing the expected behavior of --orphan, for both the
argument and starting point.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the branch creation argument for git switch or git checkout
(-c/-C or -b/-B), the commands switch to a different mode: `git switch
-c <branch> <some-referance>` means to create a branch named <branch> at
the commit referred to by <some-reference>.
When completing git switch or git checkout, it makes sense to complete
the branch name differently from the start point.
When completing a branch, one might consider that we do not have
anything worth completing. After all, a new branch must have an entirely
new name. Consider, however, that if a user names branches using some
similar scheme, they might wish to name a new branch by modifying the
name of an existing branch.
To avoid overloading completion for the argument, it seems reasonable to
complete only the local branch names and the valid "Do What I Mean"
remote branch names.
Add tests for the completion of the argument to -c/-C and -b/-B,
highlighting this preferred completion behavior.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the branch creation argument for git switch or git checkout,
-c/-C or -b/-B, the commands operate in a different mode: `git switch -c
<branch> <some-reference>` means to create a branch named <branch> at
the commit referred to by <some-reference>.
When completing the start-point, we ought to always complete all valid
references.
Add tests for the completion of the start-point to -c/-C and -b/-B.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the --track option is provided to git switch or git checkout, and
no branch is specified by -c or -b, git will interpret the tracking
branch to determine the local branch name to use. This "Do What I Mean"
logic is similar but distinct from the default DWIM logic of
interpreting a unique remote branch name as a request to create and
track that branch.
For example, `git switch --track origin/master` is interpreted as
a request to create a local branch named master that is tracking
origin/master.
The current completion for git checkout in this regard is only somewhat
poor:
$git checkout --track <TAB>
HEAD
master
matching-branch
matching-tag
other/branch-in-other
other/master-in-other
At least it still includes remote references. The clutter from including
all references isn't too bad.
However, git switch completion is terrible:
$git switch --track <TAB>
master
matching-branch
It only shows local branches, not even allowing any form of completion
of the remote references!
Add tests which highlight the expected behavior of completing --track on
its own.
Note that when -c/-C or -b/-B are provided we do expect completing more
references, but this will be discussed in a future change that addresses
these options specifically.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When completing words for git switch, the completion function correctly
disables the DWIM remote branch names when in the '--detach' mode. These
DWIM remote branch names will not work when the --detach option is
specified, so it does not make sense to complete them.
git checkout, however, does not disable the completion of DWIM remote
branch names in this case.
Add test cases for both git switch and git checkout showing the expected
behavior.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When provided with a single argument that is the name of a remote branch
that does not yet exist locally, both git switch and git checkout can
interpret this as a request to create a local branch that tracks that
remote branch. We call this behavior "Do What I Mean", or DWIM for
short.
To aid in using this DWIM, it makes sense for completion to list these
unique remote branch names when completing possible arguments for git
switch and git checkout. Indeed, both _git_checkout and _git_switch
implement support for completing such DWIM branch names.
In other words, in addition to the usual completions provided for git
switch, this "DWIM" logic means completion will include the names of
branches on remotes that are unique and thus there can be no ambiguity
of which remote to track when creating the local branch.
However, the DWIM logic is not always active. Many options, such as
--no-guess, --no-track, and --track disable this DWIM logic, as they
cause git switch and git checkout to behave in different modes.
Additionally, some completion users do not wish to have tab completion
include these remote names by default, and thus introduced
GIT_COMPLETION_CHECKOUT_NO_GUESS as an optional way to configure the
completion support to disable this feature of completion support.
For this reason, _git_checkout and _git_switch have many rules about
when to enable or disable completing of these remote refs. The two
commands follow similar but not identical rules.
Set aside the question of command modes that do not accept this DWIM
logic (--track, -c, --orphan, --detach) for now. Thinking just about the
main mode of git checkout and git switch, the following guidelines will
help explain the basic rules we ought to support when deciding whether
to list the remote branches for DWIM in completion.
1. if --guess is enabled, we should list DWIM remote branch names, even
if something else would disable it
2. if --no-guess, --no-track or GIT_COMPLETION_CHECKOUT_NO_GUESS=1,
then we should disable listing DWIM remote branch names.
3. Since the '--guess' option is a boolean option, a later --guess
should override --no-guess, and a later --no-guess should override
--guess.
Putting all of these together, add some tests that highlight the
expected behavior of this DWIM logic.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When provided with no options, git switch only allows switching between
branches. The one exception to this is the "Do What I Mean" logic that
allows a unique remote branch name to be interpreted as a request to
create a branch of the same name that is tracking that remote branch.
Unfortunately, the logic for the completion of git switch results in
completing not just branch names, but also pseudorefs like HEAD, tags,
and fully specified <remote>/<branch> references.
For example, we currently complete the following:
$git switch <TAB>
HEAD
branch-in-other
master
master-in-other
matching-branch
matching-tag
other/branch-in-other
other/master-in-other
Indeed, if one were to attempt to use git switch with some of these
provided options, git will reject the request:
$git switch HEAD
fatal: a branch is expected, got 'HEAD
$git switch matching-tag
fatal: a branch is expected, got tag 'matching-tag'
$git switch other/branch-in-other
fatal: a branch is expected, got remote branch 'other/branch-in-other'
Ideally, git switch without options ought to complete only words which
will be accepted. Without options, this means to list local branch names
and the unique remote branch names without their remote name pre-pended.
$git switch <TAB>
branch-in-other
master
master-in-other
matching-branch
Add a test case that highlights this subpar completion. Also add
a similar test for git checkout completion that shows that due to the
complex nature of git checkout, it must complete all references.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When clearing the builtin operations on re-sourcing in the ZSH case we
can use the native ${parameters} associative array keys values to get
the currently `__gitcomp_builtin_*` operations using pattern matching
instead of using sed.
As also stated in commit 94408dc7, introducing this change the usage of
sed has some overhead implications, while ZSH can do this check just
using its native syntax.
Signed-off-by: Marco Trevisan (Treviño) <mail@3v1n0.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original patch selection code was written for `git add -p`, and the
fundamental unit on which it works is a hunk.
We hacked around that to handle deletions back in 24ab81ae4d
(add-interactive: handle deletion of empty files, 2009-10-27). But `git
add -p` would never see a new file, since we only consider the set of
tracked files in the index.
However, since the same machinery was used for `git checkout -p` &
friends, we can see new files.
Handle this case specifically, adding a new prompt for it that is
modeled after the `deleted file` case.
This also fixes the problem where added _empty_ files could not be
staged via `git checkout -p`.
Reported-by: Merlin Büge <toni@bluenox07.de>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit introduced --ignore-date flag to rebase -i, but the
name is rather vague as it does not say whether the author date or the
committer date is ignored. Add an alias to convey the precise purpose.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As part of the on-going effort to retire the apply rebase backend
teach the merge backend how to handle --ignore-date. We take care to
handle the combination of --ignore-date and
--committer-date-is-author-date in the same way as the apply backend.
Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The purpose of amend_author was to free() the malloc()'d string
obtained from get_author() when amending a commit. But we can
also use the variable to free() the author at our convenience.
Rename it to convey this meaning.
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As part of the on-going effort to retire the apply rebase backend teach
the merge backend how to handle --committer-date-is-author-date.
Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rebase is implemented with two different backends - 'apply' and 'merge'
each of which support a different set of options. In particuar the apply
backend supports a number of options implemented by 'git am' that are
not available to the merge backend. As part of an on going effort to
remove the apply backend this patch adds support for the
--ignore-whitespace option to the merge backend. This option treats
lines with only whitespace changes as unchanged and is implemented in
the merge backend by translating it to -Xignore-space-change.
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ls-remote may or may not operate within a repository, and as such will
not have been initialized with the repository's hash algorithm. Even if
it were, the remote side could be using a different algorithm and we
would still want to display those refs properly. Find the hash
algorithm used by the remote side by querying the transport object and
set our hash algorithm accordingly.
Without this change, if the remote side is using SHA-256, we truncate
the refs to 40 hex characters, since that's the length of the default
hash algorithm (SHA-1).
Note that technically this is not a correct setting of the repository
hash algorithm since, if we are in a repository, it might be one of a
different hash algorithm from the remote side. However, our current
code paths don't handle multiple algorithms and won't for some time, so
this is the best we can do. We rely on the fact that ls-remote never
modifies the current repository, which is a reasonable assumption to
make.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test has hard-coded pkt-lines with object IDs. The pkt-line
lengths necessarily differ between hash algorithms, so generate these
lines with the packetize helper so they're always the right size. In
addition, we will require an object-format capability for SHA-256, so
pass that capability on to the upload-pack process.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to communicate the protocol supported by the server side, add
support for advertising the object-format capability. We check that the
client side sends us an identical algorithm if it sends us its own
object-format capability, and assume it speaks SHA-1 if not.
In the test, when we're using an algorithm other than SHA-1, we need to
specify the algorithm in use so we don't get a failure with an "unknown
format" message. Add a test that we handle a mismatched algorithm.
Remove the test_oid_init call since it's no longer necessary.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using protocol v2, we need to know what hash algorithm is used by
the remote end. See if the server has sent us an object-format
capability, and if so, use it to determine the hash algorithm in use and
set that value in the packet reader. Parse the refs using this
algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're parsing refs, we need to know not only what the line we're
parsing is, but also the hash algorithm we should use to parse it, which
is stored in the reader object. Pass the packet reader object through
to the protocol v2 ref parsing function.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using SHA-256, we need to take advantage of the extensions section
in the config file, so we need to use repository format version 1.
Update the test to look for the correct value.
Note that test_oid produces a value without a trailing newline, so use
echo to ensure we print a trailing newline to compare it correctly
against the actual results.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
show-index is capable of reading any possible index file whether or not
the index is inside a repository. However, because our index files lack
metadata about the hash algorithm in use, it's not possible to
autodetect the algorithm that a particular index file is using.
In order to allow us to read index files of any algorithm, let's set up
the .git directory gently so that we default to the algorithm for the
current repository, and add an --object-format option to allow users to
override this setting and continue to run show-index outside of a
repository altogether. Let's also document this new option so that
people can find it and use it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our style these days is to place the description and the opening quote
of the body on the same line as test_expect_success (if it fits), to
place the trailing quote on a line by itself after the body, and to use
tabs. Since we're going to be making several significant changes to
this test, modernize the style to aid in readability of the subsequent
patches.
This patch should have no functional change.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both v2 pack index files and the v3 format specified as part of the
NewHash work have similar data starting at the CRC table. Much of the
existing code wants to read either this table or the offset entries
following it, and in doing so computes the offset each time.
In order to share as much code between v2 and v3, compute the offset of
the CRC table and store it when the pack is opened. Use this value to
compute offsets to not only the CRC table, but to the offset entries
beyond it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the test assertions in this test checks that git branch -m works
even without a .git/config file. However, if the repository requires
configuration extensions, such as because it uses a non-SHA-1 algorithm,
this assertion will fail. Mark the assertion as requiring SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're checking the repository's format, set the hash algorithm at
the same time. This ensures that we perform a suitable initialization
early enough to avoid confusing any parts of the code. If we defer
until later, we can end up with portions of the code which are confused
about the hash algorithm, resulting in segfaults when working with
SHA-256 repositories.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parse the server's object-format capability and respond accordingly,
dying if there is a mismatch.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ensure that we pass the object-format capability in the synthesized test
data so that this test works with algorithms other than SHA-1.
In addition, add an additional test using the old data for when we're
using SHA-1 so that we can be sure that we preserve backwards
compatibility with servers not offering the object-format capability.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When performing a clone, we don't know what hash algorithm the other end
will support. Currently, we don't support fetching data belonging to a
different algorithm, so we must know what algorithm the remote side is
using in order to properly initialize the repository. We can know that
only after fetching the refs, so if the remote side has any references,
use that information to reinitialize the repository with the correct
hash algorithm information.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the object-format extensions that let us determine the hash
algorithm in use when pushing, pulling, and fetching.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement the object-format extensions that let us determine the hash
algorithm in use when pushing or pulling data.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the remote helper docs to document the object-format extensions
we will implement in remote-curl and the transport helper code shortly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Detect when the server doesn't support our hash algorithm and abort.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we're fetching refs, detect the hash algorithm and parse the refs
using that algorithm.
As mentioned in the documentation, if multiple versions of the
object-format capability are provided, we use the first. No known
implementation supports multiple algorithms now, but they may in the
future.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Detect when the server doesn't support our hash algorithm and abort.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're going to be using this function in other files, so no longer mark
this function static.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Detect when the server doesn't support our hash algorithm and abort.
If the server does support our hash, advertise it as part of our
capabilities.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function, server_supports_hash, to see if the remote server
supports a particular hash algorithm when speaking protocol v1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When connecting to a remote system, we need to know what hash algorithm
it will be using to talk to us. Add a hash_algo member to struct
transport and add a function to read this data from the transport
object.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a member for the hash algorithm currently in use to the packet
reader so it can parse references correctly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
So far in protocol v2, all of our server capabilities that have values
have not had values that we've been interested in parsing. For example,
we receive but ignore the agent value.
However, in a future commit, we're going to want to parse out the value
of a server capability. To make this easy, add a function,
server_feature_v2, that can fetch the value provided as part of the
server capability.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a capability response, we can have multiple symref entries. In the
future, we will also allow for multiple hash algorithms to be specified.
To avoid duplication, expand the parse_feature_value function to take an
optional offset where the parsing should begin next time. Add a wrapper
function that allows us to query the next server feature value, and use
it in the existing symref parsing code.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Advertise the current hash algorithm in use by using the object-format
capability as part of the ref advertisement.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing capabilities for the pack protocol, there are times we'll
want to compare the value of a capability to a NUL-terminated string.
Since the data we're reading will be space-terminated, not
NUL-terminated, we need a function that compares the two strings, but
also checks that they're the same length. Otherwise, if we used strncmp
to compare these strings, we might accidentally accept a parameter that
was a prefix of the expected value.
Add a function, xstrncmpz, that takes a NUL-terminated string and a
non-NUL-terminated string, plus a length, and compares them, ensuring
that they are the same length.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future patch, we'll want to access multiple members from struct
packet_reader when parsing references. Therefore, have the ref parsing
code take pointers to struct reader instead of having to pass multiple
arguments to each function.
Rename the len variable to "linelen" to make it clearer what the
variable does in light of the variable change.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To set the default hash algorithm you can set the `GIT_DEFAULT_HASH`
environment variable. In the documentation this variable is named
`GIT_DEFAULT_HASH_ALGORITHM`, which is incorrect.
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The explanation of the `--show-pulls` option added in commit 8d049e182e
("revision: --show-pulls adds helpful merges", 2020-04-10) consists of
several paragraphs and we use "+" throughout to tie them together in one
long chain of list continuations. Only thing is, we're not in any kind
of list, so these pluses end up being rendered literally.
The preceding few paragraphs describe `--ancestry-path` and there we
*do* have a list, since we've started one with `--ancestry-path::`. In
fact, we have several such lists for all the various history-simplifying
options we're discussing earlier in this file.
Thus, we're missing a list both from a consistency point of view and
from a practical rendering standpoint.
Let's start a list for `--show-pulls` where we start actually discussing
the option, and keep the paragraphs preceding it out of that list. That
is, drop all those pluses before the new list we're adding here.
Helped-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When building with asciidoc-8.4.5 (as found on CentOS/Red Hat 6), the
period in the "[[files-in-.gitignore-are-tracked]]" anchor is not
properly parsed as a section:
WARNING: gitfaq.txt: line 245: missing [[files-in-.gitignore-are-tracked]] section
The resulting XML file fails to validate with xmlto:
xmlto: /git/Documentation/gitfaq.xml does not validate (status 3)
xmlto: Fix document syntax or use --skip-validation option
/git/Documentation/gitfaq.xml:3: element refentry: validity error :
Element refentry content does not follow the DTD, expecting
(beginpage? , indexterm* , refentryinfo? , refmeta? , (remark | link
| olink | ulink)* , refnamediv+ , refsynopsisdiv? , (refsect1+ |
refsection+)), got (refmeta refnamediv refsynopsisdiv refsect1
refsect1 refsect1 refsect1 variablelist refsect1 refsect1 )
Document /git/Documentation/gitfaq.xml does not validate
Let's avoid breaking users of platforms which ship an old version of
asciidoc, since the cost to do so is quite low.
Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various doc fixes.
* ma/doc-fixes:
git-sparse-checkout.txt: add missing '
git-credential.txt: use list continuation
git-commit-graph.txt: fix list rendering
git-commit-graph.txt: fix grammo
date-formats.txt: fix list continuation
In standard C, '{ 0 }' can be used as an universal zero-initializer.
However, Sparse complains if this is used on a type where the first
member (possibly nested) is a pointer since Sparse purposely wants
to warn when '0' is used to initialize a pointer type.
Legitimaly, it's desirable to be able to use '{ 0 }' as an idiom
without these warnings [1,2]. To allow this, an option have now
been added to Sparse:
537e3e2dae univ-init: conditionally accept { 0 } without warnings
So, add this option to the SPARSE_FLAGS variable.
Note: The option have just been added to Sparse. So, to benefit
now from this patch it's needed to use the latest Sparse
source from kernel.org. The option will simply be ignored
by older versions of Sparse.
[1] https://lore.kernel.org/r/e6796c60-a870-e761-3b07-b680f934c537@ramsayjones.plus.com
[2] https://lore.kernel.org/r/xmqqd07xem9l.fsf@gitster.c.googlers.com
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated before the transaction is
complete, remote-curl will blindly forward the packets before waiting on
more input from fetch-pack. Meanwhile, fetch-pack will read the
transaction and continue reading, expecting more input to continue the
transaction. This results in a deadlock between the two processes.
This can be seen in the following command which does not terminate:
$ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
Cloning into 'git'...
whereas the v1 version does terminate as expected:
$ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012
Cloning into 'git'...
fatal: the remote end hung up unexpectedly
Instead of blindly forwarding packets, make remote-curl insert a
response end packet after proxying the responses from the remote server
when using stateless_connect(). On the RPC client side, ensure that each
response ends as described.
A separate control packet is chosen because we need to be able to
differentiate between what the remote server sends and remote-curl's
control packets. By ensuring in the remote-curl code that a server
cannot send response end packets, we prevent a malicious server from
being able to perform a denial of service attack in which they spoof a
response end packet and cause the described deadlock to happen.
Reported-by: Force Charlie <charlieio@outlook.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we will use PACKET_READ_RESPONSE_END to separate
messages proxied by remote-curl. To prepare for this, add the
PACKET_READ_RESPONSE_END enum value.
In switch statements that need a case added, die() or BUG() when a
PACKET_READ_RESPONSE_END is unexpected. Otherwise, mirror how
PACKET_READ_DELIM is implemented (especially in cases where packets are
being forwarded).
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated with a partially written
packet, remote-curl will blindly send the partially written packet
before waiting on more input from fetch-pack. Meanwhile, fetch-pack will
read the partial packet and continue reading, expecting more input. This
results in a deadlock between the two processes.
For a stateless connection, inspect packets before sending them and
error out if a packet line packet is incomplete.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `diff.relative` boolean option set to `true` shows only changes in
the current directory/value specified by the `path` argument of the
`relative` option and shows pathnames relative to the aforementioned
directory.
Teach `--no-relative` to override earlier `--relative`
Add for git-format-patch(1) options documentation `--relative` and
`--no-relative`
Signed-off-by: Laurent Arnoud <laurent@spkdev.net>
Acked-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Printing a message directly to stdout could affect TAP processing
and is not really needed, as there is a standard way to skip all
tests that could be used instead, while printing an equivalent
message.
While at it; update the message to better reflect that since
a85efb5985 (t5608-clone-2gb.sh: turn GIT_TEST_CLONE_2GB into a bool,
2019-11-22), the enabling variable should be a recognized boolean
(ex: true, false, 1, 0) and get rid of the prerequisite that used
to guard all the tests, since "skip_all" is just much faster and
idempotent.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we try to create a branch "foo" based on "origin/master" and give
git commit -b an extra unsupported argument "bar", it confusingly
reports:
$ git checkout -b foo origin/master bar
fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
$ git checkout --track -b foo origin/master bar
fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
That's wrong, because it very well understands that "origin/master" is
supposed to be the start point for the new branch and not "bar". Check
if we got a commit and show more fitting messages in that case instead:
$ git checkout -b foo origin/master bar
fatal: Cannot update paths and switch to branch 'foo' at the same time.
$ git checkout --track -b foo origin/master bar
fatal: '--track' cannot be used with updating paths
Original-patch-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test git checkout -b with and without --track and demonstrate unexpected
error messages when it's given an extra (i.e. unsupported) path
argument. In both cases it reports:
$ git checkout -b foo origin/master bar
fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
The problem is that the start point we gave for the new branch is
"origin/master" and "bar" is just some extra argument -- it could even
be a valid commit, which would make the message even more confusing. We
have more fitting error messages in git commit, but get confused; use
the text of the rights ones in the tests.
Reported-by: Dana Dahlstrom <dahlstrom@google.com>
Original-test-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
06f5608c14 (bisect--helper: `bisect_start` shell function partially in C,
2019-01-02) adds a lax parser for `git bisect start` which could result
in a segfault under a bad syntax call for start with custom terms.
Detect if there are enough arguments left in the command line to use for
--term-{old,good,new,bad} and abort with the same syntax error the original
implementation will show if not.
While at it, remove an unnecessary (and incomplete) check for unknown
arguments and make sure to add a test to avoid regressions.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach 'git hook list --porcelain <hookname>', which prints simply the
commands to be run in the order suggested by the config. This option is
intended for use by user scripts, wrappers, or out-of-process Git
commands which still want to execute hooks. For example, the following
snippet might be added to git-send-email.perl to introduce a
`pre-send-email` hook:
sub pre_send_email {
open(my $fh, 'git hook list --porcelain pre-send-email |');
chomp(my @hooks = <$fh>);
close($fh);
foreach $hook (@hooks) {
system $hook
}
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach 'git hook list <hookname>', which checks the known configs in
order to create an ordered list of hooks to run on a given hook event.
Multiple commands can be specified for a given hook by providing
multiple "hook.<hookname>.command = <path-to-hook>" lines. Hooks will be
run in config order. If more properties need to be set on a given hook
in the future, commands can also be specified by providing
"hook.<hookname>.command = <hookcmd-name>", as well as a "[hookcmd
<hookcmd-name>]" subsection; at minimum, this subsection must contain a
"hookcmd.<hookcmd-name>.command = <path-to-hook>" line.
For example:
$ git config --list | grep ^hook
hook.pre-commit.command=baz
hook.pre-commit.command=~/bar.sh
hookcmd.baz.command=~/baz/from/hookcmd.sh
$ git hook list pre-commit
~/baz/from/hookcmd.sh
~/bar.sh
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce infrastructure for a new subcommand, git-hook, which will be
used to ease config-based hook management. This command will handle
parsing configs to compose a list of hooks to run for a given event, as
well as adding or modifying hook configs in an interactive fashion.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Begin a design document for config-based hooks, managed via git-hook.
Focus on an overview of the implementation and motivation for design
decisions. Briefly discuss the alternatives considered before this
point. Also, attempt to redefine terms to fit into a multihook world.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
df70b190 (completion: make stash -p and alias for stash push -p,
2018-04-20) wanted to make sure "git stash -p <TAB>" offers the same
completion as "git stash push -p <TAB>", but it did so by forcing the
$subcommand to be "push" whenever then "-p" option is found on the
command line.
This harms any subcommand that can take the "-p" option---even when the
subcommand is explicitly given, e.g. "git stash show -p", the code added
by the change would overwrite the $subcommand the user gave us.
Fix it by making sure that the defaulting to "push" happens only when
there is no $subcommand given yet.
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* https://github.com/prati0100/git-gui:
git-gui: Handle Ctrl + BS/Del in the commit msg
Subject: git-gui: fix syntax error because of missing semicolon
If the conflict candidate file name from the top of the stack is not a
prefix of the current candiate directory then we can discard it as no
matching directory can come up later. But we are not done checking the
candidate directory -- the stack might still hold a matching file name,
so stay in the loop and check the next candidate file name.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Exercise the case of putting a conflict candidate file name back on the
stack because a matching directory might yet come up later.
Do that by factoring out the test code into a function to allow for more
concise notation in the form of parameters indicating names of trees
(with trailing slash) and blobs (without trailing slash) in no
particular order (they are sorted by git mktree). Then add the new test
case as a second function call.
Fix a typo in the test title while at it ("dublicate").
Reported-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first four bytes of the line, the pkt-len, indicates the total
length of the pkt-line in hexadecimal. Fix wrong pkt-len headers of
some pkt-line messages in `http-protocol.txt` and `pack-protocol.txt`.
Reviewed-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Jiuyang Xie <jiuyang.xjy@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have several code paths in the checkout code which are traversed only
in this case, due to switch having different defaults from checkout.
Let's add a test that the combination of options works and produces the
expected behavior.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we call init_checkout_metadata in reset_tree, we want to pass the
object ID of the commit in question so that it can be passed to filters,
or if there is no commit, the tree. We anticipated this latter case,
which can occur elsewhere in the checkout code, but it cannot occur
here. The only case in which we do not have a commit object is when
invoking git switch with --orphan. Moreover, we can only hit this code
path without a commit object additionally with either --force or
--discard-changes.
In such a case, there is no point initializing the checkout metadata
with a commit or tree because (a) there is no commit, only the empty
tree, and (b) we will never use the data, since no files will be smudged
when checking out a branch with no files. Pass the all-zeros object ID
in this case, since we just need some value which is a valid pointer.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git 2.26 used protocol v2 as its default protocol, but soon after
release, users noticed that the protocol v2 negotiation code was prone
to fail when fetching from some remotes that are far ahead of others
(such as linux-next.git versus Linus's linux.git). That has been
fixed by 0b07eecf6e (Merge branch 'jt/v2-fetch-nego-fix',
2020-05-01), but to be cautious, we are using protocol v0 as the
default in 2.27 to buy some time for any other unanticipated issues to
surface.
To that end, let's ensure that users requesting the bleeding edge
using the feature.experimental flag *do* get protocol v2. This way,
we can gain experience with a wider audience for the new protocol
version and be more confident when it is time to enable it by default
for all users in some future Git version.
Implementation note: this isn't with the rest of the
feature.experimental options in repo-settings.c because those are tied
to a repository object, whereas this code path is used for operations
like "git ls-remote" that do not require a repository.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow deleting words backwards and forwards using Ctrl + Backspace and
Delete in the commit message buffer.
* il/ctrl-bs-del:
git-gui: Handle Ctrl + BS/Del in the commit msg
Document some of the flag options in refs_ref_iterator_begin, and explain how
ref_iterator_advance_fn should handle them.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reading and writing .git/refs/* assumes that refs are stored in the 'files'
ref backend.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
6c722cbe5a (bisect: allow CRLF line endings in "git bisect replay"
input, 2020-05-07) includes CR as a field separator, but relies on
it not being included in the last field, which breaks at least when
running under OpenBSD 6.7's sh.
Instead of just assume the CR will get swallowed, read the rest of
the line into an otherwise unused variable and ignore it everywhere
except on the call for git bisect start, where it matters.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'gitfaq.txt' was added in 2149b6748f (docs: add a FAQ, 2020-03-30),
it was added to the Makefile but not to command-list.txt.
Add it there also, so that the new FAQ is listed in the output of
`git help --guides`.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a BRE, that broke tests 30-32, 37-39, 42 at least with
OpenBSD 6.7; use a simpler ERE.
Fixes: d9f15d37f1 (pull: pass --autostash to merge, 2020-04-07)
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Seems to trigger a bug in at least OpenBSD's 6.7 sh where it is
interpreted as a history lookup and therefore fails 125-126, 128,
130.
Remove the subshell and get a space between ! and grep, so tests
pass successfully.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent attempt to make the test output nicer to view on CI
systems broke TAP output under bash. The effort has been reverted
to be re-attempted in the next cycle.
* jc/fix-tap-output-under-bash:
Revert "tests: when run in Bash, annotate test failures with file name/line number"
Revert "ci: add a problem matcher for GitHub Actions"
Revert "t/test_lib: avoid naked bash arrays in file_lineno"
Last-minute fix for our recent change to allow use of progress API
as a traceable region.
* ds/trace-log-progress-fix:
progress: call trace2_region_leave() only after calling _enter()
Instead of downloading Windows SDK for CI jobs for windows builds
from an external site (wingit.blob.core.windows.net), use the one
created in the windows-build job, to work around quota issues at
the external site.
* js/ci-sdk-download-fix:
ci: avoid pounding on the poor ci-artifacts container
When a binary file gets modified and renamed on both sides of history
to different locations, both files would be written to the working
tree but both would have the contents from "ours". This has been
corrected so that the path from each side gets their original content.
* en/merge-rename-rename-worktree-fix:
merge-recursive: fix rename/rename(1to2) for working tree with a binary
The multi-pack-index was added to the data verified by git-fsck in
ea5ae6c3 "fsck: verify multi-pack-index". This implementation was
based on the implementation for verifying the commit-graph, and a
copy-paste error kept the ERROR_COMMIT_GRAPH flag as the bit set
when an error appears in the multi-pack-index.
Add a new flag, ERROR_MULTI_PACK_INDEX, and use that instead.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
95acf11a3d ("diff: restrict when prefetching occurs", 2020-04-07) taught
diff to prefetch blobs in a more limited set of situations. These
limited situations include when the output format requires blob data,
and when inexact rename detection is needed.
There is an existing test case that tests inexact rename detection, but
it also uses an output format that requires blob data, resulting in the
inexact-rename-detection-only code not being tested. Update this test to
use the raw output format, which does not require blob data.
Thanks to Derrick Stolee for noticing this lapse in code coverage and
for doing the preliminary analysis [1].
[1] https://lore.kernel.org/git/853759d3-97c3-241f-98e1-990883cd204e@gmail.com/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we will be manually processing packets and we will
need to access the length header. In order to simplify this, extern
packet_length() so that the logic can be reused.
Change the function parameter from `const char *linelen` to
`const char lenbuf_hex[4]`. Even though these two types behave
identically as function parameters, use the array notation to
semantically indicate exactly what this function is expecting as an
argument. Also, rename it from linelen to lenbuf_hex as the former
sounds like it should be an integral type which is misleading.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the switch statement, the difference between the `protocol_v2` and
`protocol_v{1,0}` arms is a preparatory call to die_if_server_options() in
the latter. The fetch_pack() call is identical in both arms. However,
since this fetch_pack() call has so many parameters, it is not
immediately obvious that the call is identical in both cases.
Extract the common fetch_pack() call out of the switch statement so that
code duplication is reduced and the logic is more clear for future
readers. While we're at it, rewrite the switch statement as an if-else
tower for increased clarity.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For a merge with a single strategy, the result of evaluate_result() is
effectively not used and therefore is not needed, so avoid altogether.
On Windows, this optimization can halve the time required to perform a
recursive merge of a single commit with the LLVM repo.
Signed-off-by: Andrew Ng <andrew.ng@sony.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On some platforms likes HP-UX, grep(1) doesn't understand "-a".
Let's switch to perl.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git receive-pack" that accepts requests by "git push" learned to
outsource some of the ref updates to the new "proc-receive" hook.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing a pseudo reference (such as "refs/for/master/topic"), may
create or update one or more references. The real names of the
references will be stored in the report options. Parse report options
to create or update remote-tracking branches properly.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to test update of remote-tracking branches for special refs,
add new "remote.origin.fetch" settings and test cases.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new multi-valued config variable "receive.procReceiveRefs"
for `receive-pack` command, like the follows:
git config --system --add receive.procReceiveRefs refs/for
git config --system --add receive.procReceiveRefs refs/drafts
If the specific prefix strings match the reference names of the commands
which are sent from git client to `receive-pack`, these commands will be
executed by an external hook (named "proc-receive"), instead of the
internal `execute_commands` function.
For example, if it is set to "refs/for", pushing to a reference such as
"refs/for/master" will not create or update reference "refs/for/master",
but may create or update a pull request directly by running the hook
"proc-receive".
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new function `ref_is_matched()` to reuse `ref_is_hidden()`. Will use
this function for `receive-pack` to check commands with specific
prefixes.
Test case t5512 covered this change.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commands are fed to the "post-receive" hook, report options will
be parsed and the real old-oid, new-oid, reference name will feed to
the "post-receive" hook.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add ABNF notation for capability 'report-status-v2' which extends
capability 'report-status' by adding additional option lines.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new introduced "proc-receive" hook may handle a command for a
pseudo-reference with a zero-old as its old-oid, while the hook may
create or update a reference with different name, different new-oid,
and different old-oid (the reference may exist already with a non-zero
old-oid). Current "report-status" protocol cannot report the status for
such reference rewrite.
Add new capability "report-status-v2" and new report protocol which is
not backward compatible for report of git-push.
If a user pushes to a pseudo-reference "refs/for/master/topic", and
"receive-pack" creates two new references "refs/changes/23/123/1" and
"refs/changes/24/124/1", for client without the knowledge of
"report-status-v2", "receive-pack" will only send "ok/ng" directives in
the report, such as:
ok ref/for/master/topic
But for client which has the knowledge of "report-status-v2",
"receive-pack" will use "option" directives to report more attributes
for the reference given by the above "ok/ng" directive.
ok refs/for/master/topic
option refname refs/changes/23/123/1
option new-oid <new-oid>
ok refs/for/master/topic
option refname refs/changes/24/124/1
option new-oid <new-oid>
The client will report two new created references to the end user.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git calls an internal `execute_commands` function to handle commands
sent from client to `git-receive-pack`. Regardless of what references
the user pushes, git creates or updates the corresponding references if
the user has write-permission. A contributor who has no
write-permission, cannot push to the repository directly. So, the
contributor has to write commits to an alternate location, and sends
pull request by emails or by other ways. We call this workflow as a
distributed workflow.
It would be more convenient to work in a centralized workflow like what
Gerrit provided for some cases. For example, a read-only user who
cannot push to a branch directly can run the following `git push`
command to push commits to a pseudo reference (has a prefix "refs/for/",
not "refs/heads/") to create a code review.
git push origin \
HEAD:refs/for/<branch-name>/<session>
The `<branch-name>` in the above example can be as simple as "master",
or a more complicated branch name like "foo/bar". The `<session>` in
the above example command can be the local branch name of the client
side, such as "my/topic".
We cannot implement a centralized workflow elegantly by using
"pre-receive" + "post-receive", because Git will call the internal
function "execute_commands" to create references (even the special
pseudo reference) between these two hooks. Even though we can delete
the temporarily created pseudo reference via the "post-receive" hook,
having a temporary reference is not safe for concurrent pushes.
So, add a filter and a new handler to support this kind of workflow.
The filter will check the prefix of the reference name, and if the
command has a special reference name, the filter will turn a specific
field (`run_proc_receive`) on for the command. Commands with this filed
turned on will be executed by a new handler (an hook named
"proc-receive") instead of the internal `execute_commands` function.
We can use this "proc-receive" command to create pull requests or send
emails for code review.
Suggested by Junio, this "proc-receive" hook reads the commands,
push-options (optional), and send result using a protocol in pkt-line
format. In the following example, The letter "S" stands for
"receive-pack" and letter "H" stands for the hook.
# Version and features negotiation.
S: PKT-LINE(version=1\0push-options atomic...)
S: flush-pkt
H: PKT-LINE(version=1\0push-options...)
H: flush-pkt
# Send commands from server to the hook.
S: PKT-LINE(<old-oid> <new-oid> <ref>)
S: ... ...
S: flush-pkt
# Send push-options only if the 'push-options' feature is enabled.
S: PKT-LINE(push-option)
S: ... ...
S: flush-pkt
# Receive result from the hook.
# OK, run this command successfully.
H: PKT-LINE(ok <ref>)
# NO, I reject it.
H: PKT-LINE(ng <ref> <reason>)
# Fall through, let 'receive-pack' to execute it.
H: PKT-LINE(ok <ref>)
H: PKT-LINE(option fall-through)
# OK, but has an alternate reference. The alternate reference name
# and other status can be given in options
H: PKT-LINE(ok <ref>)
H: PKT-LINE(option refname <refname>)
H: PKT-LINE(option old-oid <old-oid>)
H: PKT-LINE(option new-oid <new-oid>)
H: PKT-LINE(option forced-update)
H: ... ...
H: flush-pkt
After receiving a command, the hook will execute the command, and may
create/update different reference. For example, a command for a pseudo
reference "refs/for/master/topic" may create/update different reference
such as "refs/pull/123/head". The alternate reference name and other
status are given in option lines.
The list of commands returned from "proc-receive" will replace the
relevant commands that are sent from user to "receive-pack", and
"receive-pack" will continue to run the "execute_commands" function and
other routines. Finally, the result of the execution of these commands
will be reported to end user.
The reporting function from "receive-pack" to "send-pack" will be
extended in latter commit just like what the "proc-receive" hook reports
to "receive-pack".
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Topic "proc-receive-hook" will change the workflow and output of
git-push. Add some basic test cases in t5411 before introducing the new
topic.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing a new reference (not a head or tag), report it as a new
reference instead of a new branch.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Where we explain the 'reapply' command, we don't properly wrap it in
single quote marks like we do with the other commands: We omit the
closing mark ("'reapply") and this ends up being rendered literally as
"'reapply". Add the missing "'".
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use list continuation to avoid the second and third paragraphs
rendering with a different indentation from the first one where we
describe the "url" attribute.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first list item follows immediately on the paragraph where we
introduce the list. This makes the "*" render literally as part of one
huge paragraph. (With AsciiDoc, everything is fine after that, but with
Asciidoctor, we get some minor follow-on errors.) Add an empty line --
with a list continuation ("+") -- to make the first list item render ok.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's easy to mix up the possessive "its" and "it's" ("it is"). Correct
an instance of this.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The blank line before the lone "+" means it isn't detected as a list
continuation, but instead renders literally, at least with AsciiDoc.
Drop the empty line and, while at it, add a closing period to the
preceding paragraph.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
7187c7bbb8 (t4210: skip i18n tests that don't work on FreeBSD, 2019-11-27)
adds a REG_ILLSEQ prerequisite, and to do that copies the common branch in
test-lib and expands it to include it in a special case for FreeBSD.
Instead; test for it using a previously added extension to test-tool and
use that, together with a function that identifies when regcomp/regexec
will be called with broken patterns to avoid any test that would otherwise
rely on undefined behaviour.
The description of the first test which wasn't accurate has been corrected,
and the test rearranged for clarity, including a helper function that avoids
overly long lines.
Only the affected engines will have their tests suppressed, also including
"fixed" if the PCRE optimization that uses LIBPCRE2 since b65abcafc7
(grep: use PCRE v2 for optimized fixed-string search, 2019-07-01) is not
available.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
7187c7bbb8 (t4210: skip i18n tests that don't work on FreeBSD, 2019-11-27)
adds a REG_ILLSEQ prerequisite to avoid failures from the tests added in
4e2443b181 (log tests: test regex backends in "--encode=<enc>" tests,
2019-06-28), but hardcodes it to be only enabled in FreeBSD.
Instead of hardcoding the affected platform, teach the test-regex helper,
how to validate a pattern and report back, so it can be used to detect the
same issue in other affected systems (like DragonFlyBSD or macOS).
While at it, refactor the tool so it can report back the source of the
errors it founds, and can be invoked also in a --silent mode, when needed,
for backward compatibility. A missing flag has been added and the code
reformatted, as well as updates to the way the parameters are handled, for
consistency.
To minimize changes, it is assumed the regcomp error is of the right type
since we control the only caller, and is also assumed to affect both basic
and extended syntax (only basic is tested, but both behave the same in all
three affected platforms since they use the same function).
Based-on-patch-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's use fields from this struct in
receive_needs(), instead of local variables with the same name
and purpose.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to create_pack_file(),
so that this function, and the function it calls, can use all
the fields of the struct.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's remove the 'stateless_rpc' static
variable, as we can now use the field of 'struct upload_pack_data'
with the same name instead.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to check_non_tip(), so
that this function and the functions it calls, can use all the
fields of the struct in followup commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass that struct to send_ref(), so that
this function, and the functions it calls, can use all the
fields of the struct in followup commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, we are passing around that struct to many
functions, so let's also pass 'struct string_list symref' around
at the same time by moving it from a local variable in
upload_pack() into a field of 'struct upload_pack_data'.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's use the 'struct packet_writer writer'
field from 'struct upload_pack_data' in receive_needs(),
instead of a local 'struct packet_writer writer' variable.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass 'struct upload_pack_data' to
receive_needs(), so that this function and the functions it
calls can use all the fields of that struct in followup commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's pass 'struct upload_pack_data' to
get_common_commits(), so that this function and the functions
it calls can use all the fields of that struct in followup
commits.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's use 'struct upload_pack_data' in
upload_pack().
This will make it possible in followup commits to remove a lot
of static variables and local variables that have the same name
and purpose as fields in 'struct upload_pack_data'. This will
also make upload_pack() work in a more similar way as
upload_pack_v2().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move 'struct upload_pack_data' and the
related upload_pack_data_init() and upload_pack_data_clear()
functions towards the beginning of the file, so that this struct
and its related functions can then be used by upload_pack() in a
followup commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's move the want_obj and have_obj object
arrays into 'struct upload_pack_data'.
These object arrays are used by both upload_pack() and
upload_pack_v2(), for example when these functions call
create_pack_file(). We are going to use
'struct upload_pack_data' in upload_pack() in a followup
commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we cleanup 'upload-pack.c' by using 'struct upload_pack_data'
more thoroughly, let's remove 'struct object_array wants' from
'struct upload_pack_data', as it appears to be unused.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strftime(3) man page is outside of the Git suite. Refererence it as
we do other external man pages and avoid creating a broken link when
generating the HTML documentation.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 7c5c9b9c57 (commit-graph: error out on invalid commit oids in
'write --stdin-commits', 2019-08-05), the commit-graph builtin dies on
receiving non-commit OIDs as input to '--stdin-commits'.
This behavior can be cumbersome to work around in, say, the case of
piping 'git for-each-ref' to 'git commit-graph write --stdin-commits' if
the caller does not want to cull out non-commits themselves. In this
situation, it would be ideal if 'git commit-graph write' wrote the graph
containing the inputs that did pertain to commits, and silently ignored
the remainder of the input.
Some options have been proposed to the effect of '--[no-]check-oids'
which would allow callers to have the commit-graph builtin do just that.
After some discussion, it is difficult to imagine a caller who wouldn't
want to pass '--no-check-oids', suggesting that we should get rid of the
behavior of complaining about non-commit inputs altogether.
If callers do wish to retain this behavior, they can easily work around
this change by doing the following:
git for-each-ref --format='%(objectname) %(objecttype) %(*objecttype)' |
awk '
!/commit/ { print "not-a-commit:"$1 }
/commit/ { print $1 }
' |
git commit-graph write --stdin-commits
To make it so that valid OIDs that refer to non-existent objects are
indeed an error after loosening the error handling, perform an extra
lookup to make sure that object indeed exists before sending it to the
commit-graph internals.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the subsequent commit, we will introduce a dependency on
'graph_read_expect' from t5318.7. Preemptively move it below
'graph_read_expect()'s definition so that the test can call it.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous handful of commits, both 'git commit-graph write
--reachable' and '--stdin-commits' learned to peel tags down to the
commits which they refer to before passing them into the commit-graph
internals.
This makes the call to 'lookup_commit_reference_gently()' inside of
'fill_oids_from_commits()' a noop, since all OIDs are commits by that
point.
As such, remove the call entirely, as well as the progress meter, which
has been split and moved out to the callers in the aforementioned
earlier commits.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When given a list of commits, the commit-graph machinery calls
'lookup_commit_reference_gently()' on each element in the set and treats
the resulting set of OIDs as the base over which to close for
reachability.
In an earlier collection of commits, the 'git commit-graph write
--reachable' case made the inner-most call to
'lookup_commit_reference_gently()' by peeling references before they
were passed over to the commit-graph internals.
Do the analog for 'git commit-graph write --stdin-commits' by calling
'lookup_commit_reference_gently()' outside of the commit-graph
machinery, making the inner-most call a noop.
Since this may incur additional processing time, surround
'read_one_commit' with a progress meter to provide output to the caller.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With either '--stdin-commits' or '--stdin-packs', the commit-graph
builtin will read line-delimited input, and interpret it either as a
series of commit OIDs, or pack names.
In a subsequent commit, we will begin handling '--stdin-commits'
differently by processing each line as it comes in, instead of in one
shot at the end. To make adequate room for this additional logic, split
the '--stdin-commits' case from '--stdin-packs' by only storing the
input when '--stdin-packs' is given.
In the case of '--stdin-commits', feed each line to a new
'read_one_commit' helper, which (for now) will merely call
'parse_oid_hex'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the codebase, labels are aligned to the leftmost column. Remove the
space-indentation from `free_specs:` to conform to this.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a9f1f1f9f8 ("commit-slab.h: code split", 2018-05-19) split
commit-slab.h into commit-slab-decl.h and commit-slab-impl.h header
files, commit-slab-decl.h were left to use "COMMIT_SLAB_HDR_H",
while commit-slab-impl.h gained its own macro, "COMMIT_SLAB_IMPL_H".
As these two files use different include guards, there is nothing
broken, but let's update commit-slab-decl.h to match the convention
to name the include guard after the filename.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From e76eec3554 (ci: allow per-branch config for GitHub Actions,
2020-05-07), we started to allow contributors decide which branch
they want to build with GitHub Actions
by checking for a file named "ci/config/allow-ref".
In order to assist those contributors,
we provided a sample in "ci/config/allow-refs.sample",
and instructed them to drop the ".sample",
then commit that file to their repository.
We've misspelt the filename in that change.
Let's fix the spelling.
While we're at it, also instruct our contributors introduce that new
file to Git before commit, in case of they've never told Git before.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On hppa these tests crash because the allocated stack space is too
small, even after it was doubled in b9a190789 (and the data size
doubled to match) to make it work on powerpc. For this arch just
skip these tests, which is enough to make the whole suite pass.
Fixes: https://bugs.debian.org/757402
Based-on-patch-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Greg Price <gnprice@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 676eb0c1ce0d380478eb16bdc5a3f2a7bc01c1d2;
as we will be reverting the change to show these extra output
tokens under bash, the pattern would not match anything.
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 303775a25f0b4ac5d6ad2e96eb4404c24209cad8;
instead of trying to salvage the tap-breaking change, let's
revert the whole thing for now.
A user of progress API calls start_progress() conditionally and
depends on the display_progress() and stop_progress() functions to
become no-op when start_progress() hasn't been called.
As we added a call to trace2_region_enter() to start_progress(), the
calls to other trace2 API calls from the progress API functions must
make sure that these trace2 calls are skipped when start_progress()
hasn't been called on the progress struct. Specifically, do not
call trace2_region_leave() from stop_progress() when we haven't
called start_progress(), which would have called the matching
trace2_region_enter().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When this developer tested how the git-sdk-64-minimal artifact could be
served to all the GitHub workflow runs that need it, Azure Blobs looked
like a pretty good choice: it is reliable, fast and we already use it in
Git for Windows to serve components like OpenSSL, cURL, etc
It came as an unpleasant surprise just _how many_ times this artifact
was downloaded. It exploded the bandwidth to a point where the free tier
would no longer be enough, threatening to block other, essential Git for
Windows services.
Let's switch back to using the Build Artifacts of our trusty Azure
Pipeline for the time being.
To avoid unnecessary hammering of the Azure Pipeline artifacts, we use
the GitHub Action `actions/upload-artifact` in the `windows-build` job
and the GitHub Action `actions/download-artifact` in the `windows-test`
and `vs-test` jobs (the latter now depends on `windows-build` for that
reason, too).
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit b0a5a12a60 ("unpack-trees: allow check_updates() to work on a
different index", 2020-03-27) allowed check_updates() to work on a
different index, but it called get_progress() which was hardcoded to
work on o->result much like check_updates() had been. Update it to also
accept an index parameter and have check_updates() pass that parameter
along so that both are working on the same index.
Noticed-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach codepaths that show progress meter to also use the
start_progress() and the stop_progress() calls as a "region" to be
traced.
* es/trace-log-progress:
trace2: log progress time and throughput
Code cleanup and typofixes
* ds/bloom-cleanup:
completion: offer '--(no-)patch' among 'git log' options
bloom: use num_changes not nr for limit detection
bloom: de-duplicate directory entries
Documentation: changed-path Bloom filters use byte words
bloom: parse commit before computing filters
test-bloom: fix usage typo
bloom: fix whitespace around tab length
"git fsck" ensures that the paths recorded in tree objects are
sorted and without duplicates, but it failed to notice a case where
a blob is followed by entries that sort before a tree with the same
name. This has been corrected.
* rs/fsck-duplicate-names-in-trees:
fsck: report non-consecutive duplicate names in trees
"git p4" learned to recover from a (broken) state where a directory
and a file are recorded at the same path in the Perforce repository
the same way as their clients do.
* ao/p4-d-f-conflict-recover:
git-p4: recover from inconsistent perforce history
"rebase -i" segfaulted when rearranging a sequence that has a
fix-up that applies another fix-up (which may or may not be a
fix-up of yet another step).
* js/rebase-autosquash-double-fixup-fix:
rebase --autosquash: fix a potential segfault
"git bisect replay" had trouble with input files when they used
CRLF line ending, which has been corrected.
* cw/bisect-replay-with-dos:
bisect: allow CRLF line endings in "git bisect replay" input
ccd469450a (date.c: switch to reentrant {gm,local}time_r, 2019-11-28)
removes the only gmtime() call we had and moves to gmtime_r() which
doesn't have the same portability problems.
Remove the compat gmtime code since it is no longer needed, and confirm
by successfull running t4212 in FreeBSD 9.3 amd64 (the oldest I could
get a hold off).
Further work might be needed to ensure 32bit time_t systems (like FreeBSD
i386) will handle correctly the overflows tested in t4212, but that is
orthogonal to this change, and it doesn't change the current behaviour
as neither gmtime() or gmtime_r() will ever return NULL on those systems
because time_t is unsigned.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With a rename/rename(1to2) conflict, we attempt to do a three-way merge
of the file contents, so that the correct contents can be placed in the
working tree at both paths. If the file is a binary, however, no
content merging is possible and we should just use the original version
of the file at each of the paths.
Reported-by: Chunlin Zhang <zhangchunlin@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document a capability that indicates which hash algorithms are in use by
both sides of a remote connection. Use the term "object-format", since
this is the term used for the repository extension as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While iterating references (to discover the set of commits to write to
the commit-graph with 'git commit-graph write --reachable'),
'add_ref_to_set' can save 'fill_oids_from_commits()' some time by
peeling the references beforehand.
Move peeling out of 'fill_oids_from_commits()' and into
'add_ref_to_set()' to use 'peel_ref()' instead of 'deref_tag()'. Doing
so allows the commit-graph machinery to use the peeled value from
'$GIT_DIR/packed-refs' instead of having to load and parse tags.
While we're at it, discard non-commit objects reachable from ref tips.
This would be done automatically by 'fill_oids_from_commits()', but such
functionality will be removed in a subsequent patch after the call to
'lookup_commit_reference_gently' is dropped (at which point a non-commit
object in the commits oidset will become an error).
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git commit-graph write --reachable' is invoked, the commit-graph
machinery calls 'for_each_ref()' to discover the set of reachable
commits.
Right now the 'add_ref_to_set' callback is not doing anything other than
adding an OID to the set of known-reachable OIDs. In a subsequent
commit, 'add_ref_to_set' will presumptively peel references. This
operation should be fast for repositories with an up-to-date
'$GIT_DIR/packed-refs', but may be slow in the general case.
So that it doesn't appear that 'git commit-graph write' is idling with
'--reachable' in the slow case, add a progress meter to provide some
output in the meantime.
In general, we don't expect a progress meter to appear at all, since
peeling references with a 'packed-refs' file is quick. If it's slow and
we do show a progress meter, the subsequent 'fill_oids_from_commits()'
will be fast, since all of the calls to
'lookup_commit_reference_gently()' will be no-ops.
Both progress meters are delayed, so it is unlikely that more than one
will appear. In either case, this intermediate state will go away in a
handful of patches, at which point there will be at most one progress
meter.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Serving a "git fetch" client over "git://" and "ssh://" protocols
using the on-wire protocol version 2 was buggy on the server end
when the client needs to make a follow-up request to
e.g. auto-follow tags.
* cc/upload-pack-v2-fetch-fix:
upload-pack: clear filter_options for each v2 fetch command
The code to skip unmerged paths in the index when sparse checkout
is in use would have made out-of-bound access of the in-core index
when the last path was unmerged, which has been corrected.
* ds/sparse-updates-oob-access-fix:
unpack-trees: avoid array out-of-bounds error
Instead of always building all branches at GitHub via Actions,
users can specify which branches to build.
* jk/ci-only-on-selected-branches:
ci: allow per-branch config for GitHub Actions
Teach "am", "commit", "merge" and "rebase", when they are run with
the "--quiet" option, to pass "--quiet" down to "gc --auto".
* jc/auto-gc-quiet:
auto-gc: pass --quiet down from am, commit, merge and rebase
auto-gc: extract a reusable helper from "git fetch"
Minor in-code comments and documentation updates around credential
API.
* cb/credential-doc-fixes:
credential: document protocol updates
credential: update gitcredentials documentation
credential: correct order of parameters for credential_match
credential: update description for credential_from_url_gently
The object walk with object filter "--filter=tree:0" can now take
advantage of the pack bitmap when available.
* tb/bitmap-walk-with-tree-zero-filter:
pack-bitmap: pass object filter to fill-in traversal
pack-bitmap.c: support 'tree:0' filtering
pack-bitmap.c: make object filtering functions generic
list-objects-filter: treat NULL filter_options as "disabled"
git-init(1)'s messages is subjected to i18n.
They should be tested by test_i18n* family.
Fix them.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pattern here looking for failures is specific to SHA-1. Let's
create a variable that matches the regex or glob pattern for a path
within the objects directory.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's possible a user may complain about the way that Git interacts with
their interactive shell, e.g. autocompletion or shell prompt. In that
case, it's useful for us to know which shell they're using
interactively.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It may be useful to know which shell Git was built to try to point to,
in the event that shell-based Git commands are failing. $SHELL_PATH is
set during the build and used to launch the manpage viewer, as well as
by git-compat-util.h, and it's used during tests. 'git version
--build-options' is encouraged for use in bug reports, so it makes sense
to include this information there.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than teaching only one operation, like 'git fetch', how to write
down throughput to traces, we can learn about a wide range of user
operations that may seem slow by adding tooling to the progress library
itself. Operations which display progress are likely to be slow-running
and the kind of thing we want to monitor for performance anyways. By
showing object counts and data transfer size, we should be able to
make some derived measurements to ensure operations are scaling the way
we expect.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using git p4 submit with the --prepare-p4-only option, the program
should prepare a single p4 changelist and notify the user that more
commits are pending and then stop processing.
A bug has been introduced by the p4-changelist hook feature that
causes the program to continue to try and process all pending
changelists at the same time.
The function applyCommit returns True when applying the commit
was successful and the program should continue. However, when the
optional flag --prepare-p4-only is set, the program should stop
after the first application.
Change the logic in the run method for P4Submit to check for the
flag --prepare-p4-only after successfully completing the applyCommit
method.
Be aware - this change will fix the existing test error in t9807.23
for --prepare-p4-only. However there is insufficent coverage for
this flag. If more than 1 commit is pending submission to P4, the
method will properly prepare the P4 changelist, however it will
still exit the application with an exitcode of 1.
The current documentation does not define what the exit code should be
in this condition.
(See: https://git-scm.com/docs/git-p4#Documentation/git-p4.txt---prepare-p4-only)
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Control+BackSpace: Delete word to the left of the cursor.
- Control+Delete : Delete word to the right of the cursor.
Originally introduced by BRIEF and Turbo Vision between 1985 and 1992,
they were adopted by most CUA-Compliant UIs, including those of: OS/2,
Windows, Mac OS, Qt, GTK, Open/Libre Office, Gecko, and GNU Emacs.
In both cases Tk already implements the functionality bound to other key
combination, so we use that.
Graphical examples:
Deleting to the left:
v------ pointer
X_WORD____X
^-----^------ selection
Deleting to the right:
v--------- pointer
X_WORD_X
^--^------ selection
Signed-off-by: Ismael Luceno <ismael.luceno@tttech-auto.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Explanations about the configuration variables for git-grep are
duplicated in "Documentation/git-grep.txt" and
"Documentation/config/grep.txt", which can make maintenance difficult.
The first also contains a definition not present in the latter
(grep.fullName). To avoid problems like this, let's unify the
information in the second file and include it in the first.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever GIT_CURL_VERBOSE is set, teach Git to behave as if
GIT_TRACE_CURL=1 and GIT_TRACE_CURL_NO_DATA=1 is set, instead of setting
CURLOPT_VERBOSE.
This is to prevent inadvertent revelation of sensitive data. In
particular, GIT_CURL_VERBOSE redacts neither the "Authorization" header
nor any cookies specified by GIT_REDACT_COOKIES.
Unifying the tracing mechanism also has the future benefit that any
improvements to the tracing mechanism will benefit both users of
GIT_CURL_VERBOSE and GIT_TRACE_CURL, and we do not need to remember to
implement any improvement twice.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Verify that when GIT_TRACE_CURL is set, Git prints out "Authorization:
Basic <redacted>" instead of the base64-encoded authorization details.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous changes to the line-log machinery focused on making the
first result appear faster. This was achieved by no longer walking the
entire commit history before returning the early results. There is still
another way to improve the performance: walk most commits much faster.
Let's use the changed-path Bloom filters to reduce time spent computing
diffs.
Since the line-log computation requires opening blobs and checking the
content-diff, there is still a lot of necessary computation that cannot
be replaced with changed-path Bloom filters. The part that we can reduce
is most effective when checking the history of a file that is deep in
several directories and those directories are modified frequently. In
this case, the computation to check if a commit is TREESAME to its first
parent takes a large fraction of the time. That is ripe for improvement
with changed-path Bloom filters.
We must ensure that prepare_to_use_bloom_filters() is called in
revision.c so that the bloom_filter_settings are loaded into the struct
rev_info from the commit-graph. Of course, some cases are still
forbidden, but in the line-log case the pathspec is provided in a
different way than normal.
Since multiple paths and segments could be requested, we compute the
struct bloom_key data dynamically during the commit walk. This could
likely be improved, but adds code complexity that is not valuable at
this time.
There are two cases to care about: merge commits and "ordinary" commits.
Merge commits have multiple parents, but if we are TREESAME to our first
parent in every range, then pass the blame for all ranges to the first
parent. Ordinary commits have the same condition, but each is done
slightly differently in the process_ranges_[merge|ordinary]_commit()
methods. By checking if the changed-path Bloom filter can guarantee
TREESAME, we can avoid that tree-diff cost. If the filter says "probably
changed", then we need to run the tree-diff and then the blob-diff if
there was a real edit.
The Linux kernel repository is a good testing ground for the performance
improvements claimed here. There are two different cases to test. The
first is the "entire history" case, where we output the entire history
to /dev/null to see how long it would take to compute the full line-log
history. The second is the "first result" case, where we find how long
it takes to show the first value, which is an indicator of how quickly a
user would see responses when waiting at a terminal.
To test, I selected the paths that were changed most frequently in the
top 10,000 commits using this command (stolen from StackOverflow [1]):
git log --pretty=format: --name-only -n 10000 | sort | \
uniq -c | sort -rg | head -10
which results in
121 MAINTAINERS
63 fs/namei.c
60 arch/x86/kvm/cpuid.c
59 fs/io_uring.c
58 arch/x86/kvm/vmx/vmx.c
51 arch/x86/kvm/x86.c
45 arch/x86/kvm/svm.c
42 fs/btrfs/disk-io.c
42 Documentation/scsi/index.rst
(along with a bogus first result). It appears that the path
arch/x86/kvm/svm.c was renamed, so we ignore that entry. This leaves the
following results for the real command time:
| | Entire History | First Result |
| Path | Before | After | Before | After |
|------------------------------|--------|--------|--------|--------|
| MAINTAINERS | 4.26 s | 3.87 s | 0.41 s | 0.39 s |
| fs/namei.c | 1.99 s | 0.99 s | 0.42 s | 0.21 s |
| arch/x86/kvm/cpuid.c | 5.28 s | 1.12 s | 0.16 s | 0.09 s |
| fs/io_uring.c | 4.34 s | 0.99 s | 0.94 s | 0.27 s |
| arch/x86/kvm/vmx/vmx.c | 5.01 s | 1.34 s | 0.21 s | 0.12 s |
| arch/x86/kvm/x86.c | 2.24 s | 1.18 s | 0.21 s | 0.14 s |
| fs/btrfs/disk-io.c | 1.82 s | 1.01 s | 0.06 s | 0.05 s |
| Documentation/scsi/index.rst | 3.30 s | 0.89 s | 1.46 s | 0.03 s |
It is worth noting that the least speedup comes for the MAINTAINERS file
which is
* edited frequently,
* low in the directory heirarchy, and
* quite a large file.
All of those points lead to spending more time doing the blob diff and
less time doing the tree diff. Still, we see some improvement in that
case and significant improvement in other cases. A 2-4x speedup is
likely the more typical case as opposed to the small 5% change for that
file.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous patch made it possible to perform line-level filtering
during history traversal instead of in an expensive preprocessing
step, but it still requires some simpler preprocessing steps, notably
topo-ordering. However, nowadays we have commit-graphs storing
generation numbers, which make it possible to incrementally traverse
the history in topological order, without the preparatory limit_list()
and sort_in_topological_order() steps; see b45424181e (revision.c:
generation-based topo-order algorithm, 2018-11-01).
This patch combines the two, so we can do both the topo-ordering and
the line-level filtering during history traversal, eliminating even
those simpler preprocessing steps, and thus further reducing the delay
before showing the first commit modifying the given line range.
The 'revs->limited' flag plays the central role in this, because, due
to limitations of the current implementation, the generation
number-based topo-ordering is only enabled when this flag remains
unset. Line-level log, however, always sets this flag in
setup_revisions() ever since the feature was introduced in 12da1d1f6f
(Implement line-history search (git log -L), 2013-03-28). The reason
for setting 'limited' is unclear, though, because the line-level log
itself doesn't directly depend on it, and it doesn't affect how the
limit_list() function limits the revision range. However, there is an
indirect dependency: the line-level log requires topo-ordering, and
the "traditional" sort_in_topological_order() requires an already
limited commit list since e6c3505b44 (Make sure we generate the whole
commit list before trying to sort it topologically, 2005-07-06). The
new, generation numbers-based topo-ordering doesn't require a limited
commit list anymore.
So don't set 'revs->limited' for line-level log, unless it is really
necessary, namely:
- The user explicitly requested parent rewriting, because that is
still done in the line_log_filter() preprocessing step (see
previous patch), which requires sort_in_topological_order() and in
turn limit_list() as well.
- A commit-graph file is not available or it doesn't yet contain
generation numbers. In these cases we had to fall back on
sort_in_topological_order() and in turn limit_list(). The
existing condition with generation_numbers_enabled() has already
ensured that the 'limited' flag is set in these cases; this patch
just makes sure that the line-level log sets 'revs->topo_order'
before that condition.
While the reduced delay before showing the first commit is measurable
in git.git, it takes a bigger repository to make it clearly noticable.
In both cases below the line ranges were chosen so that they were
modified rather close to the starting revisions, so the effect of this
change is most noticable.
# git.git
$ time git --no-pager log -L:read_alternate_refs:sha1-file.c -1 v2.23.0
Before:
real 0m0.107s
user 0m0.091s
sys 0m0.013s
After:
real 0m0.058s
user 0m0.050s
sys 0m0.005s
# linux.git
$ time git --no-pager log \
-L:build_restore_work_registers:arch/mips/mm/tlbex.c -1 v5.2
Before:
real 0m1.129s
user 0m1.061s
sys 0m0.069s
After:
real 0m0.096s
user 0m0.087s
sys 0m0.009s
Additional testing by Derrick Stolee: Since this patch improves
the performance for the first result, I repeated the experiment
from the previous patch on the Linux kernel repository, reporting
real time here:
Command: git log -L 100,200:MAINTAINERS -n 1 >/dev/null
Before: 0.71 s
After: 0.05 s
Now, we have dropped the full topo-order of all ~910,000 commits
before reporting the first result. The remaining performance
improvements then are:
1. Update the parent-rewriting logic to be incremental similar to
how "git log --graph" behaves.
2. Use changed-path Bloom filters to reduce the time spend in the
tree-diff to see if the path(s) changed.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As diff_tree_oid() computes a diff, it will terminate early if the
total number of changed paths is strictly larger than max_changes.
This includes the directories that changed, not just the file paths.
However, only the file paths are reflected in the resulting diff
queue's "nr" value.
Use the "num_changes" from diffopt to check if the diff terminated
early. This is incredibly important, as it can result in incorrect
filters! For example, the first commit in the Linux kernel repo
reports only 471 changes, but since these are nested inside several
directories they expand to 513 "real" changes, and in fact the
total list of changes is not reported. Thus, the computed filter
for this commit is incorrect.
Demonstrate the subtle difference by using one fewer file change
in the 'get bloom filter for commit with 513 changes' test. Before,
this edited 513 files inside "bigDir" which hit this inequality.
However, dropping the file count by one demonstrates how the
previous inequality was incorrect but the new one is correct.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current line-level log implementation performs a preprocessing
step in prepare_revision_walk(), during which the line_log_filter()
function filters and rewrites history to keep only commits modifying
the given line range. This preprocessing affects both responsiveness
and correctness:
- Git doesn't produce any output during this preprocessing step.
Checking whether a commit modified the given line range is
somewhat expensive, so depending on the size of the given revision
range this preprocessing can result in a significant delay before
the first commit is shown.
- Limiting the number of displayed commits (e.g. 'git log -3 -L...')
doesn't limit the amount of work during preprocessing, because
that limit is applied during history traversal. Alas, by that
point this expensive preprocessing step has already churned
through the whole revision range to find all commits modifying the
revision range, even though only a few of them need to be shown.
- It rewrites parents, with no way to turn it off. Without the user
explicitly requesting parent rewriting any parent object ID shown
should be that of the immediate parent, just like in case of a
pathspec-limited history traversal without parent rewriting.
However, after that preprocessing step rewrote history, the
subsequent "regular" history traversal (i.e. get_revision() in a
loop) only sees commits modifying the given line range.
Consequently, it can only show the object ID of the last ancestor
that modified the given line range (which might happen to be the
immediate parent, but many-many times it isn't).
This patch addresses both the correctness and, at least for the common
case, the responsiveness issues by integrating line-level log
filtering into the regular revision walking machinery:
- Make process_ranges_arbitrary_commit(), the static function in
'line-log.c' deciding whether a commit modifies the given line
range, public by removing the static keyword and adding the
'line_log_' prefix, so it can be called from other parts of the
revision walking machinery.
- If the user didn't explicitly ask for parent rewriting (which, I
believe, is the most common case):
- Call this now-public function during regular history traversal,
namely from get_commit_action() to ignore any commits not
modifying the given line range.
Note that while this check is relatively expensive, it must be
performed before other, much cheaper conditions, because the
tracked line range must be adjusted even when the commit will
end up being ignored by other conditions.
- Skip the line_log_filter() call, i.e. the expensive
preprocessing step, in prepare_revision_walk(), because, thanks
to the above points, the revision walking machinery is now able
to filter out commits not modifying the given line range while
traversing history.
This way the regular history traversal sees the unmodified
history, and is therefore able to print the object ids of the
immediate parents of the listed commits. The eliminated
preprocessing step can greatly reduce the delay before the first
commit is shown, see the numbers below.
- However, if the user did explicitly ask for parent rewriting via
'--parents' or a similar option, then stick with the current
implementation for now, i.e. perform that expensive filtering and
history rewriting in the preprocessing step just like we did
before, leaving the initial delay as long as it was.
I tried to integrate line-level log filtering with parent rewriting
into the regular history traversal, but, unfortunately, several
subtleties resisted... :) Maybe someday we'll figure out how to do
that, but until then at least the simple and common (i.e. without
parent rewriting) 'git log -L:func:file' commands can benefit from the
reduced delay.
This change makes the failing 'parent oids without parent rewriting'
test in 't4211-line-log.sh' succeed.
The reduced delay is most noticable when there's a commit modifying
the line range near the tip of a large-ish revision range:
# no parent rewriting requested, no commit-graph present
$ time git --no-pager log -L:read_alternate_refs:sha1-file.c -1 v2.23.0
Before:
real 0m9.570s
user 0m9.494s
sys 0m0.076s
After:
real 0m0.718s
user 0m0.674s
sys 0m0.044s
A significant part of the remaining delay is spent reading and parsing
commit objects in limit_list(). With the help of the commit-graph we
can eliminate most of that reading and parsing overhead, so here are
the timing results of the same command as above, but this time using
the commit-graph:
Before:
real 0m8.874s
user 0m8.816s
sys 0m0.057s
After:
real 0m0.107s
user 0m0.091s
sys 0m0.013s
The next patch will further reduce the remaining delay.
To be clear: this patch doesn't actually optimize the line-level log,
but merely moves most of the work from the preprocessing step to the
history traversal, so the commits modifying the line range can be
shown as soon as they are processed, and the traversal can be
terminated as soon as the given number of commits are shown.
Consequently, listing the full history of a line range, potentially
all the way to the root commit, will take the same time as before (but
at least the user might start reading the output earlier).
Furthermore, if the most recent commit modifying the line range is far
away from the starting revision, then that initial delay will still be
significant.
Additional testing by Derrick Stolee: In the Linux kernel repository,
the MAINTAINERS file was changed ~3,500 times across the ~915,000
commits. In addition to that edit frequency, the file itself is quite
large (~18,700 lines). This means that a significant portion of the
computation is taken up by computing the patch-diff of the file. This
patch improves the real time it takes to output the first result quite
a bit:
Command: git log -L 100,200:MAINTAINERS -n 1 >/dev/null
Before: 3.88 s
After: 0.71 s
If we drop the "-n 1" in the command, then there is no change in
end-to-end process time. This is because the command still needs to
walk the entire commit history, which negates the point of this
patch. This is expected.
As a note for future reference, the ~4.3 seconds in the old code
spends ~2.6 seconds computing the patch-diffs, and the rest of the
time is spent walking commits and computing diffs for which paths
changed at each commit. The changed-path Bloom filters could improve
the end-to-end computation time (i.e. no "-n 1" in the command).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When computing a changed-path Bloom filter, we need to take the
files that changed from the diff computation and extract the parent
directories. That way, a directory pathspec such as "Documentation"
could match commits that change "Documentation/git.txt".
However, the current code does a poor job of this process. The paths
are added to a hashmap, but we do not check if an entry already
exists with that path. This can create many duplicate entries and
cause the filter to have a much larger length than it should. This
means that the filter is more sparse than intended, which helps the
false positive rate, but wastes a lot of space.
Properly use hashmap_get() before hashmap_add(). Also be sure to
include a comparison function so these can be matched correctly.
This has an effect on a test in t0095-bloom.sh. This makes sense,
there are ten changes inside "smallDir" so the total number of
paths in the filter should be 11. This would result in 11 * 10 bits
required, and with 8 bits per byte, this results in 14 bytes.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
None of the tests in 't4211-line-log.sh' really check which parent
object IDs are shown in the output, either implicitly as part of
"Merge: ..." lines [1] or explicitly via the '%p' or '%P' format
specifiers in a custom pretty format.
Add two tests to 't4211-line-log.sh' to check which parent object IDs
are shown, one without and one with explicitly requested parent
rewriting, IOW without and with the '--parents' option.
The test without '--parents' is marked as failing, because without
that option parent rewriting should not be performed, and thus the
parent object ID should be that of the immediate parent, just like in
case of a pathspec-limited history traversal without parent rewriting.
The current line-level log implementation, however, performs parent
rewriting unconditionally and without a possibility to turn it off,
and, consequently, it shows the object ID of the most recent ancestor
that modified the given line range.
In both of these new tests we only really care about the object IDs of
the listed commits and their parents, but not the diffs of the line
ranges; the diffs have already been thoroughly checked in the previous
tests.
[1] While one of the tests ('-M -L ':f:b.c' parallel-change') does
list a merge commit, both of its parents happen to modify the
given line range and are listed as well, so the implications of
parent rewriting remained hidden and untested.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In Documentation/technical/commit-graph-format.txt, the definition
of the BIDX chunk specifies the length is a number of 8-byte words.
During development we discovered that using 8-byte words in the
Murmur3 hash algorithm causes issues with big-endian versus little-
endian machines. Thus, the hash algorithm was adapted to work on a
byte-by-byte basis. However, this caused a change in the definition
of a "word" in bloom.h. Now, a "word" is a single byte, which allows
filters to be as small as two bytes. These length-two filters are
demonstrated in t0095-bloom.sh, and a larger filter of length 25 is
demonstrated as well.
The original point of using 8-byte words was for alignment reasons.
It also presented opportunities for extremely sparse Bloom filters
when there were a small number of changes at a commit, creating a
very low false-positive rate. However, modifying the format at this
point is unlikely to be a valuable exercise. Also, this use of
single-byte granularity does present opportunities to save space.
It is unclear if 8-byte alignment of the filters would present any
meaningful performance benefits.
Modify the format document to reflect reality.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the unused fields 'status', 'arg_alloc', 'arg_nr' and 'args'
from 'struct line_log_data'. They were already part of the struct
when it was introduced in commit 12da1d1f6 (Implement line-history
search (git log -L), 2013-03-28), but as far as I can tell none of
them have ever been actually used.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When computing changed-path Bloom filters for a commit, we need to
know if the commit has a parent or not. If the commit is not parsed,
then its parent pointer will be NULL.
As far as I can tell, the only opportunity to reach this code
without parsing the commit is inside "test-tool bloom
get_filter_for_commit" but it is best to be safe.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tree entries are sorted in path order, meaning that directory names get
a slash ('/') appended implicitly. Git fsck checks if trees contains
consecutive duplicates, but due to that ordering there can be
non-consecutive duplicates as well if one of them is a directory and the
other one isn't. Such a tree cannot be fully checked out.
Find these duplicates by recording candidate file names on a stack and
check candidate directory names against that stack to find matches.
Suggested-by: Brandon Williams <bwilliamseng@gmail.com>
Original-test-by: Brandon Williams <bwilliamseng@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Reviewed-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perforce allows you commit files and directories with the same name,
so you could have files //depot/foo and //depot/foo/bar both checked
in. A p4 sync of a repository in this state fails. Deleting one of
the files recovers the repository.
When this happens we want git-p4 to recover in the same way as
perforce.
Note that Perforce has this change in their 2017.1 version:
Bugs fixed in 2017.1
#1489051 (Job #2170) **
Submitting a file with the same name as an existing depot
directory path (or vice versa) will now be rejected.
so people hopefully will not creating damaged Perforce repos
anymore, but "git p4" needs to be able to interact with already
corrupt ones.
Signed-off-by: Andrew Oakley <andrew@adoakley.name>
Reviewed-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When selecting a batch of pack-files to repack in the "git
multi-pack-index repack" command, Git should respect the
repack.packKeptObjects config option. When false, this option says that
the pack-files with an associated ".keep" file should not be repacked.
This config value is "false" by default.
There are two cases for selecting a batch of objects. The first is the
case where the input batch-size is zero, which specifies "repack
everything". The second is with a non-zero batch size, which selects
pack-files using a greedy selection criteria. Both of these cases are
updated and tested.
Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the "repack" subcommand of "git multi-pack-index" command
creates new packfile(s), it does not call the "git repack"
command but instead directly calls the "git pack-objects"
command, and the configuration variables meant for the "git
repack" command, like "repack.usedaeltabaseoffset", are ignored.
Check the configuration variables used by "git repack" ourselves
in "git multi-index-pack" and pass the corresponding options to
underlying "git pack-objects".
Note that `repack.writeBitmaps` configuration is ignored, as the
pack bitmap facility is useful only with a single packfile.
Signed-off-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rearranging the todo list so that the fixups/squashes are reordered
just after the commits they intend to fix up, we use two arrays to
maintain that list: `next` and `tail`.
The idea is that `next[i]`, if set to a non-negative value, contains the
index of the item that should be rearranged just after the `i`th item.
To avoid having to walk the entire `next` chain when appending another
fixup/squash, we also store the end of the `next` chain in `tail[i]`.
The logic we currently use to update these array items is based on the
assumption that given a fixup/squash item at index `i`, we just found
the index `i2` indicating the first item in that fixup chain.
However, as reported by Paul Ganssle, that need not be true: the special
form `fixup! <commit-hash>` is allowed to point to _another_ fixup
commit in the middle of the fixup chain.
Example:
* 0192a To fixup
* 02f12 fixup! To fixup
* 03763 fixup! To fixup
* 04ecb fixup! 02f12
Note how the fourth commit targets the second commit, which is already a
fixup that targets the first commit.
Previously, we would update `next` and `tail` under our assumption that
every `fixup!` commit would find the start of the `fixup!`/`squash!`
chain. This would lead to a segmentation fault because we would actually
end up with a `next[i]` pointing to a `fixup!` but the corresponding
`tail[i]` pointing nowhere, which would the lead to a segmentation
fault.
Let's fix this by _inserting_, rather than _appending_, the item. In
other words, if we make a given line successor of another line, we do
not simply forget any previously set successor of the latter, but make
it a successor of the former.
In the above example, at the point when we insert 04ecb just after
02f12, 03763 would already be recorded as a successor of 04ecb, and we
now "squeeze in" 04ecb.
To complete the idea, we now no longer assume that `next[i]` pointing to
a line means that `last[i]` points to a line, too. Instead, we extend
the concept of `last` to cover also partial `fixup!`/`squash!` chains,
i.e. chains starting in the middle of a larger such chain.
In the above example, after processing all lines, `last[0]`
(corresponding to 0192a) would point to 03763, which indeed is the end
of the overall `fixup!` chain, and `last[1]` (corresponding to 02f12)
would point to 04ecb (which is the last `fixup!` targeting 02f12, but it
has 03763 as successor, i.e. it is not the end of overall `fixup!`
chain).
Reported-by: Paul Ganssle <paul@ganssle.io>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent change to show files and line numbers of a breakage during
test (only available when running the tests with bash) were hurting
other shells with syntax errors, which has been corrected.
* cb/test-bash-lineno-fix:
t/test_lib: avoid naked bash arrays in file_lineno
The basic test did not honor $TEST_SHELL_PATH setting, which has
been corrected.
* cb/t0000-use-the-configured-shell:
t/t0000-basic: make sure subtests also use TEST_SHELL_PATH
The <stdlib.h> header on NetBSD brings in its own definition of
hmac() function (eek), which conflicts with our own and unrelated
function with the same name. Our function has been renamed to work
around the issue.
* cb/avoid-colliding-with-netbsd-hmac:
builtin/receive-pack: avoid generic function name hmac()
"git restore --staged --worktree" now defaults to take the contents
out of "HEAD", instead of erring out.
* es/restore-staged-from-head-by-default:
restore: default to HEAD when combining --staged and --worktree
The coding guideline for shell scripts instructed to refer to a
variable with dollar-sign inside arithmetic expansion to work
around a bug in old versions of dash, which is a thing of the past.
Now we are not forbidden from writing $((var+1)).
* jk/arith-expansion-coding-guidelines:
CodingGuidelines: drop arithmetic expansion advice to use "$x"
The sparse-checkout patterns have been forbidden from excluding all
paths, leaving an empty working tree, for a long time. This
limitation has been lifted.
* ds/sparse-allow-empty-working-tree:
sparse-checkout: stop blocking empty workdirs
"git branch" and other "for-each-ref" variants accepted multiple
--sort=<key> options in the increasing order of precedence, but it
had a few breakages around "--ignore-case" handling, and tie-breaking
with the refname, which have been fixed.
* jk/for-each-ref-multi-key-sort-fix:
ref-filter: apply fallback refname sort only after all user sorts
ref-filter: apply --ignore-case to all sorting keys
The samples in the credential documentation has been updated to
make it clear that we depict what would appear in the .git/config
file, by adding appropriate quotes as needed..
* jk/credential-sample-update:
gitcredentials(7): make shell-snippet example more realistic
gitcredentials(7): clarify quoting of helper examples
With the recent tightening of the code that is used to parse
various parts of a URL for use in the credential subsystem, a
hand-edited credential-store file causes the credential helper to
die, which is a bit too harsh to the users. Demote the error
behaviour to just ignore and keep using well-formed lines instead.
* cb/credential-store-ignore-bogus-lines:
credential-store: ignore bogus lines from store file
credential-store: document the file format a bit more
In error messages that "git switch" mentions its option to create a
new branch, "-b/-B" options were shown, where "-c/-C" options
should be, which has been corrected.
* dl/switch-c-option-in-error-message:
switch: fix errors and comments related to -c and -C
Because of the request/response model of protocol v2, the
upload_pack_v2() function is sometimes called twice in the same
process, while 'struct list_objects_filter_options filter_options'
was declared as static at the beginning of 'upload-pack.c'.
This made the check in list_objects_filter_die_if_populated(), which
is called by process_args(), fail the second time upload_pack_v2() is
called, as filter_options had already been populated the first time.
To fix that, filter_options is not static any more. It's now owned
directly by upload_pack(). It's now also part of 'struct
upload_pack_data', so that it's owned indirectly by upload_pack_v2().
In the long term, the goal is to also have upload_pack() use
'struct upload_pack_data', so adding filter_options to this struct
makes more sense than to have it owned directly by upload_pack_v2().
This fixes the first of the 2 bugs documented by d0badf8797
(partial-clone: demonstrate bugs in partial fetch, 2020-02-21).
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The loop in warn_conflicted_path() that checks for the count of
entries with the same path uses "i+count" for the array
entry. However, the loop only verifies that the value of count is
below the array size. Fix this by adding i to the condition.
I hit this condition during a test of the in-tree sparse-checkout
feature, so it is exercised by the end of the series.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
[jc: readability fix]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We advertise that the bisect log can be corrected in your editor
before being fed to "git bisect replay", but some editors may
turn the line endings to CRLF.
Update the parser of the input lines so that the CR at the end of
the line gets ignored.
Were anyone to intentionally be using terms/revs with embedded CRs,
replaying such bisects will no longer work with this change. I suspect
that this is incredibly rare.
Signed-off-by: Christopher Warrington <chwarr@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert submodule subcommand 'set-url' to a builtin. Port 'set-url' to
'submodule--helper.c' and call the latter via 'git-submodule.sh'.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Occasionally a failure a user is seeing may be related to a specific
hook which is being run, perhaps without the user realizing. While the
contents of hooks can be sensitive - containing user data or process
information specific to the user's organization - simply knowing that a
hook is being run at a certain stage can help us to understand whether
something is going wrong.
Without a definitive list of hook names within the code, we compile our
own list from the documentation. This is likely prone to bitrot, but
designing a single source of truth for acceptable hooks is too much
overhead for this small change to the bugreport tool.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* We need a `final_new_line` to make our source code as text file, per
POSIX and C specification.
* `bloom_filters` should be limited to interal linkage only
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document protocol changes after CVE-2020-11008, including the removal of
references to the override of attributes which is no longer recommended
after CVE-2020-5260 and that might be removed in the future.
While at it do some improvements for clarity and consistency.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify the expected effect of all attributes and how the helpers
are expected to handle them and the context where they operate.
While at it, space the descriptions for clarity, and add a paragraph
mentioning the early termination in the list processing of helpers,
to complement the one about the special "quit" attribute.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
662f9cf154 (tests: when run in Bash, annotate test failures with file
name/line number, 2020-04-11), introduces a way to report the location
(file:lineno) of a failed test case by traversing the bash callstack.
The implementation requires bash and uses shell arrays and is therefore
protected by a guard but NetBSD sh will still have to parse the function
and therefore will result in:
** t0000-basic.sh ***
./test-lib.sh: 681: Syntax error: Bad substitution
Enclose the bash specific code inside an eval to avoid parsing errors in
the same way than 5826b7b595 (test-lib: check Bash version for '-x'
without using shell arrays, 2019-01-03)
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
3f824e91c8 (t/Makefile: introduce TEST_SHELL_PATH, 2017-12-08) allows for
setting a shell for running the tests, but the generated subtests weren't
updated.
Correct that and while at it update it to use write_script.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Depending on the workflows of individual developers, it can either be
convenient or annoying that our GitHub Actions CI jobs are run on every
branch. As an example of annoying: if you carry many half-finished
work-in-progress branches and rebase them frequently against master,
you'd get tons of failure reports that aren't interesting (not to
mention the wasted CPU).
This commit adds a new job which checks a special branch within the
repository for CI config, and then runs a shell script it finds there to
decide whether to skip the rest of the tests. The default will continue
to run tests for all refs if that branch or script is missing.
There have been a few alternatives discussed:
One option is to carry information in the commit itself about whether it
should be tested, either in the tree itself (changing the workflow YAML
file) or in the commit message (a "[skip ci]" flag or similar). But
these are frustrating and error-prone to use:
- you have to manually apply them to each branch that you want to mark
- it's easy for them to leak into other workflows, like emailing patches
We could likewise try to get some information from the branch name. But
that leads to debates about whether the default should be "off" or "on",
and overriding still ends up somewhat awkward. If we default to "on",
you have to remember to name your branches appropriately to skip CI. And
if "off", you end up having to contort your branch names or duplicate
your pushes with an extra refspec.
By comparison, this commit's solution lets you specify your config once
and forget about it, and all of the data is off in its own ref, where it
can be changed by individual forks without touching the main tree.
There were a few design decisions that came out of on-list discussion.
I'll summarize here:
- we could use GitHub's API to retrieve the config ref, rather than a
real checkout (and then just operate on it via some javascript). We
still have to spin up a VM and contact GitHub over the network from
it either way, so it ends up not being much faster. I opted to go
with shell to keep things similar to our other tools (and really
could implement allow-refs in any language you want). This also makes
it easy to test your script locally, and to modify it within the
context of a normal git.git tree.
- we could keep the well-known refname out of refs/heads/ to avoid
cluttering the branch namespace. But that makes it awkward to
manipulate. By contrast, you can just "git checkout ci-config" to
make changes.
- we could assume the ci-config ref has nothing in it except config
(i.e., a branch unrelated to the rest of git.git). But dealing with
orphan branches is awkward. Instead, we'll do our best to efficiently
check out only the ci/config directory using a shallow partial clone,
which allows your ci-config branch to be just a normal branch, with
your config changes on top.
- we could provide a simpler interface, like a static list of ref
patterns. But we can't get out of spinning up a whole VM anyway, so
we might as well use that feature to make the config as flexible as
possible. If we add more config, we should be able to reuse our
partial-clone to set more outputs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These commands take the --quiet option for their own operation, but
they forget to pass the option down when they invoke "git gc --auto"
internally.
Teach them to do so using the run_auto_gc() helper we added in the
previous step.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back in 1991006c (fetch: convert argv_gc_auto to struct argv_array,
2014-08-16), we taught "git fetch --quiet" to pass the "--quiet"
option down to "gc --auto". This issue, however, is not limited to
"fetch":
$ git grep -e 'gc.*--auto' \*.c
finds hits in "am", "commit", "merge", and "rebase" and these
commands do not pass "--quiet" down to "gc --auto" when they
themselves are told to be quiet.
As a preparatory step, let's introduce a helper function
run_auto_gc(), that the caller can pass a boolean "quiet",
and redo the fix to "git fetch" using the helper.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In two tests introduced by 4fa3f00abb ("fetch-pack: in protocol v2,
in_vain only after ACK", 2020-04-28) and 2f0a093dd6 ("fetch-pack: in
protocol v2, reset in_vain upon ACK", 2020-04-28), the count of objects
downloaded is checked by grepping for a specific message in the packet
trace. However, this is flaky as that specific message may be delivered
over 2 or more packet lines.
Instead, grep over stderr, just like the "fetch creating new shallow
root" test in the same file.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an issue in 'Common Issues' section which addresses the confusion
between performing a 'fetch' and a 'pull'.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitcredentials(7) already mentions several possible invocations that one
can use as the value for credential.helper. However, many people are
not aware that there are other options than a simple credential helper
name, so let's place some explanatory text in the documentation for
credential.helper as well.
We still refer the user to gitcredential(7) for additional explanations
and helpful examples.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add issue in 'Common Issues' section which addresses the problem of
Git tracking files/paths mentioned in '.gitignore'.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In documentation pertaining to autostash behavior, we refer to the
"stash reflog". This description is too low-level as the reflog refers
to an implementation detail of how the stash works and, for end-users,
they do not need to be aware of this at all.
Change references of "stash reflog" to "stash list", which should
provide more accessible terminology for end-users.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The same as js/partial-urlmatch-2.17, built on more recent codebase
to avoid unnecessary merge conflicts.
* js/partial-urlmatch:
credential: handle `credential.<partial-URL>.<key>` again
credential: optionally allow partial URLs in credential_from_url_gently()
Recent updates broke parsing of "credential.<url>.<key>" where
<url> is not a full URL (e.g. [credential "https://"] helper = ...)
stopped working, which has been corrected.
* js/partial-urlmatch-2.17:
credential: handle `credential.<partial-URL>.<key>` again
credential: optionally allow partial URLs in credential_from_url_gently()
credential: fix grammar
Some of the files commit-graph subsystem keeps on disk did not
correctly honor the core.sharedRepository settings and some were
left read-write.
* tb/commit-graph-perm-bits:
commit-graph.c: make 'commit-graph-chain's read-only
commit-graph.c: ensure graph layers respect core.sharedRepository
commit-graph.c: write non-split graphs as read-only
lockfile.c: introduce 'hold_lock_file_for_update_mode'
tempfile.c: introduce 'create_tempfile_mode'
The approxidate parser learns to parse seconds with fraction.
* dd/iso-8601-updates:
date.c: allow compact version of ISO-8601 datetime
date.c: skip fractional second part of ISO-8601
date.c: validate and set time in a helper function
date.c: s/is_date/set_date/
Update the parser used for credential.<URL>.<variable>
configuration, to handle <URL>s with '/' in them correctly.
* bc/wildcard-credential:
credential: fix matching URLs with multiple levels in path
By default, files are restored from the index for --worktree, and from
HEAD for --staged. When --worktree and --staged are combined, --source
must be specified to disambiguate the restore source[1], thus making it
cumbersome to restore a file in both the worktree and the index.
However, HEAD is also a reasonable default for --worktree when combined
with --staged, so make it the default anytime --staged is used (whether
combined with --worktree or not).
[1]: Due to an oversight, the --source requirement, though documented,
is not actually enforced.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fabec2c5c3 (builtin/receive-pack: switch to use the_hash_algo, 2019-08-18)
renames hmac_sha1 to hmac, as it was updated to use the hash function used
by git (which won't be sha1 in the future).
hmac() is provided by NetBSD >= 8 libc and therefore conflicts as shown by :
builtin/receive-pack.c:421:13: error: conflicting types for 'hmac'
static void hmac(unsigned char *out,
^~~~
In file included from ./git-compat-util.h:172:0,
from ./builtin.h:4,
from builtin/receive-pack.c:1:
/usr/include/stdlib.h:305:10: note: previous declaration of 'hmac' was here
ssize_t hmac(const char *, const void *, size_t, const void *, size_t, void *,
^~~~
Rename it again to hmac_hash to reflect it will use the git's defined hash
function and avoid the conflict, while at it update a comment to better
describe the HMAC function that was used.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In subsequent patches, we are going to update a progress meter when
'add_ref_to_set()' is called, and need a convenient way to pass a
'struct progress *' in from the caller.
Introduce 'refs_cb_data' as a catch-all for parameters that
'add_ref_to_set' may need, and wrap the existing single parameter in
that struct.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the beginning in 118250728e (credential: apply helper config,
2011-12-10), the declaration for that function used a different order
than the implementation.
All callers use the same order than the implementation, so update
the declaration in credential.h to match.
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
c44088ecc4 (credential: treat URL without scheme as invalid, 2020-04-18)
changes the implementation for this function to return -1 if protocol is
missing.
Update blurb to match implementation.
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes a bitmap traversal still has to walk some commits manually,
because those commits aren't included in the bitmap packfile (e.g., due
to a push or commit since the last full repack). If we're given an
object filter, we don't pass it down to this traversal. It's not
necessary for correctness because the bitmap code has its own filters to
post-process the bitmap result (which it must, to filter out the objects
that _are_ mentioned in the bitmapped packfile).
And with blob filters, there was no performance reason to pass along
those filters, either. The fill-in traversal could omit them from the
result, but it wouldn't save us any time to do so, since we'd still have
to walk each tree entry to see if it's a blob or not.
But now that we support tree filters, there's opportunity for savings. A
tree:depth=0 filter means we can avoid accessing trees entirely, since
we know we won't them (or any of the subtrees or blobs they point to).
The new test in p5310 shows this off (the "partial bitmap" state is one
where HEAD~100 and its ancestors are all in a bitmapped pack, but
HEAD~100..HEAD are not). Here are the results (run against linux.git):
Test HEAD^ HEAD
-------------------------------------------------------------------------------------------------
[...]
5310.16: rev-list with tree filter (partial bitmap) 0.19(0.17+0.02) 0.03(0.02+0.01) -84.2%
The absolute number of savings isn't _huge_, but keep in mind that we
only omitted 100 first-parent links (in the version of linux.git here,
that's 894 actual commits). In a more pathological case, we might have a
much larger proportion of non-bitmapped commits. I didn't bother
creating such a case in the perf script because the setup is expensive,
and this is plenty to show the savings as a percentage.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous patch, we made it easy to define other filters that
exclude all objects of a certain type. Use that in order to implement
bitmap-level filtering for the '--filter=tree:<n>' filter when 'n' is
equal to 0.
The general case is not helped by bitmaps, since for values of 'n > 0',
the object filtering machinery requires a full-blown tree traversal in
order to determine the depth of a given tree. Caching this is
non-obvious, too, since the same tree object can have a different depth
depending on the context (e.g., a tree was moved up in the directory
hierarchy between two commits).
But, the 'n = 0' case can be helped, and this patch does so. Running
p5310.11 in this tree and on master with the kernel, we can see that
this case is helped substantially:
Test master this tree
--------------------------------------------------------------------------------
5310.11: rev-list count with tree:0 10.68(10.39+0.27) 0.06(0.04+0.01) -99.4%
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 4f3bd5606a (pack-bitmap: implement BLOB_NONE filtering, 2020-02-14),
filtering support for bitmaps was added for the 'LOFC_BLOB_NONE' filter.
In the future, we would like to add support for filters that behave as
if they exclude a certain type of object, for e.g., the tree depth
filter with depth 0.
To prepare for this, make some of the functions used for filtering more
generic, such as 'find_tip_blobs' and 'filter_bitmap_blob_none' so that
they can work over arbitrary object types.
To that end, create 'find_tip_objects' and
'filter_bitmap_exclude_type', and redefine the aforementioned functions
in terms of those.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most callers, we have an actual list_objects_filter_options struct,
and if no filtering is desired its "choice" element will be
LOFC_DISABLED. However, some code may have only a pointer to such a
struct which may be NULL (because _their_ callers didn't care about
filtering, either). Rather than forcing them to handle this explicitly
like:
if (filter_options)
traverse_commit_list_filtered(filter_options, revs,
show_commit, show_object,
show_data, NULL);
else
traverse_commit_list(revs, show_commit, show_object,
show_data);
let's just treat a NULL filter_options the same as LOFC_DISABLED. We
only need a small change, since that option struct is converted into a
real filter only in the "init" function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A fuzzer running on the entry point provided by fuzz-commit-graph.c
revealed a memory leak when parse_commit_graph() creates a struct
bloom_filter_settings and then returns early due to error. Fix that
error by always freeing that struct first (if it exists) before
returning early due to error.
While making that change, I also noticed another possible memory leak -
when the BLOOMDATA chunk is provided but not BLOOMINDEXES. Also fix that
error.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 9e468334b4 (ref-filter: fallback on alphabetical comparison,
2015-10-30) taught ref-filter's sort to fallback to comparing refnames.
But it did it at the wrong level, overriding the comparison result for a
single "--sort" key from the user, rather than after all sort keys have
been exhausted.
This worked correctly for a single "--sort" option, but not for multiple
ones. We'd break any ties in the first key with the refname and never
evaluate the second key at all.
To make matters even more interesting, we only applied this fallback
sometimes! For a field like "taggeremail" which requires a string
comparison, we'd truly return the result of strcmp(), even if it was 0.
But for numerical "value" fields like "taggerdate", we did apply the
fallback. And that's why our multiple-sort test missed this: it uses
taggeremail as the main comparison.
So let's start by adding a much more rigorous test. We'll have a set of
commits expressing every combination of two tagger emails, dates, and
refnames. Then we can confirm that our sort is applied with the correct
precedence, and we'll be hitting both the string and value comparators.
That does show the bug, and the fix is simple: moving the fallback to
the outer compare_refs() function, after all ref_sorting keys have been
exhausted.
Note that in the outer function we don't have an "ignore_case" flag, as
it's part of each individual ref_sorting element. It's debatable what
such a fallback should do, since we didn't use the user's keys to match.
But until now we have been trying to respect that flag, so the
least-invasive thing is to try to continue to do so. Since all callers
in the current code either set the flag for all keys or for none, we can
just pull the flag from the first key. In a hypothetical world where the
user really can flip the case-insensitivity of keys separately, we may
want to extend the code to distinguish that case from a blanket
"--ignore-case".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the ref-filter users (for-each-ref, branch, and tag) take an
--ignore-case option which makes filtering and sorting case-insensitive.
However, this option was applied only to the first element of the
ref_sorting list. So:
git for-each-ref --ignore-case --sort=refname
would do what you expect, but:
git for-each-ref --ignore-case --sort=refname --sort=taggername
would sort the primary key (taggername) case-insensitively, but sort the
refname case-sensitively. We have two options here:
- teach callers to set ignore_case on the whole list
- replace the ref_sorting list with a struct that contains both the
list of sorting keys, as well as options that apply to _all_
keys
I went with the first one here, as it gives more flexibility if we later
want to let the users set the flag per-key (presumably through some
special syntax when defining the key; for now it's all or nothing
through --ignore-case).
The new test covers this by sorting on both tagger and subject
case-insensitively, which should compare "a" and "A" identically, but
still sort them before "b" and "B". We'll break ties by sorting on the
refname to give ourselves a stable output (this is actually supposed to
be done automatically, but there's another bug which will be fixed in
the next commit).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the error condition when updating the sparse-checkout leaves
an empty working directory.
This behavior was added in 9e1afb167 (sparse checkout: inhibit empty
worktree, 2009-08-20). The comment was added in a7bc906f2 (Add
explanation why we do not allow to sparse checkout to empty working
tree, 2011-09-22) in response to a "dubious" comment in 84563a624
(unpack-trees.c: cosmetic fix, 2010-12-22).
With the recent "cone mode" and "git sparse-checkout init [--cone]"
command, it is common to set a reasonable sparse-checkout pattern
set of
/*
!/*/
which matches only files at root. If the repository has no such files,
then their "git sparse-checkout init" command will fail.
Now that we expect this to be a common pattern, we should not have the
commands fail on an empty working directory. If it is a confusing
result, then the user can recover with "git sparse-checkout disable"
or "git sparse-checkout set". This is especially simple when using cone
mode.
Reported-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The advice to use "$x" rather than "x" in arithmetric expansion was
working around a dash bug fixed in 0.5.4. Even Debian oldstable has
0.5.8 these days. And in the meantime, we've added almost two dozen
instances of the "x" form which you can find with:
git grep '$(([a-z]'
and nobody seems to have complained. Let's declare this workaround
obsolete and simplify our style guide.
Helped-by: Danh Doan <congdanhqx@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the added checks for invalid URLs in credentials, any locally
modified store files which might have empty lines or even comments
were reported[1] failing to parse as valid credentials.
Instead of doing a hard check for credentials, do a soft one and
therefore avoid the reported fatal error.
While at it add tests for all known corruptions that are currently
ignored to keep track of them and avoid the risk of regressions.
[1] https://stackoverflow.com/a/61420852/5005936
Reported-by: Dirk <dirk@ed4u.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's typical to find Markdown documentation alongside source code, and
having better context for documentation changes is useful; see also
commit 69f9c87d4 (userdiff: add support for Fountain documents,
2015-07-21).
The pattern is based on the CommonMark specification 0.29, section 4.2
<https://spec.commonmark.org/> but doesn't match empty headings, as
seeing them in a hunk header is unlikely to be useful.
Only ATX headings are supported, as detecting setext headings would
require printing the line before a pattern matches, or matching a
multiline pattern. The word-diff pattern is the same as the pattern for
HTML, because many Markdown parsers accept inline HTML.
Signed-off-by: Ash Holland <ash@sorrel.sh>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The upload-pack protocol v2 gave up too early before finding a
common ancestor, resulting in a wasteful fetch from a fork of a
project. This has been corrected to match the behaviour of v0
protocol.
* jt/v2-fetch-nego-fix:
fetch-pack: in protocol v2, reset in_vain upon ACK
fetch-pack: in protocol v2, in_vain only after ACK
fetch-pack: return enum from process_acks()
Error and verbose trace messages from "git push" did not redact
credential material embedded in URLs.
* js/anonymise-push-url-in-errors:
push: anonymize URLs in error messages and warnings
The "bugreport" tool.
* es/bugreport:
bugreport: drop extraneous includes
bugreport: add compiler info
bugreport: add uname info
bugreport: gather git version and build info
bugreport: add tool to generate debugging info
help: move list_config_help to builtin/help
Incompatible options "--root" and "--fork-point" of "git rebase"
have been marked and documented as being incompatible.
* en/rebase-root-and-fork-point-are-incompatible:
rebase: display an error if --root and --fork-point are both provided
Recent update to Homebrew used by macOS folks breaks build by
moving gettext library and necessary headers.
* ds/build-homebrew-gettext-fix:
macOS/brew: let the build find gettext headers/libraries/msgfmt
Compilation fix.
* dd/sparse-fixes:
progress.c: silence cgcc suggestion about internal linkage
graph.c: limit linkage of internal variable
compat/regex: move stdlib.h up in inclusion chain
test-parse-pathspec-file.c: s/0/NULL/ for pointer type
The multi-pack-index left mmapped file descriptors open when it
does not have to.
* ds/multi-pack-index:
multi-pack-index: close file descriptor after mmap
"git blame" learns to take advantage of the "changed-paths" Bloom
filter stored in the commit-graph file.
* ds/blame-on-bloom:
test-bloom: check that we have expected arguments
test-bloom: fix some whitespace issues
blame: drop unused parameter from maybe_changed_path
blame: use changed-path Bloom filters
tests: write commit-graph with Bloom filters
revision: complicated pathspecs disable filters
Introduce an extension to the commit-graph to make it efficient to
check for the paths that were modified at each commit using Bloom
filters.
* gs/commit-graph-path-filter:
bloom: ignore renames when computing changed paths
commit-graph: add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag
t4216: add end to end tests for git log with Bloom filters
revision.c: add trace2 stats around Bloom filter usage
revision.c: use Bloom filters to speed up path based revision walks
commit-graph: add --changed-paths option to write subcommand
commit-graph: reuse existing Bloom filters during write
commit-graph: write Bloom filters to commit graph file
commit-graph: examine commits by generation number
commit-graph: examine changed-path objects in pack order
commit-graph: compute Bloom filters for changed paths
diff: halt tree-diff early after max_changes
bloom.c: core Bloom filter implementation for changed paths.
bloom.c: introduce core Bloom filter constructs
bloom.c: add the murmur3 hash implementation
commit-graph: define and use MAX_NUM_CHUNKS
The commit-graph code exhausted file descriptors easily when it
does not have to.
* tb/commit-graph-fd-exhaustion-fix:
commit-graph: close descriptors after mmap
commit-graph.c: gracefully handle file descriptor exhaustion
t/test-lib.sh: make ULIMIT_FILE_DESCRIPTORS available to tests
commit-graph.c: don't use discarded graph_name in error
Fix in-core inconsistency after fetching into a shallow repository
that broke the code to write out commit-graph.
* tb/reset-shallow:
shallow.c: use '{commit,rollback}_shallow_file'
t5537: use test_write_lines and indented heredocs for readability
Tighten "git mailinfo" to notice and error out when decoded result
contains NUL in it.
* dd/mailinfo-with-nul:
mailinfo: disallow NUL character in mail's header
mailinfo.c: avoid strlen on strings that can contains NUL
t4254: merge 2 steps of a single test
Test clean-up.
* dl/test-must-fail-fixes-4:
t9819: don't use test_must_fail with p4
t9164: use test_must_fail only on git commands
t9160: use test_path_is_missing()
t9141: use test_path_is_missing()
t7508: don't use `test_must_fail test_cmp`
t7408: replace incorrect uses of test_must_fail
t6030: use test_path_is_missing()
The build procedure did not use the libcurl library and its include
files correctly for a custom-built installation.
* jk/build-with-right-curl:
Makefile: avoid running curl-config unnecessarily
Makefile: use curl-config --cflags
Makefile: avoid running curl-config multiple times
Fix alignment issues that were likely introduced due to an editor
using tab lengths of 4 instead of 8.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's an example of using your own bit of shell to act as a credential
helper, but it's not very realistic:
- It's stupid to hand out your secret password to _every_ host. In the
real world you'd use the config-matcher to limit it to a particular
host.
- We never provided a username. We can easily do that in another config
option (you can do it in the helper, too, but this is much more
readable).
- We were sending the secret even for store/erase operations. This
is OK because Git would just ignore it, but a real system would
probably be unlocking a password store, which you wouldn't want to do
more than necessary.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We give several helper config examples, but don't make clear that these
are raw values. It's up to the user to add the appropriate quoting to
put them into a config file (either by running with "git config" and
quoting against the shell, or by adding double-quotes as appropriate
within the git-config file).
Let's flesh them out as full config blocks, which makes the syntax more
clear (and makes it possible for people to just cut-and-paste them as a
starting point). I added double-quotes to any values larger than a
single word. That isn't strictly necessary in all cases, but it
sidesteps explaining the rules about exactly when you need to quote a
value.
The existing quotes can be converted to single-quotes in one instance,
and backslash-esccaped in the other. I also swapped out backticks for
our preferred $().
Reported-by: douglas.fuller@gmail.com
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In previous patches, the functions 'commit_shallow_file' and
'rollback_shallow_file' were introduced to reset the shallowness
validity checks on a repository after potentially modifying
'.git/shallow'.
These functions can be made safer by wrapping the 'struct lockfile *' in
a new type, 'shallow_lock', so that they cannot be called with a raw
lock (and potentially misused by other code that happens to possess a
lockfile, but has nothing to do with shallowness).
This patch introduces that type as a thin wrapper around 'struct
lockfile', and updates the two aforementioned functions and their
callers to use it.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'commit_shallow_file()' and 'rollback_shallow_file()' were
introduced, they did not have a documenting comment, when they could
have benefited from one.
Add a brief note about what these functions do, and make a special note
that they reset stat-validity checks.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many functions in commit.h that are more related to shallow
repositories than they are to any sort of generic commit machinery.
Likely this began when there were only a few shallow-related functions,
and commit.h seemed a reasonable enough place to put them.
But, now there are a good number of shallow-related functions, and
placing them all in 'commit.h' doesn't make sense.
This patch extracts a 'shallow.h', which takes all of the declarations
from 'commit.h' for functions which already exist in 'shallow.c'. We
will bring the remaining shallow-related functions defined in 'commit.c'
in a subsequent patch.
For now, move only the ones that already are implemented in 'shallow.c',
and update the necessary includes.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch, some functions will be moved from 'commit.c' to have
prototypes in a new 'shallow.h' and their implementations in
'shallow.c'.
Three functions in 'commit.c' use 'commit_graft_pos()' (they are
'register_commit_graft()', 'lookup_commit_graft()', and
'unregister_shallow()'). The first two of these will stay in 'commit.c',
but the latter will move to 'shallow.c', and thus needs
'commit_graft_pos' to be non-static.
Prepare for that by making 'commit_graft_pos' non-static so that it can
be called from both 'commit.c' and 'shallow.c'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d787d311db (checkout: split part of it to new command 'switch',
2019-03-29), the `git switch` command was created by extracting the
common functionality of cmd_checkout() in checkout_main(). However, in
b7b5fce270 (switch: better names for -b and -B, 2019-03-29), the branch
creation and force creation options for 'switch' were changed to -c and
-C, respectively. As a result of this, error messages and comments that
previously referred to `-b` and `-B` became invalid for `git switch`.
For error messages that refer to `-b` and `-B`, use a format string
instead so that `-c` and `-C` can be printed when `git switch` is
invoked.
Reported-by: Robert Simpson
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git update-ref --stdin" learned a handful of new verbs to let the
user control ref update transactions more explicitly, which helps
as an ingredient to implement two-phase commit-style atomic
ref-updates across multiple repositories.
* ps/transactional-update-ref-stdin:
update-ref: implement interactive transaction handling
update-ref: read commands in a line-wise fashion
update-ref: move transaction handling into `update_refs_stdin()`
update-ref: pass end pointer instead of strbuf
update-ref: drop unused argument for `parse_refname`
update-ref: organize commands in an array
strbuf: provide function to append whole lines
git-update-ref.txt: add missing word
refs: fix segfault when aborting empty transaction
The directory traversal code had redundant recursive calls which
made its performance characteristics exponential with respect to
the depth of the tree, which was corrected.
* en/fill-directory-exponential:
completion: fix 'git add' on paths under an untracked directory
Fix error-prone fill_directory() API; make it only return matches
dir: replace double pathspec matching with single in treat_directory()
dir: include DIR_KEEP_UNTRACKED_CONTENTS handling in treat_directory()
dir: replace exponential algorithm with a linear one
dir: refactor treat_directory to clarify control flow
dir: fix confusion based on variable tense
dir: fix broken comment
dir: consolidate treat_path() and treat_one_path()
dir: fix simple typo in comment
t3000: add more testcases testing a variety of ls-files issues
t7063: more thorough status checking
"sparse-checkout" UI improvements.
* en/sparse-checkout:
sparse-checkout: provide a new reapply subcommand
unpack-trees: failure to set SKIP_WORKTREE bits always just a warning
unpack-trees: provide warnings on sparse updates for unmerged paths too
unpack-trees: make sparse path messages sound like warnings
unpack-trees: split display_error_msgs() into two
unpack-trees: rename ERROR_* fields meant for warnings to WARNING_*
unpack-trees: move ERROR_WOULD_LOSE_SUBMODULE earlier
sparse-checkout: use improved unpack_trees porcelain messages
sparse-checkout: use new update_sparsity() function
unpack-trees: add a new update_sparsity() function
unpack-trees: pull sparse-checkout pattern reading into a new function
unpack-trees: do not mark a dirty path with SKIP_WORKTREE
unpack-trees: allow check_updates() to work on a different index
t1091: make some tests a little more defensive against failures
unpack-trees: simplify pattern_list freeing
unpack-trees: simplify verify_absent_sparse()
unpack-trees: remove unused error type
unpack-trees: fix minor typo in comment
Update the CI configuration to use GitHub Actions, retiring the one
based on Azure Pipelines.
* dd/ci-swap-azure-pipelines-with-github-actions:
ci: let GitHub Actions upload failed tests' directories
ci: add a problem matcher for GitHub Actions
tests: when run in Bash, annotate test failures with file name/line number
ci: retire the Azure Pipelines definition
README: add a build badge for the GitHub Actions runs
ci: configure GitHub Actions for CI/PR
ci: run gem with sudo to install asciidoctor
ci: explicit install all required packages
ci: fix the `jobname` of the `GETTEXT_POISON` job
ci/lib: set TERM environment variable if not exist
ci/lib: allow running in GitHub Actions
ci/lib: if CI type is unknown, show the environment variables
A new CI job to build and run test suite on linux with musl libc
has been added.
* dd/ci-musl-libc:
travis: build and test on Linux with musl libc and busybox
ci/linux32: libify install-dependencies step
ci: refactor docker runner script
ci/linux32: parameterise command to switch arch
ci/lib-docker: preserve required environment variables
ci: make MAKEFLAGS available inside the Docker container in the Linux32 job
The stash entry created by "git rebase --autosquash" to keep the
initial dirty state were discarded by mistake upon "git rebase
--quit", which has been corrected.
* dl/merge-autostash-rebase-quit-fix:
rebase: save autostash entry into stash reflog on --quit
In a previous commit, we made incremental graph layers read-only by
using 'git_mkstemp_mode' with permissions '0444'.
There is no reason that 'commit-graph-chain's should be modifiable by
the user, since they are generated at a temporary location and then
atomically renamed into place.
To ensure that these files are read-only, too, use
'hold_lock_file_for_update_mode' with the same read-only permission
bits, and let the umask and 'adjust_shared_perm' take care of the rest.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Non-layered commit-graphs use 'adjust_shared_perm' to make the
commit-graph file readable (or not) to a combination of the user, group,
and others.
Call 'adjust_shared_perm' for split-graph layers to make sure that these
also respect 'core.sharedRepository'. The 'commit-graph-chain' file
already respects this configuration since it uses
'hold_lock_file_for_update' (which calls 'adjust_shared_perm' eventually
in 'create_tempfile_mode').
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, Git learned 'hold_lock_file_for_update_mode' to
allow the caller to specify the permission bits (prior to further
adjustment by the umask and shared repository permissions) used when
acquiring a temporary file.
Use this in the commit-graph machinery for writing a non-split graph to
acquire an opened temporary file with permissions read-only permissions
to match the split behavior. (In the split case, Git uses
git_mkstemp_mode' for each of the commit-graph layers with permission
bits '0444').
One can notice this discrepancy when moving a non-split graph to be part
of a new chain. This causes a commit-graph chain where all layers have
read-only permission bits, except for the base layer, which is writable
for the current user.
Resolve this discrepancy by using the new
'hold_lock_file_for_update_mode' and passing the desired permission
bits.
Doing so causes some test fallout in t5318 and t6600. In t5318, this
occurs in tests that corrupt a commit-graph file by writing into it. For
these, 'chmod u+w'-ing the file beforehand resolves the issue. The
additional spot in 'corrupt_graph_verify' is necessary because of the
extra 'git commit-graph write' beforehand (which *does* rewrite the
commit-graph file). In t6600, this is caused by copying a read-only
commit-graph file into place and then trying to replace it. For these,
make these files writable.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both test_submodule_switch_recursing_with_args() and
test_submodule_forced_switch_recursing_with_args() call the internal
function test_submodule_recursing_with_args_common() with the final
argument of `--recurse-submodules`. Consolidate this duplication by
appending the argument in test_submodule_recursing_with_args_common().
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the shell scripts in this codebase, the usual style is to include a
space between the function name and the (). Add these missing spaces to
conform to the usual style of the code.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the patches for CVE-2020-11008, the ability to specify credential
settings in the config for partial URLs got lost. For example, it used
to be possible to specify a credential helper for a specific protocol:
[credential "https://"]
helper = my-https-helper
Likewise, it used to be possible to configure settings for a specific
host, e.g.:
[credential "dev.azure.com"]
useHTTPPath = true
Let's reinstate this behavior.
While at it, increase the test coverage to document and verify the
behavior with a couple other categories of partial URLs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reading a malformed credential URL line and silently ignoring it
does not mean that we support empty lines and/or "# commented" lines
forever. We should document it to avoid confusion.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Those fetching over protocol v2 from linux-next and other kernel
repositories are reporting that v2 often fetches way too much than
needed.
* jn/demote-proto2-from-default:
Revert "fetch: default to protocol version 2"
GNU/Hurd is also among the ones that need the fopen() wrapper.
* jc/gnu-hurd-lets-fread-read-dirs:
config.mak.uname: Define FREAD_READS_DIRECTORIES for GNU/Hurd
"git grep" did not quote a path with unusual character like other
commands (like "git diff", "git status") do, but did quote when run
from a subdirectory, both of which has been corrected.
* mt/grep-cquote-path:
grep: follow conventions for printing paths w/ unusual chars
The "--decorate-refs" and "--decorate-refs-exclude" options "git
log" takes have learned a companion configuration variable
log.excludeDecoration that sits at the lowest priority in the
family.
* ds/log-exclude-decoration-config:
log: add log.excludeDecoration config option
log-tree: make ref_filter_match() a helper method
"git diff-tree --pretty --notes" used to hit an assertion failure,
as it forgot to initialize the notes subsystem.
* tb/diff-tree-with-notes:
diff-tree.c: load notes machinery when required
Allowing the user to split a patch hunk while "git stash -p" does
not work well; a band-aid has been added to make this (partially)
work better.
* js/stash-p-fix:
stash -p: (partially) fix bug concerning split hunks
t3904: fix incorrect demonstration of a bug
Code in builtin/*, i.e. those can only be called from within
built-in subcommands, that implements bulk of a couple of
subcommands have been moved to libgit.a so that they could be used
by others.
* dl/libify-a-few:
Lib-ify prune-packed
Lib-ify fmt-merge-msg
"git push --atomic" used to show failures for refs that weren't
even pushed, which has been corrected.
* jx/atomic-push:
transport-helper: new method reject_atomic_push()
transport-helper: mark failure for atomic push
send-pack: mark failure of atomic push properly
t5543: never report what we do not push
send-pack: fix inconsistent porcelain output
"git diff" in a partial clone learned to avoid lazy loading blob
objects in more casese when they are not needed.
* jt/avoid-prefetch-when-able-in-diff:
diff: restrict when prefetching occurs
diff: refactor object read
diff: make diff_populate_filespec_options struct
promisor-remote: accept 0 as oid_nr in function
"git commit-graph write --expire-time=<timestamp>" did not use the
given timestamp correctly, which has been corrected.
* ds/commit-graph-expiry-fix:
commit-graph: fix buggy --expire-time option
Documentation updates around the "--recurse-submodules" option.
* dr/doc-recurse-submodules:
doc: --recurse-submodules mostly applies to active submodules
doc: be more precise on (fetch|push).recurseSubmodules
doc: explain how to deactivate submodule.recurse completely
doc: document --recurse-submodules for reset and restore
doc: list all commands affected by submodule.recurse
"git log" learns "--[no-]mailmap" as a synonym to "--[no-]use-mailmap"
* jc/log-no-mailmap:
log: give --[no-]use-mailmap a more sensible synonym --[no-]mailmap
clone: reorder --recursive/--recurse-submodules
parse-options: teach "git cmd -h" to show alias as alias
Raise the minimum required version of docbook-xsl package to 1.74,
as 1.74.0 was from late 2008, which is more than 10 years old, and
drop compatibility cruft from our documentation suite.
* ma/doc-discard-docbook-xsl-1.73:
user-manual.conf: don't specify [listingblock]
INSTALL: drop support for docbook-xsl before 1.74
manpage-normal.xsl: fold in manpage-base.xsl
manpage-bold-literal.xsl: stop using git.docbook.backslash
Doc: drop support for docbook-xsl before 1.73.0
Doc: drop support for docbook-xsl before 1.72.0
Doc: drop support for docbook-xsl before 1.71.1
The "git submodule" command did not initialize a few variables it
internally uses and was affected by variable settings leaked from
the environment.
* lx/submodule-clear-variables:
git-submodule.sh: setup uninitialized variables
The custom hash function used by "git fast-import" has been
replaced with the one from hashmap.c, which gave us a nice
performance boost.
* jk/fast-import-use-hashmap:
fast-import: replace custom hash with hashmap.c
The config API made mixed uses of int and size_t types to represent
length of various pieces of text it parsed, which has been updated
to use the correct type (i.e. size_t) throughout.
* jk/config-use-size-t:
config: reject parsing of files over INT_MAX
config: use size_t to store parsed variable baselen
git_config_parse_key(): return baselen as size_t
config: drop useless length variable in write_pair()
parse_config_key(): return subsection len as size_t
remote: drop auto-strlen behavior of make_branch() and make_rewrite()
Validation of push certificate has been made more robust against
timing attacks.
* bc/constant-memequal:
receive-pack: compilation fix
builtin/receive-pack: use constant-time comparison for HMAC value
The code that refreshes the last access and modified time of
on-disk packfiles and loose object files have been updated.
* lr/freshen-file-fix:
freshen_file(): use NULL `times' for implicit current-time
"git rebase" happens to call some hooks meant for "checkout" and
"commit" by this was not a designed behaviour than historical
accident. This has been documented.
* en/rebase-doc-hooks-called-by-accident:
git-rebase.txt: add another hook to the hooks section, and explain more
Document the recommended way to abort a failing test early (e.g. by
exiting a loop), which is to say "return 1".
* jc/doc-test-leaving-early:
t/README: suggest how to leave test early with failure
Various tests have been updated to work around issues found with
shell utilities that come with busybox etc.
* dd/test-with-busybox:
t5703: feed raw data into test-tool unpack-sideband
t4124: tweak test so that non-compliant diff(1) can also be used
t7063: drop non-POSIX argument "-ls" from find(1)
t5616: use rev-parse instead to get HEAD's object_id
t5003: skip conversion test if unzip -a is unavailable
t5003: drop the subshell in test_lazy_prereq
test-lib-functions: test_cmp: eval $GIT_TEST_CMP
t4061: use POSIX compliant regex(7)
Just like 47abd85ba0 (fetch: Strip usernames from url's before storing
them, 2009-04-17) and later 882d49ca5c (push: anonymize URL in status
output, 2016-07-13), and even later c1284b21f2 (curl: anonymize URLs
in error messages and warnings, 2019-03-04) this change anonymizes URLs
(read: strips them of user names and especially passwords) in
user-facing error messages and warnings.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a03b55530a (merge: teach --autostash option, 2020-04-07), the
--autostash option was introduced for `git merge`. Notably, when
`git merge --quit` is run with an autostash entry present, it is saved
into the stash reflog. This is contrasted with the current behaviour of
`git rebase --quit` where the autostash entry is simply just dropped out
of existence.
Adopt the behaviour of `git merge --quit` in `git rebase --quit` and
save the autostash entry into the stash reflog instead of just deleting
it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the usage for `git push` is shown, it includes the following
lines
--recurse-submodules[=(check|on-demand|no)]
control recursive pushing of submodules
which seem to indicate that the argument for --recurse-submodules is
optional. However, we cannot actually run that optiion without an
argument:
$ git push --recurse-submodules
fatal: recurse-submodules missing parameter
Unset PARSE_OPT_OPTARG so that it is clear that this option requires an
argument. Since the parse-options machinery guarantees that an argument
is present now, assume that `arg` is set in the else of
option_parse_recurse_submodules().
Reported-by: Andrew White <andrew.white@audinate.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the codebase, there are many options which use OPTION_CALLBACK in a
plain ol' struct definition. However, we have the OPT_CALLBACK and
OPT_CALLBACK_F macros which are meant to abstract these plain struct
definitions away. These macros are useful as they semantically signal to
developers that these are just normal callback option with nothing fancy
happening.
Replace plain struct definitions of OPTION_CALLBACK with OPT_CALLBACK or
OPT_CALLBACK_F where applicable. The heavy lifting was done using the
following (disgusting) shell script:
#!/bin/sh
do_replacement () {
tr '\n' '\r' |
sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\s*0,\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK(\1,\2,\3,\4,\5,\6)/g' |
sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK_F(\1,\2,\3,\4,\5,\6,\7)/g' |
tr '\r' '\n'
}
for f in $(git ls-files \*.c)
do
do_replacement <"$f" >"$f.tmp"
mv "$f.tmp" "$f"
done
The result was manually inspected and then reformatted to match the
style of the surrounding code. Finally, using
`git grep OPTION_CALLBACK \*.c`, leftover results which were not handled
by the script were manually transformed.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test added by 477dcaddb6 (tests: do not let lazy prereqs inside
`test_expect_*` turn off tracing, 2020-03-26) runs a sub-test script
that traces a test with a lazy prereq, like:
test_have_prereq LAZY && echo trace
That won't work if GIT_TEST_FAIL_PREREQS is set in the environment,
because our have_prereq will report failure, and we won't run the echo
at all.
We could work around this by avoiding the &&-chain, but we can
fix this and any future tests at once by unsetting that variable for our
sub-tests. These are meant to be controlled environments where we test
the test-suite itself; the outer test snippet should be in charge of the
sub-test environment, not whatever mode the user happens to be running
in.
Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the function process_acks() in fetch-pack.c, the variable
received_ack is meant to track that an ACK was received, but it was
never set. This results in negotiation terminating prematurely through
the in_vain counter, when the counter should have been reset upon every
ACK.
Therefore, reset the in_vain counter upon every ACK.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching, Git stops negotiation when it has sent at least
MAX_IN_VAIN (which is 256) "have" lines without having any of them
ACK-ed. But this is supposed to trigger only after the first ACK, as
pack-protocol.txt says:
However, the 256 limit *only* turns on in the canonical client
implementation if we have received at least one "ACK %s continue"
during a prior round. This helps to ensure that at least one common
ancestor is found before we give up entirely.
The code path for protocol v0 observes this, but not protocol v2,
resulting in shorter negotiation rounds but significantly larger
packfiles. Teach the code path for protocol v2 to check this criterion
only after at least one ACK was received.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
process_acks() returns 0, 1, or 2, depending on whether "ready" was
received and if not, whether at least one commit was found to be common.
Replace these magic numbers with a documented enum.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the generic parts of the source files, system headers like
<time.h> and <stdio.h> are supposed to be included indirectly
by including "git-compat-util.h", which manages portability issues.
Drop our explicit inclusions and rely on "cache.h", which includes
"git-compat-util.h".
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--root implies we want to rebase all commits since the beginning of
history. --fork-point means we want to use the reflog of the specified
upstream to find the best common ancestor between <upstream> and
<branch> and only rebase commits since that common ancestor. These
options are clearly contradictory, so throw an error (instead of
segfaulting on a NULL pointer) if both are specified.
Reported-by: Alexander Berg <alexander.berg@atos.net>
Documentation-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
46fd7b3900 ("credential: allow wildcard patterns when matching config",
2020-02-20) introduced support for matching credential helpers using
urlmatch. In doing so, it introduced code to percent-encode the paths
we get from the credential helper so that they could be effectively
matched by the urlmatch code.
Unfortunately, that code had a bug: it percent-encoded the slashes in
the path, resulting in any URL path that contained multiple levels
(i.e., a directory component) not matching.
We are currently the only caller of the percent-encoding code and could
simply change it not to encode slashes. However, we still want to
encode slashes in the username component, so we need to have both
behaviors available.
So instead, let's add a flag to control encoding slashes, which is the
behavior we want here, and use it when calling the code in this case.
Add a test for credential helper URLs using multiple slashes in the
path, which our test suite previously lacked, as well as one ensuring
that we handle usernames with slashes gracefully. Since we're testing
other percent-encoding handling, let's add one for non-ASCII UTF-8
characters as well.
Reported-by: Ilya Tretyakov <it@it3xl.ru>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apparently a recent Homebrew update now installs `gettext` into the
subdirectory /usr/local/opt/gettext/[lib/include].
Sometimes the ci job succeeds:
brew link --force gettext
Linking /usr/local/Cellar/gettext/0.20.1... 179 symlinks created
And sometimes installing the package "gettext" with force-link fails:
brew link --force gettext
Warning: Refusing to link macOS provided/shadowed software: gettext
If you need to have gettext first in your PATH run:
echo 'export PATH="/usr/local/opt/gettext/bin:$PATH"' >> ~/.bash_profile
(And the is not the final word either, since macOS itself says:
The default interactive shell is now zsh.)
Anyway, The latter requires CFLAGS to include /usr/local/opt/gettext/include
and LDFLAGS to include /usr/local/opt/gettext/lib.
Likewise, the `msgfmt` tool is no longer in the `PATH`.
While it is unclear which change is responsible for this breakage (that
most notably only occurs on CI build agents that updated very recently),
https://github.com/Homebrew/homebrew-core/pull/53489 has fixed it.
Nevertheless, let's work around this issue, as there are still quite a
few build agents out there that need some help in this regard: we
explicitly do not call `brew update` in our CI/PR builds anymore.
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use 'hold_lock_file_for_update' (and the '_timeout') variant to
acquire a lock when updating references, the commit-graph file, and so
on.
In particular, the commit-graph machinery uses this to acquire a
temporary file that is used to write a non-split commit-graph. In a
subsequent commit, an issue in the commit-graph machinery produces
graph files that have a different permission based on whether or not
they are part of a multi-layer graph will be addressed.
To do so, the commit-graph machinery will need a version of
'hold_lock_file_for_update' that takes the permission bits from the
caller.
Introduce such a function in this patch for both the
'hold_lock_file_for_update' and 'hold_lock_file_for_update_timeout'
functions, and leave the existing functions alone by inlining their
definitions in terms of the new mode variants.
Note that, like in the previous commit, 'hold_lock_file_for_update_mode'
is not guarenteed to set the given mode, since it may be modified by
both the umask and 'core.sharedRepository'.
Note also that even though the commit-graph machinery only calls
'hold_lock_file_for_update', that this is defined in terms of
'hold_lock_file_for_update_timeout', and so both need an additional mode
parameter here.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch, 'hold_lock_file_for_update' will gain an additional
'mode' parameter to specify permissions for the associated temporary
file.
Since the lockfile.c machinery uses 'create_tempfile' which always
creates a temporary file with global read-write permissions, introduce a
variant here that allows specifying the mode.
Note that the mode given to 'create_tempfile_mode' is not guaranteed to
be written to disk, since it is subject to both the umask and
'core.sharedRepository'.
Arguably, all temporary files should have permission 0444, since they
are likely to be renamed into place and then not written to again. This
is a much larger change than we may want to take on in this otherwise
small patch, so for the time being, make 'create_tempfile' behave as it
has always done by inlining it to 'create_tempfile_mode' with mode set
to '0666'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In Linux with musl libc, we have this inclusion chain:
compat/regex/regex.c:69
`-> compat/regex/regex_internal.h
`-> /usr/include/stdlib.h
`-> /usr/include/features.h
`-> /usr/include/alloca.h
In that inclusion chain, `<features.h>` claims it's _BSD_SOURCE
compatible when it's NOT asked to be either
{_POSIX,_GNU,_XOPEN,_BSD}_SOURCE, or __STRICT_ANSI__.
And, `<stdlib.h>` will include `<alloca.h>` to be compatible with
software written for GNU and BSD. Thus, redefine `alloca` macro,
which was defined before at compat/regex/regex.c:66.
Considering this is only compat code, we've taken from other project,
it's not our business to decide which source should we adhere to.
Include `<stdlib.h>` early to prevent the redefinition of alloca.
This also remove a potential warning about alloca not defined on:
#undef alloca
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't ever refer to the descriptor after mmap-ing it. And keeping it
open means we can run out of descriptors in degenerate cases (e.g.,
thousands of split chain files). Let's close it as soon as possible.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit trailers like "Thanks-to:", "Fixes:", and "Closes:" are fairly
common, but gitweb didn't highlight them like other trailers.
Signed-off-by: Emma Brooks <me@pluvano.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
356aea6 ("doc: move extensions.worktreeConfig to the right place",
2018-11-14) moved the explanation of extension.worktreeConfig from
config.txt to technical/repository-version.txt. However, the former
still contains a reference to the removed paragraph. We could fix it
referencing the gitrepository-layout man page, which contains the moved
explanation. But the git-worktree man page has additional information
and recommendations for the worktree config file, so let's reference it
instead.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the patches for CVE-2020-11008, the ability to specify credential
settings in the config for partial URLs got lost. For example, it used
to be possible to specify a credential helper for a specific protocol:
[credential "https://"]
helper = my-https-helper
Likewise, it used to be possible to configure settings for a specific
host, e.g.:
[credential "dev.azure.com"]
useHTTPPath = true
Let's reinstate this behavior.
While at it, increase the test coverage to document and verify the
behavior with a couple other categories of partial URLs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to the fixes for CVE-2020-11008, we were _very_ lenient in what we
required from a URL in order to parse it into a `struct credential`.
That led to serious vulnerabilities.
There was one call site, though, that really needed that leniency: when
parsing config settings a la `credential.dev.azure.com.useHTTPPath`.
Settings like this might be desired when users want to use, say, a given
user name on a given host, regardless of the protocol to be used.
In preparation for fixing that bug, let's refactor the code to
optionally allow for partial URLs. For the moment, this functionality is
only exposed via the now-renamed function `credential_from_url_1()`, but
it is not used. The intention is to make it easier to verify that this
commit does not change the existing behavior unless explicitly allowing
for partial URLs.
Please note that this patch does more than just reinstating a way to
imitate the behavior before those CVE-2020-11008 fixes: Before that, we
would simply ignore URLs without a protocol. In other words,
misleadingly, the following setting would be applied to _all_ URLs:
[credential "example.com"]
username = that-me
The obvious intention is to match the host name only. With this patch,
we allow precisely that: when parsing the URL with non-zero
`allow_partial_url`, we do not simply return success if there was no
protocol, but we simply leave the protocol unset and continue parsing
the URL.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to the fixes for CVE-2020-11008, we were _very_ lenient in what we
required from a URL in order to parse it into a `struct credential`.
That led to serious vulnerabilities.
There was one call site, though, that really needed that leniency: when
parsing config settings a la `credential.dev.azure.com.useHTTPPath`.
Settings like this might be desired when users want to use, say, a given
user name on a given host, regardless of the protocol to be used.
In preparation for fixing that bug, let's refactor the code to
optionally allow for partial URLs. For the moment, this functionality is
only exposed via the now-renamed function `credential_from_url_1()`, but
it is not used. The intention is to make it easier to verify that this
commit does not change the existing behavior unless explicitly allowing
for partial URLs.
Please note that this patch does more than just reinstating a way to
imitate the behavior before those CVE-2020-11008 fixes: Before that, we
would simply ignore URLs without a protocol. In other words,
misleadingly, the following setting would be applied to _all_ URLs:
[credential "example.com"]
username = that-me
The obvious intention is to match the host name only. With this patch,
we allow precisely that: when parsing the URL with non-zero
`allow_partial_url`, we do not simply return success if there was no
protocol, but we simply leave the protocol unset and continue parsing
the URL.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There was a lot going on behind the scenes when the vulnerability and
possible solutions were discussed. Grammar was not a primary focus,
that's why this slipped in.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-commit(1) says ISO-8601 is one of our supported date format.
ISO-8601 allows timestamps to have a fractional number of seconds.
We represent time only in terms of whole seconds, so we never bothered
parsing fractional seconds. However, it's better for us to parse and
throw away the fractional part than to refuse to parse the timestamp
at all.
And refusing parsing fractional second part may confuse the parse to
think fractional and timezone as day and month in this example:
2008-02-14 20:30:45.019-04:00
While doing this, make sure that we only interpret the number after the
second and the dot as fractional when and only when the date is known,
since only ISO-8601 allows the fractional part, and we've taught our
users to interpret "12:34:56.7.days.ago" as a way to specify a time
relative to current time.
Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch, we will reuse this logic, move it to a helper, now.
While we're at it, explicit states that we intentionally ignore
old-and-defective 2nd leap second.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In bd0b42aed3 (fetch-pack: do not take shallow lock unnecessarily,
2019-01-10), the author noted that 'is_repository_shallow' produces
visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'.
This is a problem for e.g., fetching with '--update-shallow' in a
shallow repository with 'fetch.writeCommitGraph' enabled, since the
update to '.git/shallow' will cause Git to think that the repository
isn't shallow when it is, thereby circumventing the commit-graph
compatibility check.
This causes problems in shallow repositories with at least shallow refs
that have at least one ancestor (since the client won't have those
objects, and therefore can't take the reachability closure over commits
when writing a commit-graph).
Address this by introducing thin wrappers over 'commit_lock_file' and
'rollback_lock_file' for use specifically when the lock is held over
'.git/shallow'. These wrappers (appropriately called
'commit_shallow_file' and 'rollback_shallow_file') call into their
respective functions in 'lockfile.h', but additionally reset validity
checks used by the shallow machinery.
Replace each instance of 'commit_lock_file' and 'rollback_lock_file'
with 'commit_shallow_file' and 'rollback_shallow_file' when the lock
being held is over the '.git/shallow' file.
As a result, 'prune_shallow' can now only be called once (since
'check_shallow_file_for_update' will die after calling
'reset_repository_shallow'). But, this is OK since we only call
'prune_shallow' at most once per process.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A number of spots in t5537 use the non-indented heredoc '<<EOF' when
they would benefit from instead using '<<-EOF' or simply
test_write_lines.
In preparation for adding new tests in a good style and being consistent
with the surrounding code, update the existing tests to improve their
readability.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack-index subsystem was not closing its file descriptor
after memory-mapping the file contents. After this mmap() succeeds,
there is no need to keep the file descriptor open. In fact, there
is signficant reason to close it so we do not run out of
descriptors.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If "test-tool bloom" is not fed a command, or if arguments are missing
for some commands, it will just segfault. Let's check argc and write a
friendlier usage message.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a layered commit-graph, the commit-graph machinery uses
'commit_graph_filenames_after' and 'commit_graph_hash_after' to keep
track of the layers in the chain that we are in the process of writing.
When the number of commit-graph layers shrinks, we initialize all
entries in the aforementioned arrays, because we know the structure of
the new commit-graph chain immediately (since there are no new layers,
there are no unknown hash values).
But when the number of commit-graph layers grows (i.e., that
'num_commit_graphs_after > num_commit_graphs_before'), then we leave
some entries in the filenames and hashes arrays as uninitialized,
because we will fill them in later as those values become available.
For instance, we rely on 'write_commit_graph_file's to store the
filename and hash of the last layer in the new chain, which is the one
that it is responsible for writing. But, it's possible that
'write_commit_graph_file' may fail, e.g., from file descriptor
exhaustion. In this case it is possible that 'git_mkstemp_mode' will
fail, and that function will return early *before* setting the values
for the last commit-graph layer's filename and hash.
This causes a number of upleasant side-effects. For instance, trying to
'free()' each entry in 'ctx->commit_graph_filenames_after' (and
similarly for the hashes array) causes us to 'free()' uninitialized
memory, since the area is allocated with 'malloc()' and is therefore
subject to contain garbage (which is left alone when
'write_commit_graph_file' returns early).
This can manifest in other issues, like a general protection fault,
and/or leaving a stray 'commit-graph-chain.lock' around after the
process dies. (The reasoning for this is still a mystery to me, since
we'd otherwise usually expect the kernel to run tempfile.c's 'atexit()'
handlers in the case of a normal death...)
To resolve this, initialize the memory with 'CALLOC_ARRAY' so that
uninitialized entries are filled with zeros, and can thus be 'free()'d
as a noop instead of causing a fault.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t1400 the prerequisite 'ULIMIT_FILE_DESCRIPTORS' is defined and used
to effectively guard the helper function 'run_with_limited_open_files'
from being used on systems that do not satisfy this prerequisite.
In the subsequent patch, we will introduce another test outside of t1400
that would benefit from using this prerequisite. So, move it to
'test-lib.sh' instead so that it can be used by multiple tests.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a commit-graph layer, we do so in a temporary file which is
renamed into place. If we fail to create a temporary file, for e.g.,
because we have too many open files, then 'git_mkstemp_mode' sets the
pattern to the empty string, in which case we get an error something
along the lines of:
error: unable to create ''
It's not useful to show the pattern here at all, since we (1) know the
pattern is well-formed, and (2) would have already shown the dirname
when trying to create the leading directories. So, replace this error
with something friendlier.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't use the "parent" parameter at all (probably because the bloom
filter for a commit is always defined against a single parent anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Looking at the value of %(push:remoteref) only handles the case when an
explicit push refspec is passed. But it does not handle the fallback
cases of looking at the configuration value of `push.default`.
In particular, doing something like
git config push.default current
git for-each-ref --format='%(push)'
git for-each-ref --format='%(push:remoteref)'
prints a useful tracking ref for the first for-each-ref, but an empty
string for the second.
Since the intention of %(push:remoteref), from 9700fae5ee (for-each-ref:
let upstream/push report the remote ref name) is to get exactly which
branch `git push` will push to, even in the fallback cases, fix this.
To get the meaning of %(push:remoteref), `ref-filter.c` calls
`remote_ref_for_branch`. We simply add a new static helper function,
`branch_get_push_remoteref` that follows the logic of
`branch_get_push_1`, and call it from `remote_ref_for_branch`.
We also update t/6300-for-each-ref.sh to handle all `push.default`
strategies. This involves testing `push.default=simple` twice, once
where there is a matching upstream branch and once when there is none.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function is_date, confusingly also set tm_year. tm_mon, and tm_mday
after validating input.
Rename to set_date to reflect its real usage.
Also, change return value is 0 on success and -1 on failure following
our convention on function do some real work.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reimplement the `bisect_state()` shell functions in C and also add a
subcommand `--bisect-state` to `git-bisect--helper` to call them from
git-bisect.sh .
Using `--bisect-state` subcommand is a temporary measure to port shell
function to C so as to use the existing test suite. As more functions
are ported, this subcommand will be retired and will be called by some
other methods.
`bisect_head()` is only called from `bisect_state()`, thus it is not
required to introduce another subcommand.
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the subcommand to `git bisect--helper` and call it from
git-bisect.sh.
With the conversion of `bisect_auto_next()` from shell to C in a
previous commit, `bisect_start()` can now be fully ported to C.
So let's complete the `--bisect-start` subcommand of
`git bisect--helper` so that it fully implements `bisect_start()`,
and let's use this subcommand in `git-bisect.sh` instead of
`bisect_start()`.
This removes the signal handling we had in `bisect_start()` as we
don't really need it. While at it, "trap" is changed to "handle".
As "trap" is a reference to the shell "trap" builtin, which isn't
used anymore.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reimplement the `bisect_next()` and the `bisect_auto_next()` shell functions
in C and add the subcommands to `git bisect--helper` to call them from
git-bisect.sh .
bisect_auto_next() function returns an enum bisect_error type as whole
`git bisect` can exit with an error code when bisect_next() does.
Using `--bisect-next` and `--bisect-auto-next` subcommands is a
temporary measure to port shell function to C so as to use the existing
test suite. As more functions are ported, `--bisect-auto-next`
subcommand will be retired and will be called by some other methods.
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reimplement the `bisect_autostart()` shell function in C and add the
C implementation from `bisect_next()` which was previously left
uncovered. Also add a subcommand `--bisect-autostart` to
`git bisect--helper` be called from `bisect_state()` from
git-bisect.sh .
Using `--bisect-autostart` subcommand is a temporary measure to port
shell function to C so as to use the existing test suite. As more
functions are ported, this subcommand will be retired and
bisect_autostart() will be called directly by `bisect_state()`.
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's refactor code adding a new `write_in_file()` function
that opens a file for writing a message and closes it.
This helper will be used in later steps and makes the code
simpler and easier to understand.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Following 'enum bisect_error' vocabulary, return variable 'res' is
always non-positive.
Let's use '-res' instead of 'abs(res)' to make the code clearer.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a `cmd_*()` function, return `error()` cannot be used
because that translates to `-1` and `cmd_*()` functions need
to return exit codes.
Let's fix switch default return.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're passing buffer from strbuf to reencode_string,
which will call strlen(3) on that buffer,
and discard the length of newly created buffer.
Then, we compute the length of the return buffer to attach to strbuf.
During this process, we introduce a discrimination between mail
originally written in utf-8 and other encoding.
* if the email was written in utf-8, we leave it as is. If there is
a NUL character in that line, we complains loudly:
error: a NUL byte in commit log message not allowed.
* if the email was written in other encoding, we truncate the data as
the NUL character in that line, then we used the truncated line for
the metadata.
We can do better by reusing all the available information,
and call the underlying lower level function that will be called
indirectly by reencode_string. By doing this, we will also postpone
the NUL character processing to the commit step, which will
complains about the faulty metadata.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we are at it, make sure we run a clean up after testing.
In a later patch, we will test for more corrupted patch.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parsing of URL for the credential helper has been corrected.
* jk/credential-parsing-end-of-host-in-URL:
credential: treat "?" and "#" in URLs as end of host
Allow "git rebase" to reapply all local commits, even if the may be
already in the upstream, without checking first.
* jt/rebase-allow-duplicate:
rebase --merge: optionally skip upstreamed commits
"git rebase" (again) learns to honor "--no-keep-empty", which lets
the user to discard commits that are empty from the beginning (as
opposed to the ones that become empty because of rebasing). The
interactive rebase also marks commits that are empty in the todo.
* en/rebase-no-keep-empty:
rebase: fix an incompatible-options error message
rebase: reinstate --no-keep-empty
rebase -i: mark commits that begin empty in todo editor
A Windows-specific test element has been made more robust against
misuse from both user's environment and programmer's errors.
* js/mingw-is-hidden-test-fix:
t: restrict `is_hidden` to be called only on Windows
mingw: make test_path_is_hidden more robust
t: consolidate the `is_hidden` functions
The interactive input from various codepaths are consolidated and
any prompt possibly issued earlier are fflush()ed before we read.
* js/flush-prompt-before-interative-input:
interactive: explicitly `fflush` stdout before expecting input
interactive: refactor code asking the user for interactive input
"git log" learned "--show-pulls" that helps pathspec limited
history views; a merge commit that takes the whole change from a
side branch, which is normally omitted from the output, is shown
in addition to the commits that introduce real changes.
* ds/revision-show-pulls:
revision: --show-pulls adds helpful merges
Misc fixes for Windows.
* js/mingw-fixes:
mingw: help debugging by optionally executing bash with strace
mingw: do not treat `COM0` as a reserved file name
mingw: use modern strftime implementation if possible
We've left the command line parsing of "git log :/a/b/" broken for
about a full year without anybody noticing, which has been
corrected.
* jc/missing-ref-store-fix:
repository: mark the "refs" pointer as private
sha1-name: do not assume that the ref store is initialized
The output from "git format-patch" uses RFC 2047 encoding for
non-ASCII letters on From: and Subject: headers, so that it can
directly be fed to e-mail programs. A new option has been added
to produce these headers in raw.
* eb/format-patch-no-encode-headers:
format-patch: teach --no-encode-email-headers
The more aggressive updates to remote-tracking branches we had for
the past 7 years or so were not reflected in the documentation,
which has been corrected.
* pb/pull-fetch-doc:
pull doc: correct outdated description of an example
pull doc: refer to a specific section in 'fetch' doc
"git rebase" learned the "--no-gpg-sign" option to countermand
commit.gpgSign the user may have.
* dd/no-gpg-sign:
Documentation: document merge option --no-gpg-sign
Documentation: merge commit-tree --[no-]gpg-sign
Documentation: reword commit --no-gpg-sign
Documentation: document am --no-gpg-sign
cherry-pick/revert: honour --no-gpg-sign in all case
rebase.c: honour --no-gpg-sign
The logic to auto-follow tags by "git clone --single-branch" was
not careful to avoid lazy-fetching unnecessary tags, which has been
corrected.
* jk/use-quick-lookup-in-clone-for-tag-following:
clone: use "quick" lookup while following tags
"git rebase" with the merge backend did not work well when the
rebase.abbreviateCommands configuration was set.
* ag/rebase-merge-allow-ff-under-abbrev-command:
t3432: test `--merge' with `rebase.abbreviateCommands = true', too
sequencer: don't abbreviate a command if it doesn't have a short form
Code cleanup.
* jk/oid-array-cleanups:
oidset: stop referring to sha1-array
ref-filter: stop referring to "sha1 array"
bisect: stop referring to sha1_array
test-tool: rename sha1-array to oid-array
oid_array: rename source file from sha1-array
oid_array: use size_t for iteration
oid_array: use size_t for count and allocation
"git pull --rebase" tried to run a rebase even after noticing that
the pull results in a fast-forward and no rebase is needed nor
sensible, for the past few years due to a mistake nobody noticed.
* en/pull-do-not-rebase-after-fast-forwarding:
pull: avoid running both merge and rebase
"git pull" shares many options with underlying "git fetch", but
some of them were not documented and some of those that would make
sense to pass down were not passed down.
* rs/pull-options-sync-code-and-doc:
pull: pass documented fetch options on
pull: remove --update-head-ok from documentation
The server-end of the v2 protocol to serve "git clone" and "git
fetch" was not prepared to see a delim packets at unexpected
places, which led to a crash.
* jk/harden-protocol-v2-delim-handling:
test-lib-functions: simplify packetize() stdin code
upload-pack: handle unexpected delim packets
test-lib-functions: make packetize() more efficient
Utitiles run via the run_command() API were not spawned correctly
on Cygwin, when the paths to them are given as a full path with
backslashes.
* ak/run-command-on-cygwin-fix:
run-command: trigger PATH lookup properly on Cygwin
When fed a midx that records no objects, some codepaths tried to
loop from 0 through (num_objects-1), which, due to integer
arithmetic wrapping around, made it nonsense operation with out of
bounds array accesses. The code has been corrected to reject such
an midx file.
* dr/midx-avoid-int-underflow:
midx.c: fix an integer underflow
Simplify the commit ancestry connectedness check in a partial clone
repository in which "promised" objects are assumed to be obtainable
lazily on-demand from promisor remote repositories.
* jt/connectivity-check-optim-in-partial-clone:
connected: always use partial clone optimization
"git p4" learned four new hooks and also "--no-verify" option to
bypass them (and the existing "p4-pre-submit" hook).
* bk/p4-pre-edit-changelist:
git-p4: add RCS keyword status message
git-p4: add p4 submit hooks
git-p4: restructure code in submit
git-p4: add --no-verify option
git-p4: add p4-pre-submit exit text
git-p4: create new function run_git_hook
git-p4: rewrite prompt to be Windows compatible
The import-tars importer (in contrib/fast-import/) used to create
phony files at the top-level of the repository when the archive
contains global PAX headers, which made its own logic to detect and
omit the common leading directory ineffective, which has been
corrected.
* js/import-tars-do-not-make-phony-files-from-pax-headers:
import-tars: ignore the global PAX header
Enable tests that require GnuPG on Windows.
* js/tests-gpg-integration-on-windows:
tests: increase the verbosity of the GPG-related prereqs
tests: turn GPG, GPGSM and RFC1991 into lazy prereqs
tests: do not let lazy prereqs inside `test_expect_*` turn off tracing
t/lib-gpg.sh: stop pretending to be a stand-alone script
tests(gpg): allow the gpg-agent to start on Windows
This reverts commit 684ceae32d.
Users fetching from linux-next and other kernel remotes are reporting
that the limited ref advertisement causes negotiation to reach
MAX_IN_VAIN, resulting in too-large fetches.
Reported-by: Lubomir Rintel <lkundrak@v3.sk>
Reported-by: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GNU/Hurd is another platform that behaves like this. Set it to
UnfortunatelyYes so that config directory files are correctly processed.
This fixes the corresponding 'proper error on directory "files"' test in
t1308-config-set.sh.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not use C99 "for loop initial declaration" in our codebase
(yet), but one snuck in.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For some asynchronous operations, we build a chain of callbacks to
execute when the operation is done. These callbacks are held in $after,
and a new callback can be added by appending to $after. Once the
operation is done, $after is executed as a script.
But if we don't append a semi-colon after the procedure calls, they will
appear to Tcl as arguments to the previous procedure's arguments. So,
for example, if $after is "foo", and we just append "bar", then $after
becomes "foo bar", and bar will be treated as an argument to foo. If foo
does not accept any optional arguments, it would result in Tcl throwing
an error. If instead we do append a semi-colon, $after will look like
"foo;bar;", and these will be treated as two separate procedure calls.
Before d9c6469 (git-gui: update status bar to track operations,
2019-12-01), this problem was masked because ui_ready/ui_status did
accept an optional argument. In d9c6469, ui_ready stopped accepting an
optional argument, and this error started showing up.
Another instance of this problem is when a call to ui_status without a
trailing semicolon. ui_status never accepted an optional argument to
begin with, but the issue never managed to surface.
So, fix these errors by making sure we always append a semi-colon after
procedure calls when multiple callbacks are involved in $after.
Helped-by: Pratyush Yadav <me@yadavpratyush.com>
Signed-off-by: Ansgar Röber <ansgar.roeber@rwth-aachen.de>
In the example by Jon Loeliger the selector 'A^2' was duplicated. This
might confuse readers.
Signed-off-by: Michael F. Schönitzer <michael@schoenitzer.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since its introduction in 7249e91 (revision.c: support --notes
command-line option, 2011-03-29), combining '--notes' with any option
that causes us to format notes (e.g., '--pretty', '--format="%N"', etc)
results in a failed assertion at runtime.
$ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes
commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed.
Aborted
This failure is due to diff-tree not calling 'load_display_notes' to
initialize the notes machinery.
Ordinarily, this failure isn't triggered, because it requires passing
both '--notes' and another of the above mentioned options. In the case
of '--pretty', for example, we set 'opt->verbose_header', causing
'show_log()' to eventually call 'format_display_notes()', which expects
a non-NULL 'display_note_trees'.
Without initializing the notes machinery, 'display_note_trees' remains
NULL, and thus triggers an assertion failure.
Fix this by initializing the notes machinery after parsing our options,
and harden this behavior against regression with a test in t4013. (Note
that the added ref in this test requires updating two unrelated tests
which use 'log --all', and thus need to learn about the new refs).
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We were using `test_must_fail p4` to test that the p4 command failed as
expected. However, test_must_fail() is used to ensure that commands fail
in an expected way, not due to something like a segv. Since we are not
in the business of verifying the sanity of the external world, replace
`test_must_fail p4` with `! p4` and assume that the `p4` command does
not die unexpectedly.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `test_must_fail` function should only be used for git commands;
we are not in the business of catching segmentation fault by external
commands. Shell helper functions test_cmp and svn_cmd used in this
script are wrappers around external commands, so just use `! cmd`
instead of `test_must_fail cmd`
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() function should only be used for git commands since
we assume that external commands work sanely. Since, not only should
this file not exist, but there shouldn't exit _any_ filesystem entity in
these paths, replace `test_must_fail test -f` with
`test_path_is_missing`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() function should only be used for git commands since
we assume that external commands work sanely. Since, not only should
these directories not exist, but there shouldn't exist _any_ filesystem
entity in these paths, replace `test_must_fail test -d` with
`test_path_is_missing`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail function should only be used for git commands since
we assume that external commands work sanely. Since test_cmp() just
wraps an external command, replace `test_must_fail test_cmp` with
`! test_cmp`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to t/README, test_must_fail() should only be used to test for
failure in Git commands.
Replace the invocation of `test_must_fail test_path_is_file` with
`test_path_is_missing` since, in this test case, the path should not
exist at all.
In all the cases where `test_must_fail test_alternate_is_used` appears,
test_alternate_is_used() fails because test_line_count() cannot open the
non-existent $alternates_file. Replace
`test_must_fail test_alternate_is_used` with `test_path_is_missing` to
test for this directly.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() function should only be used for git commands since
we should assume that external commands work sanely. Replace
`test_must_fail test -e` with `test_path_is_missing`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
grep does not follow the conventions used by other Git commands when
printing paths that contain unusual characters (as double-quotes or
newlines). Commands such as ls-files, commit, status and diff will:
- Quote and escape unusual pathnames, by default.
- Print names verbatim and unquoted when "-z" is used.
But grep *never* quotes/escapes absolute paths with unusual chars and
*always* quotes/escapes relative ones, even with "-z". Besides being
inconsistent in its own output, the deviation from other Git commands
can be confusing. So let's make it follow the two rules above and add
some tests for this new behavior. Note that, making grep quote/escape
all unusual paths by default, also make it fully compliant with the
core.quotePath configuration, which is currently ignored for absolute
paths.
Reported-by: Greg Hurrell <greg@hurrell.net>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's URL parser interprets
https:///example.com/repo.git
to have no host and a path of "example.com/repo.git". Curl, on the
other hand, internally redirects it to https://example.com/repo.git. As
a result, until "credential: parse URL without host as empty host, not
unset", tricking a user into fetching from such a URL would cause Git to
send credentials for another host to example.com.
Teach fsck to block and detect .gitmodules files using such a URL to
prevent sharing them with Git versions that are not yet protected.
A relative URL in a .gitmodules file could also be used to trigger this.
The relative URL resolver used for .gitmodules does not normalize
sequences of slashes and can follow ".." components out of the path part
and to the host part of a URL, meaning that such a relative URL can be
used to traverse from a https://foo.example.com/innocent superproject to
a https:///attacker.example.com/exploit submodule. Fortunately,
redundant extra slashes in .gitmodules are rare, so we can catch this by
detecting one after a leading sequence of "./" and "../" components.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Until "credential: refuse to operate when missing host or protocol",
Git's credential handling code interpreted URLs with empty scheme to
mean "give me credentials matching this host for any protocol".
Luckily libcurl does not recognize such URLs (it tries to look for a
protocol named "" and fails). Just in case that changes, let's reject
them within Git as well. This way, credential_from_url is guaranteed to
always produce a "struct credential" with protocol and host set.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
libcurl permits making requests without a URL scheme specified. In
this case, it guesses the URL from the hostname, so I can run
git ls-remote http::ftp.example.com/path/to/repo
and it would make an FTP request.
Any user intentionally using such a URL is likely to have made a typo.
Unfortunately, credential_from_url is not able to determine the host and
protocol in order to determine appropriate credentials to send, and
until "credential: refuse to operate when missing host or protocol",
this resulted in another host's credentials being leaked to the named
host.
Teach credential_from_url_gently to consider such a URL to be invalid
so that fsck can detect and block gitmodules files with such URLs,
allowing server operators to avoid serving them to downstream users
running older versions of Git.
This also means that when such URLs are passed on the command line, Git
will print a clearer error so affected users can switch to the simpler
URL that explicitly specifies the host and protocol they intend.
One subtlety: .gitmodules files can contain relative URLs, representing
a URL relative to the URL they were cloned from. The relative URL
resolver used for .gitmodules can follow ".." components out of the path
part and past the host part of a URL, meaning that such a relative URL
can be used to traverse from a https://foo.example.com/innocent
superproject to a https::attacker.example.com/exploit submodule.
Fortunately a leading ':' in the first path component after a series of
leading './' and '../' components is unlikely to show up in other
contexts, so we can catch this by detecting that pattern.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
When we try to initialize credential loading by URL and find that the
URL is invalid, we set all fields to NULL in order to avoid acting on
malicious input. Later when we request credentials, we diagonse the
erroneous input:
fatal: refusing to work with credential missing host field
This is problematic in two ways:
- The message doesn't tell the user *why* we are missing the host
field, so they can't tell from this message alone how to recover.
There can be intervening messages after the original warning of
bad input, so the user may not have the context to put two and two
together.
- The error only occurs when we actually need to get a credential. If
the URL permits anonymous access, the only encouragement the user gets
to correct their bogus URL is a quiet warning.
This is inconsistent with the check we perform in fsck, where any use
of such a URL as a submodule is an error.
When we see such a bogus URL, let's not try to be nice and continue
without helpers. Instead, die() immediately. This is simpler and
obviously safe. And there's very little chance of disrupting a normal
workflow.
It's _possible_ that somebody has a legitimate URL with a raw newline in
it. It already wouldn't work with credential helpers, so this patch
steps that up from an inconvenience to "we will refuse to work with it
at all". If such a case does exist, we should figure out a way to work
with it (especially if the newline is only in the path component, which
we normally don't even pass to helpers). But until we see a real report,
we're better off being defensive.
Reported-by: Carlo Arenas <carenas@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
In 07259e74ec (fsck: detect gitmodules URLs with embedded newlines,
2020-03-11), git fsck learned to check whether URLs in .gitmodules could
be understood by the credential machinery when they are handled by
git-remote-curl.
However, the check is overbroad: it checks all URLs instead of only
URLs that would be passed to git-remote-curl. In principle a git:// or
file:/// URL does not need to follow the same conventions as an http://
URL; in particular, git:// and file:// protocols are not succeptible to
issues in the credential API because they do not support attaching
credentials.
In the HTTP case, the URL in .gitmodules does not always match the URL
that would be passed to git-remote-curl and the credential machinery:
Git's URL syntax allows specifying a remote helper followed by a "::"
delimiter and a URL to be passed to it, so that
git ls-remote http::https://example.com/repo.git
invokes git-remote-http with https://example.com/repo.git as its URL
argument. With today's checks, that distinction does not make a
difference, but for a check we are about to introduce (for empty URL
schemes) it will matter.
.gitmodules files also support relative URLs. To ensure coverage for the
https based embedded-newline attack, urldecode and check them directly
for embedded newlines.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
The credential helper protocol was designed to be very flexible: the
fields it takes as input are treated as a pattern, and any missing
fields are taken as wildcards. This allows unusual things like:
echo protocol=https | git credential reject
to delete all stored https credentials (assuming the helpers themselves
treat the input that way). But when helpers are invoked automatically by
Git, this flexibility works against us. If for whatever reason we don't
have a "host" field, then we'd match _any_ host. When you're filling a
credential to send to a remote server, this is almost certainly not what
you want.
Prevent this at the layer that writes to the credential helper. Add a
check to the credential API that the host and protocol are always passed
in, and add an assertion to the credential_write function that speaks
credential helper protocol to be doubly sure.
There are a few ways this can be triggered in practice:
- the "git credential" command passes along arbitrary credential
parameters it reads from stdin.
- until the previous patch, when the host field of a URL is empty, we
would leave it unset (rather than setting it to the empty string)
- a URL like "example.com/foo.git" is treated by curl as if "http://"
was present, but our parser sees it as a non-URL and leaves all
fields unset
- the recent fix for URLs with embedded newlines blanks the URL but
otherwise continues. Rather than having the desired effect of
looking up no credential at all, many helpers will return _any_
credential
Our earlier test for an embedded newline didn't catch this because it
only checked that the credential was cleared, but didn't configure an
actual helper. Configuring the "verbatim" helper in the test would show
that it is invoked (it's obviously a silly helper which doesn't look at
its input, but the point is that it shouldn't be run at all). Since
we're switching this case to die(), we don't need to bother with a
helper. We can see the new behavior just by checking that the operation
fails.
We'll add new tests covering partial input as well (these can be
triggered through various means with url-parsing, but it's simpler to
just check them directly, as we know we are covered even if the url
parser changes behavior in the future).
[jn: changed to die() instead of logging and showing a manual
username/password prompt]
Reported-by: Carlo Arenas <carenas@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
We may feed a URL like "cert:///path/to/cert.pem" into the credential
machinery to get the key for a client-side certificate. That
credential has no hostname field, which is about to be disallowed (to
avoid confusion with protocols where a helper _would_ expect a
hostname).
This means as of the next patch, credential helpers won't work for
unlocking certs. Let's fix that by doing two things:
- when we parse a url with an empty host, set the host field to the
empty string (asking only to match stored entries with an empty
host) rather than NULL (asking to match _any_ host).
- when we build a cert:// credential by hand, similarly assign an
empty string
It's the latter that is more likely to impact real users in practice,
since it's what's used for http connections. But we don't have good
infrastructure to test it.
The url-parsing version will help anybody using git-credential in a
script, and is easy to test.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Many of the tests in t0300 give partial inputs to git-credential,
omitting a protocol or hostname. We're checking only high-level things
like whether and how helpers are invoked at all, and we don't care about
specific hosts. However, in preparation for tightening up the rules
about when we're willing to run a helper, let's start using input that's
a bit more realistic: pretend as if http://example.com is being
examined.
This shouldn't change the point of any of the tests, but do note we have
to adjust the expected output to accommodate this (filling a credential
will repeat back the protocol/host fields to stdout, and the helper
debug messages and askpass prompt will change on stderr).
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
We test a toy credential helper that writes "quit=1" and confirms that
we stop running other helpers. However, that helper is unrealistic in
that it does not bother to read its stdin at all.
For now we don't send any input to it, because we feed git-credential a
blank credential. But that will change in the next patch, which will
cause this test to racily fail, as git-credential will get SIGPIPE
writing to the helper rather than exiting because it was asked to.
Let's make this one-off helper more like our other sample helpers, and
have it source the "dump" script. That will read stdin, fixing the
SIGPIPE problem. But it will also write what it sees to stderr. We can
make the test more robust by checking that output, which confirms that
we do run the quit helper, don't run any other helpers, and exit for the
reason we expected.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Add new method in transport-helper to reject all references if any
reference is failed for atomic push.
This method is reused in "send-pack.c" and "transport-helper.c", one for
SSH, git and file protocols, and the other for HTTP protocol.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit v2.22.0-1-g3bca1e7f9f (transport-helper: enforce atomic in
push_refs_with_push, 2019-07-11) noticed the incomplete report of
failure of an atomic push for HTTP protocol. But the implementation
has a flaw that mark all remote references as failure.
Only mark necessary references as failure in `push_refs_with_push()` of
transport-helper.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing with SSH or other smart protocol, references are validated
by function `check_to_send_update()` before they are sent in commands
to `send_pack()` of "receve-pack". For atomic push, if a reference is
rejected after the validation, only references pushed by user should be
marked as failure, instead of report failure on all remote references.
Commit v2.22.0-1-g3bca1e7f9f (transport-helper: enforce atomic in
push_refs_with_push, 2019-07-11) wanted to fix report issue of HTTP
protocol, but marked all remote references failure for atomic push.
In order to fix the issue of status report for SSH or other built-in
smart protocol, revert part of that commit and add additional status
for function `atomic_push_failure()`. The additional status for it
except the "REF_STATUS_EXPECTING_REPORT" status are:
- REF_STATUS_NONE : Not marked as "REF_STATUS_EXPECTING_REPORT" yet.
- REF_STATUS_OK : Assume OK for dryrun or status_report is disabled.
This fix won't resolve the issue of status report in transport-helper
for HTTP or other protocols, and breaks test case in t5541. Will fix
it in additional commit.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we push some references to the git server, we expect git to report
the status of the references we are pushing; no more, no less. But when
pusing with atomic mode, if some references cannot be pushed, Git reports
the reject message on all references in the remote repository.
Add new test cases in t5543, and fix them in latter commit.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The porcelain output of a failed `git-push` command is inconsistent for
different protocols. For example, the following `git-push` command
may fail due to the failure of the `pre-receive` hook.
git push --porcelain origin HEAD:refs/heads/master
For SSH protocol, the porcelain output does not end with a "Done"
message:
To <URL/of/upstream.git>
! HEAD:refs/heads/master [remote rejected] (pre-receive hook declined)
While for HTTP protocol, the porcelain output does end with a "Done"
message:
To <URL/of/upstream.git>
! HEAD:refs/heads/master [remote rejected] (pre-receive hook declined)
Done
The following code at the end of function `send_pack()` indicates that
`send_pack()` should not return an error if some references are rejected
in porcelain mode.
int send_pack(...)
... ...
if (args->porcelain)
return 0;
for (ref = remote_refs; ref; ref = ref->next) {
switch (ref->status) {
case REF_STATUS_NONE:
case REF_STATUS_UPTODATE:
case REF_STATUS_OK:
break;
default:
return -1;
}
}
return 0;
}
So if atomic push failed, must check the porcelain mode before return
an error. And `receive_status()` should not return an error for a
failed updated reference, because `send_pack()` will check them instead.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add missing 'restore' and 'switch' sub commands to zsh completion
candidate output. E.g.
$ git re<tab>
rebase -- forward-port local commits to the updated upstream head
reset -- reset current HEAD to the specified state
restore -- restore working tree files
$ git s<tab>
show -- show various types of objects
status -- show the working tree status
switch -- switch branches
Signed-off-by: Terry Moschou <tmoschou@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changed-path Bloom filters help reduce the amount of tree
parsing required during history queries. Before calculating a
diff, we can ask the filter if a path changed between a commit
and its first parent. If the filter says "no" then we can move
on without parsing trees. If the filter says "maybe" then we
parse trees to discover if the answer is actually "yes" or "no".
When computing a blame, there is a section in find_origin() that
computes a diff between a commit and one of its parents. When this
is the first parent, we can check the Bloom filters before calling
diff_tree_oid().
In order to make this work with the blame machinery, we need to
initialize a struct bloom_key with the initial path. But also, we
need to add more keys to a list if a rename is detected. We then
check to see if _any_ of these keys answer "maybe" in the diff.
During development, I purposefully left out this "add a new key
when a rename is detected" to see if the test suite would catch
my error. That is how I discovered the issues with
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS from the previous change.
With that change, we can feel some confidence in the coverage of
this change.
If a user requests copy detection using "git blame -C", then there
are more places where the set of "important" files can expand. I
do not know enough about how this happens in the blame machinery.
Thus, the Bloom filter integration is explicitly disabled in this
mode. A later change could expand the bloom_key data with an
appropriate call (or calls) to add_bloom_key().
If we did not disable this mode, then the following tests would
fail:
t8003-blame-corner-cases.sh
t8011-blame-split-file.sh
Generally, this is a performance enhancement and should not
change the behavior of 'git blame' in any way. If a repo has a
commit-graph file with computed changed-path Bloom filters, then
they should notice improved performance for their 'git blame'
commands.
Here are some example timings that I found by blaming some paths
in the Linux kernel repository:
git blame arch/x86/kernel/topology.c >/dev/null
Before: 0.83s
After: 0.24s
git blame kernel/time/time.c >/dev/null
Before: 0.72s
After: 0.24s
git blame tools/perf/ui/stdio/hist.c >/dev/null
Before: 0.27s
After: 0.11s
I specifically looked for "deep" paths that were also edited many
times. As a counterpoint, the MAINTAINERS file was edited many
times but is located in the root tree. This means that the cost of
computing a diff relative to the pathspec is very small. Here are
the timings for that command:
git blame MAINTAINERS >/dev/null
Before: 20.1s
After: 18.0s
These timings are the best of five. The worst-case runs were on the
order of 2.5 minutes for both cases. Note that the MAINTAINERS file
has 18,740 lines across 17,000+ commits. This happens to be one of
the cases where this change provides the least improvement.
The lack of improvement for the MAINTAINERS file and the relatively
modest improvement for the other examples can be easily explained.
The blame machinery needs to compute line-level diffs to determine
which lines were changed by each commit. That makes up a large
proportion of the computation time, and this change does not
attempt to improve on that section of the algorithm. The
MAINTAINERS file is large and changed often, so it takes time to
determine which lines were updated by which commit. In contrast,
the code files are much smaller, and it takes longer to comute
the line-by-line diff for a single patch on the Linux mailing
lists.
Outside of the "-C" integration, I believe there is little more to
gain from the changed-path Bloom filters for 'git blame' after this
patch.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_TEST_COMMIT_GRAPH environment variable updates the commit-
graph file whenever "git commit" is run, ensuring that we always
have an updated commit-graph throughout the test suite. The
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS environment variable was
introduced to write the changed-path Bloom filters whenever "git
commit-graph write" is run. However, the GIT_TEST_COMMIT_GRAPH
trick doesn't launch a separate process and instead writes it
directly.
To expand the number of tests that have commits in the commit-graph
file, add a helper method that computes the commit-graph and place
that helper inside "git commit" and "git merge".
In the helper method, check GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS
to ensure we are writing changed-path Bloom filters whenever
possible.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changed-path Bloom filters work only when we can compute an
explicit Bloom filter key in advance. When a pathspec is given
that allows case-insensitive checks or wildcard matching, we
must disable the Bloom filter performance checks.
By checking the pathspec in prepare_to_use_bloom_filters(), we
avoid setting up the Bloom filter data and thus revert to the
usual logic.
Before this change, the following tests would fail*:
t6004-rev-list-path-optim.sh (Tests 6-7)
t6130-pathspec-noglob.sh (Tests 3-6)
t6131-pathspec-icase.sh (Tests 3-5)
*These tests would fail when using GIT_TEST_COMMIT_GRAPH and
GIT_TEST_COMMIT_GRAPH_BLOOM_FILTERS except that the latter
environment variable was not set up correctly to write the changed-
path Bloom filters in the test suite. That will be fixed in the
next change.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To help pinpoint the source of a regression, it is useful to know some
info about the compiler which the user's Git client was built with. By
adding a generic get_compiler_info() in 'compat/' we can choose which
relevant information to share per compiler; to get started, let's
demonstrate the version of glibc if the user built with 'gcc'.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The contents of uname() can give us some insight into what sort of
system the user is running on, and help us replicate their setup if need
be. The domainname field is not guaranteed to be available, so don't
collect it.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Knowing which version of Git a user has and how it was built allows us
to more precisely pin down the circumstances when a certain issue
occurs, so teach bugreport how to tell us the same output as 'git
version --build-options'.
It's not ideal to directly call 'git version --build-options' because
that output goes to stdout. Instead, wrap the version string in a helper
within help.[ch] library, and call that helper from within the bugreport
library.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach Git how to prompt the user for a good bug report: reproduction
steps, expected behavior, and actual behavior. Later, Git can learn how
to collect some diagnostic information from the repository.
If users can send us a well-written bug report which contains diagnostic
information we would otherwise need to ask the user for, we can reduce
the number of question-and-answer round trips between the reporter and
the Git contributor.
Users may also wish to send a report like this to their local "Git
expert" if they have put their repository into a state they are confused
by.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Starting in 3ac68a93fd, help.o began to depend on builtin/branch.o,
builtin/clean.o, and builtin/config.o. This meant that help.o was
unusable outside of the context of the main Git executable.
To make help.o usable by other commands again, move list_config_help()
into builtin/help.c (where it makes sense to assume other builtin libraries
are present).
When command-list.h is included but a member is not used, we start to
hear a compiler warning. Since the config list is generated in a fairly
different way than the command list, and since commands and config
options are semantically different, move the config list into its own
header and move the generator into its own script and build rule.
For reasons explained in 976aaedc (msvc: add a Makefile target to
pre-generate the Visual Studio solution, 2019-07-29), some build
artifacts we consider non-source files cannot be generated in the
Visual Studio environment, and we already have some Makefile tweaks
to help Visual Studio to use generated command-list.h header file.
Do the same to a new generated file, config-list.h, introduced by
this change.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
In 'git log', the --decorate-refs-exclude option appends a pattern
to a string_list. This list is used to prevent showing some refs
in the decoration output, or even by --simplify-by-decoration.
Users may want to use their refs space to store utility refs that
should not appear in the decoration output. For example, Scalar [1]
runs a background fetch but places the "new" refs inside the
refs/scalar/hidden/<remote>/* refspace instead of refs/<remote>/*
to avoid updating remote refs when the user is not looking. However,
these "hidden" refs appear during regular 'git log' queries.
A similar idea to use "hidden" refs is under consideration for core
Git [2].
Add the 'log.excludeDecoration' config option so users can exclude
some refs from decorations by default instead of needing to use
--decorate-refs-exclude manually. The config value is multi-valued
much like the command-line option. The documentation is careful to
point out that the config value can be overridden by the
--decorate-refs option, even though --decorate-refs-exclude would
always "win" over --decorate-refs.
Since the 'log.excludeDecoration' takes lower precedence to
--decorate-refs, and --decorate-refs-exclude takes higher
precedence, the struct decoration_filter needed another field.
This led also to new logic in load_ref_decorations() and
ref_filter_match().
There are several tests in t4202-log.sh that test the
--decorate-refs-(include|exclude) options, so these are extended.
Since the expected output is already stored as a file, most tests
could simply replace a "--decorate-refs-exclude" option with an
in-line config setting. Other tests involve the precedence of
the config option compared to command-line options and needed more
modification.
[1] https://github.com/microsoft/scalar
[2] https://lore.kernel.org/git/77b1da5d3063a2404cd750adfe3bb8be9b6c497d.1585946894.git.gitgitgadget@gmail.com/
Helped-by: Junio C Hamano <gister@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ref_filter_match() method is defined in refs.h and implemented
in refs.c, but is only used by add_ref_decoration() in log-tree.c.
Move it into that file as a static helper method. The
match_ref_pattern() comes along for the ride.
While moving the code, also make a slight adjustment to have
ref_filter_match() take a struct decoration_filter pointer instead
of multiple string lists. This is non-functional, but will make a
later change be much cleaner.
The diff is easier to parse when using the --color-moved option.
Reported-by: Junio C Hamano <gister@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "mboxrd" pretty format was introduced in 9f23e04061 (pretty: support
"mboxrd" output format, 2016-06-05) but wasn't mentioned in the
documentation.
Signed-off-by: Emma Brooks <me@pluvano.com>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the supplied integer for "precision" is negative in
`"%.*s", len, line` then it is ignored. So the current code is
equivalent to just `"%s", line` because it is executed only if
`len` is negative.
Fix this by saving the value of `len` before overwriting it with the
return value of `parse_git_diff_header()`.
Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git range-diff` calls `git log` internally and tries to parse its
output. But `git log` output can be customized by the user in their
git config and for certain configurations either an error will be
returned by `git range-diff` or it will crash.
To fix this explicitly set the output format of the internally
executed `git log` with `--pretty=medium`. Because that cancels
`--notes`, add explicitly `--notes` at the end.
Also, make sure we never crash in the same way - trying to dereference
`util` which was never created and has remained NULL. It would happen
if the first line of `git log` output does not begin with 'commit '.
Alternative considered but discarded - somehow disable all git configs
and behave as if no config is present in the internally executed
`git log`, but that does not seem to be possible. GIT_CONFIG_NOSYSTEM
is the closest to it, but even with that we would still read
`.git/config`.
Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's unusual to see:
https://example.com?query-parameters
without an intervening slash, like:
https://example.com/some-path?query-parameters
or even:
https://example.com/?query-parameters
but it is a valid end to the hostname (actually "authority component")
according to RFC 3986. Likewise for "#".
And curl will parse the URL according to the standard, meaning it will
contact example.com, but our credential code would ask about a bogus
hostname with a "?" in it. Let's make sure we follow the standard, and
more importantly ask about the same hosts that curl will be talking to.
It would be nice if we could just ask curl to parse the URL for us. But
it didn't grow a URL-parsing API until 7.62, so we'd be stuck with
fallback code either way. Plus we'd need this code in the main Git
binary, where we've tried to avoid having a link dependency on libcurl.
But let's at least fix our parser. Moving to curl's parser would prevent
other potential discrepancies, but this gives us immediate relief for
the known problem, and would help our fallback code if we eventually use
curl.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update freshen_file() to use a NULL `times', semantically equivalent to
the currently setup, with an explicit `actime' and `modtime' set to the
"current time", but with the advantage that it works with other files
not owned by the current user.
Fixes an issue on shared repos with a split index, where eventually a
user's operation creates a shared index, and another user will later do
an operation that will try to update its freshness, but will instead
raise a warning:
$ git status
warning: could not freshen shared index '.git/sharedindex.bd736fa10e0519593fefdb2aec253534470865b2'
Signed-off-by: Luciano Miguel Ferreira Rocha <luciano.rocha@booking.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When operating on a stream of commit OIDs on stdin, 'git commit-graph
write' checks that each OID refers to an object that is indeed a commit.
This is convenient to make sure that the given input is well-formed, but
can sometimes be undesirable.
For example, server operators may wish to feed the refnames that were
updated during a push to 'git commit-graph write --input=stdin-commits',
and silently discard refs that don't point at commits. This can be done
by combing the output of 'git for-each-ref' with '--format
%(*objecttype)', but this requires opening up a potentially large number
of objects. Instead, it is more convenient to feed the updated refs to
the commit-graph machinery, and let it throw out refs that don't point
to commits.
Introduce '--[no-]check-oids' to make such a behavior possible. With
'--check-oids' (the default behavior to retain backwards compatibility),
'git commit-graph write' will barf on a non-commit line in its input.
With 'no-check-oids', such lines will be silently ignored, making the
above possible by specifying this option.
No matter which is supplied, 'git commit-graph write' retains the
behavior from the previous commit of rejecting non-OID inputs like
"HEAD" and "refs/heads/foo" as before.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'write_commit_graph()' function takes in either a string list of
pack indices, or a string list of hexadecimal commit OIDs. These
correspond to the '--stdin-packs' and '--stdin-commits' mode(s) from
'git commit-graph write'.
Using a string_list of hexadecimal commit IDs is not the most efficient
use of memory, since we can instead use the 'struct oidset', which is
more well-suited for this case.
This has another benefit which will become apparent in the following
commit. This is that we are about to disambiguate the kinds of errors we
produce with '--stdin-commits' into "non-hex input" and "hex-input, but
referring to a non-commit object". By having 'write_commit_graph' take
in a 'struct oidset *' of commits, we place the burden on the caller (in
this case, the builtin) to handle the first case, and the commit-graph
machinery can handle the second case.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Occasionally, it may be useful for callers to know the number of object
IDs in an oidset. Right now, the only way to compute this is to call
'kh_size' on the internal 'kh_set_oid_t'.
Similar to how we wrap other 'kh_*' functions over the 'oidset' type,
let's allow callers to compute this value by introducing 'oidset_size'.
We will add its first caller in the subsequent commit.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using split commit-graphs, it is sometimes useful to completely
replace the commit-graph chain with a new base.
For example, consider a scenario in which a repository builds a new
commit-graph incremental for each push. Occasionally (say, after some
fixed number of pushes), they may wish to rebuild the commit-graph chain
with all reachable commits.
They can do so with
$ git commit-graph write --reachable
but this removes the chain entirely and replaces it with a single
commit-graph in 'objects/info/commit-graph'. Unfortunately, this means
that the next push will have to move this commit-graph into the first
layer of a new chain, and then write its new commits on top.
Avoid such copying entirely by allowing the caller to specify that they
wish to replace the entirety of their commit-graph chain, while also
specifying that the new commit-graph should become the basis of a fresh,
length-one chain.
This addresses the above situation by making it possible for the caller
to instead write:
$ git commit-graph write --reachable --split=replace
which writes a new length-one chain to 'objects/info/commit-graphs',
making the commit-graph incremental generated by the subsequent push
relatively cheap by avoiding the aforementioned copy.
In order to do this, remove an assumption in 'write_commit_graph_file'
that chains are always at least two incrementals long.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, we laid the groundwork for supporting different
splitting strategies. In this commit, we introduce the first splitting
strategy: 'no-merge'.
Passing '--split=no-merge' is useful for callers which wish to write a
new incremental commit-graph, but do not want to spend effort condensing
the incremental chain [1]. Previously, this was possible by passing
'--size-multiple=0', but this no longer the case following 63020f175f
(commit-graph: prefer default size_mult when given zero, 2020-01-02).
When '--split=no-merge' is given, the commit-graph machinery will never
condense an existing chain, and it will always write a new incremental.
[1]: This might occur when, for example, a server administrator running
some program after each push may want to ensure that each job runs
proportional in time to the size of the push, and does not "jump" when
the commit-graph machinery decides to trigger a merge.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With '--split', the commit-graph machinery writes new commits in another
incremental commit-graph which is part of the existing chain, and
optionally decides to condense the chain into a single commit-graph.
This is done to ensure that the asymptotic behavior of looking up a
commit in an incremental chain is not dominated by the number of
incrementals in that chain. It can be controlled by the '--max-commits'
and '--size-multiple' options.
In the next two commits, we will introduce additional splitting
strategies that can exert additional control over:
- when a split commit-graph is and isn't written, and
- when the existing commit-graph chain is discarded completely and
replaced with another graph
To prepare for this, make '--split' take an optional strategy (as in
'--split[=<strategy>]'), and add a new enum to describe which strategy
is being used. For now, no strategies are given, and the only enumerated
value is 'COMMIT_GRAPH_SPLIT_UNSPECIFIED', indicating the absence of a
strategy.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 61df89c8e5 (commit-graph: don't early exit(1) on e.g. "git status",
2019-03-25), the former 'load_commit_graph_one' was refactored into
'open_commit_graph' and 'load_commit_graph_one_fd_st' as a means of
avoiding an early-exit from non-library code.
However, 'load_commit_graph_one' does not support commit-graph chains,
and hence the 'read-graph' test tool does not work with them.
Replace 'load_commit_graph_one' with 'read_commit_graph_one' in order to
support commit-graph chains. In the spirit of 61df89c8e5,
'read_commit_graph_one' does not ever 'die()', making it a suitable
replacement here.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function won't work anywhere else, so let's mark it as an explicit
bug if it is called on a non-Windows platform.
Let's also rename the function to avoid cluttering the global namespace
with an overly-generic function name.
While at it, we also fix the code comment above that function: the
lower-case `windows` refers to something different than `Windows`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function uses Windows' system tool `attrib` to determine the state
of the hidden flag of a file or directory.
We should not actually expect the first `attrib.exe` in the PATH to
be the one we are looking for. Or that it is in the PATH, for that
matter.
Let's use the full path to the tool instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `is_hidden` function can be used (only on Windows) to determine
whether a directory or file have their `hidden` flag set.
This function is duplicated between two test scripts. It is better to
move it into `test-lib-functions.sh` so that it is reused.
This patch is best viewed with `--color-moved`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using `starts_with()`, the magic number 7, `strlen()` and a
fair number of additions to verify the three parts of the config key
"branch.<branch>.mergeoptions", use `skip_prefix()` to jump through them
more explicitly.
We need to introduce a new variable for this (we certainly can't modify
`k` just because we see "branch."!). With `skip_prefix()` we often use
quite bland names like `p` or `str`. Let's do the same. If and when this
function needs to do more prefix-skipping, we'll have a generic variable
ready for this.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rebasing against an upstream that has had many commits since the
original branch was created:
O -- O -- ... -- O -- O (upstream)
\
-- O (my-dev-branch)
it must read the contents of every novel upstream commit, in addition to
the tip of the upstream and the merge base, because "git rebase"
attempts to exclude commits that are duplicates of upstream ones. This
can be a significant performance hit, especially in a partial clone,
wherein a read of an object may end up being a fetch.
Add a flag to "git rebase" to allow suppression of this feature. This
flag only works when using the "merge" backend.
This flag changes the behavior of sequencer_make_script(), called from
do_interactive_rebase() <- run_rebase_interactive() <-
run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
(indirectly called from sequencer_make_script() through
prepare_revision_walk()) will no longer call cherry_pick_list(), and
thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
means that the intermediate commits in upstream are no longer read (as
shown by the test) and means that no PATCHSAME-caused skipping of
commits is done by sequencer_make_script(), either directly or through
make_script_with_merges().
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the user specifies the apply backend with options that only work
with the merge backend, such as
git rebase --apply --exec /bin/true HEAD~3
the error message has always been
fatal: --exec requires an interactive rebase
This error message is misleading and was one of the reasons we renamed
the interactive backend to the merge backend. Update the error message
to state that these options merely require use of the merge backend.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit d48e5e21da ("rebase (interactive-backend): make --keep-empty the
default", 2020-02-15) turned --keep-empty (for keeping commits which
start empty) into the default. The logic underpinning that commit was:
1) 'git commit' errors out on the creation of empty commits without an
override flag
2) Once someone determines that the override is worthwhile, it's
annoying and/or harmful to required them to take extra steps in
order to keep such commits around (and to repeat such steps with
every rebase).
While the logic on which the decision was made is sound, the result was
a bit of an overcorrection. Instead of jumping to having --keep-empty
being the default, it jumped to making --keep-empty the only available
behavior. There was a simple workaround, though, which was thought to
be good enough at the time. People could still drop commits which
started empty the same way the could drop any commits: by firing up an
interactive rebase and picking out the commits they didn't want from the
list. However, there are cases where external tools might create enough
empty commits that picking all of them out is painful. As such, having
a flag to automatically remove start-empty commits may be beneficial.
Provide users a way to drop commits which start empty using a flag that
existed for years: --no-keep-empty. Interpret --keep-empty as
countermanding any previous --no-keep-empty, but otherwise leaving
--keep-empty as the default.
This might lead to some slight weirdness since commands like
git rebase --empty=drop --keep-empty
git rebase --empty=keep --no-keep-empty
look really weird despite making perfect sense (the first will drop
commits which become empty, but keep commits that started empty; the
second will keep commits which become empty, but drop commits which
started empty). However, --no-keep-empty was named years ago and we are
predominantly keeping it for backward compatibility; also we suspect it
will only be used rarely since folks already have a simple way to drop
commits they don't want with an interactive rebase.
Reported-by: Bryan Turner <bturner@atlassian.com>
Reported-by: Sami Boukortt <sami@boukortt.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While many users who intentionally create empty commits do not want them
thrown away by a rebase, there are third-party tools that generate empty
commits that a user might not want. In the past, users have used rebase
to get rid of such commits (a side-effect of the fact that the --apply
backend is not currently capable of keeping them). While such users
could fire up an interactive rebase and just remove the lines
corresponding to empty commits, that might be difficult if the
third-party tool generates many of them. Simplify this task for users
by marking such lines with a suffix of " # empty" in the todo list.
Suggested-by: Sami Boukortt <sami@boukortt.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the last few commits have made it possible for the config parser
to handle config files up to the limits of size_t, the rest of the code
isn't really ready for this. In particular, we often feed the keys as
strings into printf "%s" format specifiers. And because the printf
family of functions must return an int to specify the result, they
complain. Here are two concrete examples (using glibc; we're in
uncharted territory here so results may vary):
Generate a gigantic .gitmodules file like this:
git submodule add /some/other/repo foo
{
printf '[submodule "'
perl -e 'print "a" x 2**31'
echo '"]path = foo'
} >.gitmodules
git commit -m 'huge gitmodule'
then try this:
$ git show
BUG: strbuf.c:397: your vsnprintf is broken (returned -1)
The problem is that we end up calling:
strbuf_addf(&sb, "submodule.%s.ignore", submodule_name);
which relies on vsnprintf(), and that function has no way to report back
a size larger than INT_MAX.
Taking that same file, try this:
git config --file=.gitmodules --list --name-only
On my system it produces an output with exactly 4GB of spaces. I
confirmed in a debugger that we reach the config callback with the key
intact: it's 2147483663 bytes and full of a's. But when we print it with
this call:
printf("%s%c", key_, term);
we just get the spaces.
So given the fact that these are insane cases which we have no need to
support, the weird behavior from feeding the results to printf even if
the code is careful, and the possibility of uncareful code introducing
its own integer truncation issues, let's just declare INT_MAX as a limit
for parsing config files.
We'll enforce the limit in get_next_char(), which generalizes over all
sources (blobs, files, etc) and covers any element we're parsing
(whether section, key, value, etc). For simplicity, the limit is over
the length of the _whole_ file, so you couldn't have two 1GB values in
the same file. This should be perfectly fine, as the expected size for
config files is generally kilobytes at most.
With this patch both cases above will yield:
fatal: bad config line 1 in file .gitmodules
That's not an amazing error message, but the parser isn't set up to
provide specific messages (it just breaks out of the parsing loop and
gives that generic error even if see a syntactic issue). And we really
wouldn't expect to see this case outside of somebody maliciously probing
the limits of the config system.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the config parsing infrastructure is limited in what it can
parse only by the size of memory, because it parses character by
character, building up strbufs for keys, values, etc. One exception is
the "baselen" value we keep in git_parse_source(), which is an int.
That stores the length of the section.subsection base, to which we can
then append individual key names (by truncating back to the baselen with
strbuf_setlen(), and then appending characters for the key name). But
because it's an int, if we see an absurdly long section or subsection,
we may overflow the integer, wrapping negative. That negative value is
then implicitly cast to a size_t when we pass it to strbuf_setlen(),
creating a very large value and triggering a BUG. For example:
$ {
printf '[foo "'
perl -e 'print "a" x 2**31'
echo '"]bar = value'
} >huge
$ git config --file=huge --list
fatal: BUG: strbuf_setlen() beyond buffer
While this is obviously a silly case that we don't care about
supporting, it's worth fixing it by switching to a size_t for a few
reasons:
- we should try to avoid hitting BUG assertions at all
- avoiding integer truncation or overflow sets a good example and
makes it easier to audit the code for more important issues
- the BUG outcome is what happens in _this_ instance, because we wrap
negative. If we used a 2**32 subsection, we'd wrap to a small
positive value and actually generate wrong output (the subsection of
our key would be truncated).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the recent change to parse_config_key(), the best type to return
a string length is a size_t, as it won't cause integer truncation for a
gigantic key. And as with that change, this is mostly a clarity /
hygiene issue for now, as our config parser would choke on such a large
key anyway.
There are a few ripple effects within the config code, as callers switch
to using size_t. I also adjusted a few related variables that iterate
over strings. The most unexpected change is that a call to strbuf_addf()
had to switch to strbuf_add(). We can't use a size_t with "%.*s",
because printf precisions must have type "int" (we could cast, of
course, but that would miss the point of using size_t in the first
place).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We compute the length of a subset of a string, but then use that length
only to feed a "%.*s" printf placeholder for the same string. We can
just use "%s" to achieve the same thing.
The variable became useless in cb891a5989 (Use a strbuf for building up
section header and key/value pair strings., 2007-12-14), which swapped
out a write() which _did_ use the length for a strbuf_addf() call.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We return the length to a subset of a string using an "int *"
out-parameter. This is fine most of the time, as we'd expect config keys
to be relatively short, but it could behave oddly if we had a gigantic
config key. A more appropriate type is size_t.
Let's switch over, which lets our callers use size_t as appropriate
(they are bound by our type because they must pass the out-parameter as
a pointer). This is mostly just a cleanup to make it clear this code
handles long strings correctly. In practice, our config parser already
chokes on long key names (because of a similar int/size_t mixup!).
When doing an int/size_t conversion, we have to be careful that nobody
was trying to assign a negative value to the variable. I manually
confirmed that for each case here. They tend to just feed the result to
xmemdupz() or similar; in a few cases I adjusted the parameter types for
helper functions to make sure the size_t is preserved.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The make_branch() and make_rewrite() functions can take a NUL-terminated
string or a ptr/len pair. They use a sentinel value of "0" for the len
to tell the difference between the two. However, when parsing config
like:
[branch ""]
merge = whatever
whose key flattens to:
branch..merge
we might actually have a zero-length branch name. This is obviously
nonsense, but the current code would consider it as a NUL-terminated
string and use the branch name ".merge".
We could use a better sentinel value here (like "-1"), but that gets in
the way of moving to size_t, which is a more appropriate type for a
ptr/len combo.
Let's instead just drop this feature and have the callers (of which
there are only two total) use strlen() themselves. This simplifies the
code, and lets us move to using size_t.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On certain network filesystems (currently encountered with Isilon, but
in theory more network storage solutions could be causing the same
issue), when the directory in question is missing,
`raceproof_create_file()` fails with an `ERROR_INVALID_PARAMETER`
instead of an `ERROR_PATH_NOT_FOUND`.
Since it is highly unlikely that we produce such an error by mistake
(the parameters we pass are fairly benign), we can be relatively certain
that the directory is missing in this instance. So let's just translate
that error automagically.
This fixes https://github.com/git-for-windows/git/issues/1345.
Signed-off-by: Nathan Sanders <spekbukkem@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Arguably, CI builds' most important task is to not only identify
regressions, but to make it as easy as possible to investigate what went
wrong.
In that light, we will want to provide users with a way to inspect the
tests' output as well as the corresponding directories.
This commit adds build steps that are only executed when tests failed,
uploading the relevant information as build artifacts. These artifacts
can then be downloaded by interested parties to diagnose the failures
more efficiently.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a test fails, it is nice to see where the corresponding code lives
in the worktree. Sadly, it seems that only Bash allows us to infer this
information. Let's do it when we detect that we're running in a Bash.
This will come in handy in the next commit, where we teach the GitHub
Actions workflow to annotate failed test runs with this information.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have GitHub Actions now. Running the same builds and tests in Azure
Pipelines would be redundant, and a waste of energy.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds CI builds via GitHub Actions. While the underlying
technology is at least _very_ similar to that of Azure Pipelines, GitHub
Actions are much easier to set up than Azure Pipelines:
- no need to install a GitHub App,
- no need to set up an Azure DevOps account,
- all you need to do is push to your fork on GitHub.
Therefore, it makes a lot of sense for us to have a working GitHub
Actions setup.
While copy/editing `azure-pipelines.yml` into
`.github/workflows/main.yml`, we also use the opportunity to accelerate
the step that sets up a minimal subset of Git for Windows' SDK in the
Windows-build job:
- we now download a `.tar.xz` stored in Azure Blobs and extract it
simultaneously by calling `curl` and piping the result to `tar`,
- decompressing via `xz`,
- all three utilities are installed together with Git for Windows
At the same time, we also make use of the matrix build feature, which
reduces the amount of repeated text by quite a bit.
Also, we do away with the parts that try to mount a file share on which
`prove` can store data between runs. It is just too complicated to set
up, and most times the tree changes anyway, so there is little return on
investment there.
Initial-patch-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch, we will run Documentation job in GitHub Actions.
The job will run without elevated permission.
Run `gem` with `sudo` to elevate permission in order to be able to
install to system location.
This will also keep this installation in-line with other installation in
our Linux system for CI.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[Danh: reword commit message]
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch, we will support GitHub Action.
Explicitly install all of our build dependencies on Linux.
Since GitHub Action's Linux VM hasn't installed our build dependencies.
And there're no harm to reinstall them (in Travis)
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At least one interactive command writes a prompt to `stdout` and then
reads user input on `stdin`: `git clean --interactive`. If the prompt is
left in the buffer, the user will not realize the program is waiting for
their input.
So let's just flush `stdout` before reading the user's input.
Signed-off-by: 마누엘 <nalla@hamal.uberspace.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are quite a few code locations (e.g. `git clean --interactive`)
where Git asks the user for an answer. In preparation for fixing a bug
shared by all of them, and also to DRY up the code, let's refactor it.
Please note that most of these callers trimmed white-space both at the
beginning and at the end of the answer, instead of trimming only the
end (as the caller in `add-patch.c` does).
Therefore, technically speaking, we change behavior in this patch. At
the same time, it can be argued that this is actually a bug fix.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
MSYS2's strace facility is very useful for debugging... With this patch,
the bash will be executed through strace if the environment variable
GIT_STRACE_COMMANDS is set, which comes in real handy when investigating
issues in the test suite.
Also support passing a path to a log file via GIT_STRACE_COMMANDS to
force Git to call strace.exe with the `-o <path>` argument, i.e. to log
into a file rather than print the log directly.
That comes in handy when the output would otherwise misinterpreted by a
calling process as part of Git's output.
Note: the values "1", "yes" or "true" are *not* specifying paths, but
tell Git to let strace.exe log directly to the console.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default file history simplification of "git log -- <path>" or
"git rev-list -- <path>" focuses on providing the smallest set of
commits that first contributed a change. The revision walk greatly
restricts the set of walked commits by visiting only the first
TREESAME parent of a merge commit, when one exists. This means
that portions of the commit-graph are not walked, which can be a
performance benefit, but can also "hide" commits that added changes
but were ignored by a merge resolution.
The --full-history option modifies this by walking all commits and
reporting a merge commit as "interesting" if it has _any_ parent
that is not TREESAME. This tends to be an over-representation of
important commits, especially in an environment where most merge
commits are created by pull request completion.
Suppose we have a commit A and we create a commit B on top that
changes our file. When we merge the pull request, we create a merge
commit M. If no one else changed the file in the first-parent
history between M and A, then M will not be TREESAME to its first
parent, but will be TREESAME to B. Thus, the simplified history
will be "B". However, M will appear in the --full-history mode.
However, suppose that a number of topics T1, T2, ..., Tn were
created based on commits C1, C2, ..., Cn between A and M as
follows:
A----C1----C2--- ... ---Cn----M------P1---P2--- ... ---Pn
\ \ \ \ / / / /
\ \__.. \ \/ ..__T1 / Tn
\ \__.. /\ ..__T2 /
\_____________________B \____________________/
If the commits T1, T2, ... Tn did not change the file, then all of
P1 through Pn will be TREESAME to their first parent, but not
TREESAME to their second. This means that all of those merge commits
appear in the --full-history view, with edges that immediately
collapse into the lower history without introducing interesting
single-parent commits.
The --simplify-merges option was introduced to remove these extra
merge commits. By noticing that the rewritten parents are reachable
from their first parents, those edges can be simplified away. Finally,
the commits now look like single-parent commits that are TREESAME to
their "only" parent. Thus, they are removed and this issue does not
cause issues anymore. However, this also ends up removing the commit
M from the history view! Even worse, the --simplify-merges option
requires walking the entire history before returning a single result.
Many Git users are using Git alongside a Git service that provides
code storage alongside a code review tool commonly called "Pull
Requests" or "Merge Requests" against a target branch. When these
requests are accepted and merged, they typically create a merge
commit whose first parent is the previous branch tip and the second
parent is the tip of the topic branch used for the request. This
presents a valuable order to the parents, but also makes that merge
commit slightly special. Users may want to see not only which
commits changed a file, but which pull requests merged those commits
into their branch. In the previous example, this would mean the
users want to see the merge commit "M" in addition to the single-
parent commit "C".
Users are even more likely to want these merge commits when they
use pull requests to merge into a feature branch before merging that
feature branch into their trunk.
In some sense, users are asking for the "first" merge commit to
bring in the change to their branch. As long as the parent order is
consistent, this can be handled with the following rule:
Include a merge commit if it is not TREESAME to its first
parent, but is TREESAME to a later parent.
These merges look like the merge commits that would result from
running "git pull <topic>" on a main branch. Thus, the option to
show these commits is called "--show-pulls". This has the added
benefit of showing the commits created by closing a pull request or
merge request on any of the Git hosting and code review platforms.
To test these options, extend the standard test example to include
a merge commit that is not TREESAME to its first parent. It is
surprising that that option was not already in the example, as it
is instructive.
In particular, this extension demonstrates a common issue with file
history simplification. When a user resolves a merge conflict using
"-Xours" or otherwise ignoring one side of the conflict, they create
a TREESAME edge that probably should not be TREESAME. This leads
users to become frustrated and complain that "my change disappeared!"
In my experience, showing them history with --full-history and
--simplify-merges quickly reveals the problematic merge. As mentioned,
this option is expensive to compute. The --show-pulls option
_might_ show the merge commit (usually titled "resolving conflicts")
more quickly. Of course, this depends on the user having the correct
parent order, which is backwards when using "git pull master" from a
topic branch.
There are some special considerations when combining the --show-pulls
option with --simplify-merges. This requires adding a new PULL_MERGE
object flag to store the information from the initial TREESAME
comparisons. This helps avoid dropping those commits in later filters.
This is covered by a test, including how the parents can be simplified.
Since "struct object" has already ruined its 32-bit alignment by using
33 bits across parsed, type, and flags member, let's not make it worse.
PULL_MERGE is used in revision.c with the same value (1u<<15) as
REACHABLE in commit-graph.c. The REACHABLE flag is only used when
writing a commit-graph file, and a revision walk using --show-pulls
does not happen in the same process. Care must be taken in the future
to ensure this remains the case.
Update Documentation/rev-list-options.txt with significant details
around this option. This requires updating the example in the
History Simplification section to demonstrate some of the problems
with TREESAME second parents.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, `--autostash` only worked with `git pull --rebase`. However, in
the last patch, merge learned `--autostash` as well so there's no reason
why we should have this restriction anymore. Teach pull to pass
`--autostash` to merge, just like it did for rebase.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, test_pull_autostash() was hardcoded to run
`test_cmp_rev HEAD^ copy` to test that a rebase happened. However, in a
future patch, we plan on testing merging as well. Make
test_pull_autostash() accept a parent number as an argument so that, in
the future, we can test if a merge happened in addition to a rebase.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In rebase, one can pass the `--autostash` option to cause the worktree
to be automatically stashed before continuing with the rebase. This
option is missing in merge, however.
Implement the `--autostash` option and corresponding `merge.autoStash`
option in merge which stashes before merging and then pops after.
This option is useful when a developer has some local changes on a topic
branch but they realize that their work depends on another branch.
Previously, they had to run something like
git fetch ...
git stash push
git merge FETCH_HEAD
git stash pop
but now, that is reduced to
git fetch ...
git merge --autostash FETCH_HEAD
When an autostash is generated, it is automatically reapplied to the
worktree only in three explicit situations:
1. An incomplete merge is commit using `git commit`.
2. A merge completes successfully.
3. A merge is aborted using `git merge --abort`.
In all other situations where the merge state is removed using
remove_merge_branch_state() such as aborting a merge via
`git reset --hard`, the autostash is saved into the stash reflog
instead keeping the worktree clean.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Suggested-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split apply_save_autostash() into apply_autostash_oid() and
apply_save_autostash() where the former operates on an OID string and
the latter reads the OID from a file before passing it into
apply_save_autostash_oid().
This function is required for a future commmit which will rely on being
able to apply an autostash whose OID is stored as a string.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extract common functionality of apply_autostash() into
apply_save_autostash() and use it to implement save_autostash(). This
function will be used in a future commit.
The difference between save_autostash() and apply_autostash() is that
the former does not try to apply the stash. It skips that step and
just stores the created entry in the stash reflog.
This is useful in the case where we abort an operation when an autostash
is present but we don't want to dirty the worktree with the application
of the stash. For example, in a future commit, we will implement
`git merge --autostash`. Since merges can be aborted using
`git reset --hard`, we'd make use of save_autostash() to save the
autostash entry instead of applying it to the worktree thus keeping the
worktree undirtied.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Explicitly remove autostash file in apply_autostash() once it has been
applied successfully.
This is currently a no-op because the only users of this function will unlink
the state (including the autostash file) after this function runs.
However, in the future, we will introduce a user of the function that
does not explicitly remove the state so we do it here.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lib-ify the autostash code by extracting perform_autostash() from rebase
into sequencer. In a future commit, this will be used to implement
`--autostash` in other builtins.
This patch is best viewed with `--color-moved`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we plan on lib-ifying create_autostash() so we need it to
be more generic. Make it more generic by making it accept a
`struct repository` argument instead of implicitly using the non-repo
functions and `the_repository`. Also, make it accept a `path` argument
so that we no longer rely have to rely on `struct rebase_options`.
Finally, make it accept a `default_reflog_action` argument so we no
longer have to rely on `DEFAULT_REFLOG_ACTION`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we will lib-ify this code. In preparation for
this, extract the code into the create_autostash() function so that it
can be cleaned up before it is finally lib-ified.
This patch is best viewed with `--color-moved` and
`--color-moved-ws=allow-indentation-change`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue the process of lib-ifying the autostash code. In a future
commit, this will be used to implement `--autostash` in other builtins.
This patch is best viewed with `--color-moved`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we plan on lib-ifying reset_head() so we need it to
be more generic. Make it more generic by making it accept a
`struct repository` argument instead of implicitly using the non-repo
functions. Also, make it accept a `const char *default_reflog_action`
argument so that the default action of "rebase" isn't hardcoded in.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The apply_autostash() function in builtin/rebase.c is similar enough to
the apply_autostash() function in sequencer.c that they are almost
interchangeable, except for the type of arg they accept. Make the
sequencer.c version extern and use it in rebase.
The rebase version was introduced in 6defce2b02 (builtin rebase: support
`--autostash` option, 2018-09-04) as part of the shell to C conversion.
It opted to duplicate the function because, at the time, there was
another in-progress project converting interactive rebase from shell to
C as well and they did not want to clash with them by refactoring
sequencer.c version of apply_autostash(). Since both efforts are long
done, we can freely combine them together now.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The preferred terminology is to refer to object identifiers as "OIDs".
Rename the `stash_sha1` variable to `stash_oid` in order to conform to
this.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to make apply_autostash() more generic for future extraction, make
it accept a `path` argument so that the location from where to read the
reference to the autostash commit can be customized. Remove the `opts`
argument since it was unused before anyway.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since in sequencer.c, read_one() basically duplicates the functionality
of read_oneliner(), reduce code duplication by replacing read_one() with
read_oneliner().
This was done with the following Coccinelle script
@@
expression a, b;
@@
- read_one(a, b)
+ !read_oneliner(b, a, READ_ONELINER_WARN_MISSING)
and long lines were manually broken up.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "refs" pointer in a struct repository starts life as NULL, but then
is lazily initialized when it is accessed via get_main_ref_store().
However, it's easy for calling code to forget this and access it
directly, leading to code which works _some_ of the time, but fails if
it is called before anybody else accesses the refs.
This was the cause of the bug fixed by 5ff4b920eb (sha1-name: do not
assume that the ref store is initialized, 2020-04-09). In order to
prevent similar bugs, let's more clearly mark the "refs" field as
private.
In addition to helping future code, the name change will help us audit
any existing direct uses. Besides get_main_ref_store() itself, it turns
out there is only one. But we know it's OK as it is on the line directly
after the fix from 5ff4b920eb, which will have initialized the pointer.
However it's still a good idea for it to model the proper use of the
accessing function, so we'll convert it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're comparing a push cert nonce, we currently do so using strcmp.
Most implementations of strcmp short-circuit and exit as soon as they
know whether two values are equal. This, however, is a problem when
we're comparing the output of HMAC, as it leaks information in the time
taken about how much of the two values match if they do indeed differ.
In our case, the nonce is used to prevent replay attacks against our
server via the embedded timestamp and replay attacks using requests from
a different server via the HMAC. Push certs, which contain the nonces,
are signed, so an attacker cannot tamper with the nonces without
breaking validation of the signature. They can, of course, create their
own signatures with invalid nonces, but they can also create their own
signatures with valid nonces, so there's nothing to be gained. Thus,
there is no security problem.
Even though it doesn't appear that there are any negative consequences
from the current technique, for safety and to encourage good practices,
let's use a constant time comparison function for nonce verification.
POSIX does not provide one, but they are easy to write.
The technique we use here is also used in NaCl and the Go standard
library and relies on the fact that bitwise or and xor are constant time
on all known architectures.
We need not be concerned about exiting early if the actual and expected
lengths differ, since the standard cryptographic assumption is that
everyone, including an attacker, knows the format of and algorithm used
in our nonces (and in any event, they have the source code and can
determine it easily). As a result, we assume everyone knows how long
our nonces should be. This philosophy is also taken by the Go standard
library and other cryptographic libraries when performing constant time
comparisons on HMAC values.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
c931ba4e (sha1-name.c: remove the_repo from handle_one_ref(),
2019-04-16) replaced the use of for_each_ref() helper, which works
with the main ref store of the default repository instance, with
refs_for_each_ref(), which can work on any ref store instance, by
assuming that the repository instance the function is given has its
ref store already initialized.
But it is possible that nobody has initialized it, in which case,
the code ends up dereferencing a NULL pointer.
Reported-by: Érico Rolim <erico.erc@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changed-path Bloom filters record an entry in the filter for
every path that was changed. This includes every add and delete,
regardless of whether a rename was detected. Detecting renames
causes significant performance issues, but also will trigger
downloading missing blobs in partial clone.
The simple fix is to disable rename detection when computing a
changed-path Bloom filter. This should already be disabled by
default, but it is good to explicitly enforce the intended
behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 1925fe0c8a ("Documentation: wrap config listings in "----"",
2019-09-07) wrapped this fairly large block of example config directives
in "----". The closing "----" ended up a few lines too early though.
Make sure to include the trailing "IncludeIf.onbranch:..." example, too.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When trying to stash part of the worktree changes by splitting a hunk
and then only partially accepting the split bits and pieces, the user
is presented with a rather cryptic error:
error: patch failed: <file>:<line>
error: test: patch does not apply
Cannot remove worktree changes
and the command would fail to stash the desired parts of the worktree
changes (even if the `stash` ref was actually updated correctly).
We even have a test case demonstrating that failure, carrying it for
four years already.
The explanation: when splitting a hunk, the changed lines are no longer
separated by more than 3 lines (which is the amount of context lines
Git's diffs use by default), but less than that. So when staging only
part of the diff hunk for stashing, the resulting diff that we want to
apply to the worktree in reverse will contain those changes to be
dropped surrounded by three context lines, but since the diff is
relative to HEAD rather than to the worktree, these context lines will
not match.
Example time. Let's assume that the file README contains these lines:
We
the
people
and the worktree added some lines so that it contains these lines
instead:
We
are
the
kind
people
and the user tries to stash the line containing "are", then the command
will internally stage this line to a temporary index file and try to
revert the diff between HEAD and that index file. The diff hunk that
`git stash` tries to revert will look somewhat like this:
@@ -1776,3 +1776,4
We
+are
the
people
It is obvious, now, that the trailing context lines overlap with the
part of the original diff hunk that the user did *not* want to stash.
Keeping in mind that context lines in diffs serve the primary purpose of
finding the exact location when the diff does not apply precisely (but
when the exact line number in the file to be patched differs from the
line number indicated in the diff), we work around this by reducing the
amount of context lines: the diff was just generated.
Note: this is not a *full* fix for the issue. Just as demonstrated in
t3701's 'add -p works with pathological context lines' test case, there
are ambiguities in the diff format. It is very rare in practice, of
course, to encounter such repeated lines.
The full solution for such cases would be to replace the approach of
generating a diff from the stash and then applying it in reverse by
emulating `git revert` (i.e. doing a 3-way merge). However, in `git
stash -p` it would not apply to `HEAD` but instead to the worktree,
which makes this non-trivial to implement as long as we also maintain a
scripted version of `add -i`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 7e9e048661 (stash -p: demonstrate failure of split with mixed y/n,
2015-04-16), a regression test for a known breakage that was added to
the test script `t3904-stash-patch.sh` that demonstrated that splitting
a hunk and trying to stash only part of that split hunk fails (but
shouldn't).
As expected, it still fails, but for the wrong reason: once the bug is
fixed, we would expect stderr to show nothing, yet the regression test
expects stderr to show something.
Let's fix that by telling that regression test case to expect nothing to
be printed to stderr.
While at it, also drop the obvious left-over from debugging where the
regression test did not mind `git stash -p` to return a non-zero exit
status.
Of course, the regression test still fails, but this time for the
correct reason.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 4dc42c6c18 (mingw: refuse paths containing reserved names,
2019-12-21), we started disallowing file names that are reserved, e.g.
`NUL`, `CONOUT$`, etc.
This included `COM<n>` where `<n>` is a digit. Unfortunately, this
includes `COM0` but only `COM1`, ..., `COM9` are reserved, according to
the official documentation, `COM0` is mentioned in the "NT Namespaces"
section but it is explicitly _omitted_ from the list of reserved names:
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions
Tests corroborate this: it is totally possible to write a file called
`com0.c` on Windows 10, but not `com1.c`.
So let's tighten the code to disallow only the reserved `COM<n>` file
names, but to allow `COM0` again.
This fixes https://github.com/git-for-windows/git/issues/2470.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Microsoft introduced a new "Universal C Runtime Library" (UCRT) with
Visual Studio 2015. The UCRT comes with a new strftime() implementation
that supports more date formats. We link git against the older
"Microsoft Visual C Runtime Library" (MSVCRT), so to use the UCRT
strftime() we need to load it from ucrtbase.dll using
DECLARE_PROC_ADDR()/INIT_PROC_ADDR().
Most supported Windows systems should have recieved the UCRT via Windows
update, but in some cases only MSVCRT might be available. In that case
we fall back to using that implementation.
With this change, it is possible to use e.g. the `%g` and `%V` date
format specifiers, e.g.
git show -s --format=%cd --date=format:‘%g.%V’ HEAD
Without this change, the user would see this error message on Windows:
fatal: invalid strftime format: '‘%g.%V’'
This fixes https://github.com/git-for-windows/git/issues/2495
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a (late) companion for f6461b82b9 (Documentation: fix build
with Asciidoctor 2, 2019-09-15).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commit subjects or authors have non-ASCII characters, git
format-patch Q-encodes them so they can be safely sent over email.
However, if the patch transfer method is something other than email (web
review tools, sneakernet), this only serves to make the patch metadata
harder to read without first applying it (unless you can decode RFC 2047
in your head). git am as well as some email software supports
non-Q-encoded mail as described in RFC 6531.
Add --[no-]encode-email-headers and format.encodeEmailHeaders to let the
user control this behavior.
Signed-off-by: Emma Brooks <me@pluvano.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 6cdccfce1e (i18n: make GETTEXT_POISON a runtime option,
2018-11-08), the `jobname` was adjusted to have the `GIT_TEST_` prefix,
but that prefix makes no sense in this context.
Co-authored-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GitHub Action doesn't set TERM environment variable, which is required
by "tput".
Fallback to dumb if it's not set.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For each CI system we support, we need a specific arm in that if/else
construct in ci/lib.sh. Let's add one for GitHub Actions.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* dd/ci-musl-libc:
travis: build and test on Linux with musl libc and busybox
ci/linux32: libify install-dependencies step
ci: refactor docker runner script
ci/linux32: parameterise command to switch arch
ci/lib-docker: preserve required environment variables
ci: make MAKEFLAGS available inside the Docker container in the Linux32 job
* dd/test-with-busybox:
t5703: feed raw data into test-tool unpack-sideband
t4124: tweak test so that non-compliant diff(1) can also be used
t7063: drop non-POSIX argument "-ls" from find(1)
t5616: use rev-parse instead to get HEAD's object_id
t5003: skip conversion test if unzip -a is unavailable
t5003: drop the subshell in test_lazy_prereq
test-lib-functions: test_cmp: eval $GIT_TEST_CMP
t4061: use POSIX compliant regex(7)
The function read_oneliner() is a generally useful util function.
Instead of hiding it as a static function within sequencer.c, extern it
so that it can be reused by others.
This patch is best viewed with --color-moved.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we plan on externing read_oneliner(). Future users of
read_oneliner() will want the ability to output warnings in the event
that the `path` doesn't exist. Introduce the
`READ_ONELINER_WARN_MISSING` flag which, if active, would issue a
warning when a file doesn't exist by always executing warning_errno()
in the case where strbuf_read_file() fails.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we will need read_oneliner() to accept flags other
than just `skip_if_empty`. Instead of having an argument for each flag,
teach read_oneliner() to accept the bitfield `flags` instead. For now,
only recognize the `READ_ONELINER_SKIP_IF_EMPTY` flag. More flags will
be added in a future commit.
The result of this is that parallel topics which introduce invocations
of read_oneliner() will still be compatible with this new function
signature since, instead of passing 1 or 0 for `skip_if_empty`, they'll
be passing 1 or 0 to `flags`, which gives equivalent behavior.
Mechanically fix up invocations of read_oneliner() with the following
spatch
@@
expression a, b;
@@
read_oneliner(a, b,
- 1
+ READ_ONELINER_SKIP_IF_EMPTY
)
and manually break up long lines in the result.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently check whether a file exists and return early before reading
the file. Instead of accessing the file twice, always read the file and
check `errno` to see if the file doesn't exist.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08)
optimized "diff" by prefetching blobs in a partial clone, but there are
some cases wherein blobs do not need to be prefetched. In these cases,
any command that uses the diff machinery will unnecessarily fetch blobs.
diffcore_std() may read blobs when it calls the following functions:
(1) diffcore_skip_stat_unmatch() (controlled by the config variable
diff.autorefreshindex)
(2) diffcore_break() and diffcore_merge_broken() (for break-rewrite
detection)
(3) diffcore_rename() (for rename detection)
(4) diffcore_pickaxe() (for detecting addition/deletion of specified
string)
Instead of always prefetching blobs, teach diffcore_skip_stat_unmatch(),
diffcore_break(), and diffcore_rename() to prefetch blobs upon the first
read of a missing object. This covers (1), (2), and (3): to cover the
rest, teach diffcore_std() to prefetch if the output type is one that
includes blob data (and hence blob data will be required later anyway),
or if it knows that (4) will be run.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the object reads in diff_populate_filespec() to have the first
object read not be in an if/else branch, because in a future patch, a
retry will be added to that first object read.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The behavior of diff_populate_filespec() currently can be customized
through a bitflag, but a subsequent patch requires it to support a
non-boolean option. Replace the bitflag with an options struct.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a lot of code to honor GIT_REFLOG_ACTION throughout git,
including some in sequencer.c; unfortunately, reflog_message() and its
callers ignored it. Instruct reflog_message() to check the existing
environment variable, and use it when present as an override to
action_name().
Also restructure pick_commits() to only temporarily modify
GIT_REFLOG_ACTION for a short duration and then restore the old value,
so that when we do this setting within a loop we do not keep adding "
(pick)" substrings and end up with a reflog message of the form
rebase (pick) (pick) (pick) (finish): returning to refs/heads/master
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch, we will add new Travis Job for linux-musl.
Most of other code in this file could be reuse for that job.
Move the code to install dependencies to a common script.
Should we add new CI system that can run directly in container,
we can reuse this script for installation step.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We will support alpine check in docker later in this series.
While we're at it, tell people to run as root in podman,
if podman is used as drop-in replacement for docker,
because podman will map host-user to container's root,
therefore, mapping their permission.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch, the remaining of this command will be re-used for the
CI job for linux with musl libc.
Allow customisation of the emulator, now.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're using "su -m" to preserve environment variables in the shell run
by "su". But, that options will be ignored while "-l" (aka "--login") is
specified in util-linux and busybox's su.
In a later patch this script will be reused for checking Git for Linux
with musl libc on Alpine Linux, Alpine Linux uses "su" from busybox.
Since we don't have interest in all environment variables,
pass only those necessary variables to the inner script.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation refers to "initialized" or "populated" submodules,
to explain which submodules are affected by '--recurse-submodules', but
the real terminology here is 'active' submodules. Update the
documentation accordingly.
Some terminology:
- Active is defined in gitsubmodules(7), it only involves the
configuration variables 'submodule.active', 'submodule.<name>.active'
and 'submodule.<name>.url'. The function
submodule.c::is_submodule_active checks that a submodule is active.
- Populated means that the submodule's working tree is present (and the
gitfile correctly points to the submodule repository), i.e. either the
superproject was cloned with ` --recurse-submodules`, or the user ran
`git submodule update --init`, or `git submodule init [<path>]` and
`git submodule update [<path>]` separately which populated the
submodule working tree. This does not involve the 3 configuration
variables above.
- Initialized (at least in the context of the man pages involved in this
patch) means both "populated" and "active" as defined above, i.e. what
`git submodule update --init` does.
The --recurse-submodules option mostly affects active submodules. An
exception is `git fetch` where the option affects populated submodules.
As a consequence, in `git pull --recurse-submodules` the fetch affects
populated submodules, but the resulting working tree update only affects
active submodules.
In the documentation of `git-pull`, let's distinguish between the
fetching part which affects populated submodules, and the updating of
worktrees, which only affects active submodules.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Helped-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default value also depends on the value of submodule.recurse.
Use this opportunity to correct some grammar mistakes in
Documentation/config/fetch.txt signaled by Robert P. J. Day.
Also mention `fetch.recurseSubmodules` in fetch-options.txt. In
git-push.txt, `push.recurseSubmodules` is implicitly mentioned (by
explaining how to disable it), so no need to add it there.
Lastly add a link to `git-fetch` in `git-pull.txt` to explain the
meaning of `--recurse-submodules` there.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also unify the formulation about --no-recurse-submodules for checkout
and switch, which we reuse for restore.
And correct the formulation about submodules' HEAD in read-tree, which
we reuse in reset.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Note that `ls-files` is not affected, even though it has a
`--recurse-submodules` option, so list it as an exception too.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use a custom hash in fast-import to store the set of objects we've
imported so far. It has a fixed set of 2^16 buckets and chains any
collisions with a linked list. As the number of objects grows larger
than that, the load factor increases and we degrade to O(n) lookups and
O(n^2) insertions.
We can scale better by using our hashmap.c implementation, which will
resize the bucket count as we grow. This does incur an extra memory cost
of 8 bytes per object, as hashmap stores the integer hash value for each
entry in its hashmap_entry struct (which we really don't care about
here, because we're just reusing the embedded object hash). But I think
the numbers below justify this (and our per-object memory cost is
already much higher).
I also looked at using khash, but it seemed to perform slightly worse
than hashmap at all sizes, and worse even than the existing code for
small sizes. It's also awkward to use here, because we want to look up a
"struct object_entry" from a "struct object_id", and it doesn't handle
mismatched keys as well. Making a mapping of object_id to object_entry
would be more natural, but that would require pulling the embedded oid
out of the object_entry or incurring an extra 32 bytes per object.
In a synthetic test creating as many cheap, tiny objects as possible
perl -e '
my $bits = shift;
my $nr = 2**$bits;
for (my $i = 0; $i < $nr; $i++) {
print "blob\n";
print "data 4\n";
print pack("N", $i);
}
' $bits | git fast-import
I got these results:
nr_objects master khash hashmap
2^20 0m4.317s 0m5.109s 0m3.890s
2^21 0m10.204s 0m9.702s 0m7.933s
2^22 0m27.159s 0m17.911s 0m16.751s
2^23 1m19.038s 0m35.080s 0m31.963s
2^24 4m18.766s 1m10.233s 1m6.793s
which points to hashmap as the winner. We didn't have any perf tests for
fast-export or fast-import, so I added one as a more real-world case.
It uses an export without blobs since that's significantly cheaper than
a full one, but still is an interesting case people might use (e.g., for
rewriting history). It will emphasize this change in some ways (as a
percentage we spend more time making objects and less shuffling blob
bytes around) and less in others (the total object count is lower).
Here are the results for linux.git:
Test HEAD^ HEAD
----------------------------------------------------------------------------
9300.1: export (no-blobs) 67.64(66.96+0.67) 67.81(67.06+0.75) +0.3%
9300.2: import (no-blobs) 284.04(283.34+0.69) 198.09(196.01+0.92) -30.3%
It only has ~5.2M commits and trees, so this is a larger effect than I
expected (the 2^23 case above only improved by 50s or so, but here we
gained almost 90s). This is probably due to actually performing more
object lookups in a real import with trees and commits, as opposed to
just dumping a bunch of blobs into a pack.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag to the test setup suite
in order to toggle writing Bloom filters when running any of the git tests.
If set to true, we will compute and write Bloom filters every time a test
calls `git commit-graph write`, as if the `--changed-paths` option was
passed in.
The test suite passes when GIT_TEST_COMMIT_GRAPH and
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS are enabled.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests exercises writing commit graph with Bloom filters
and exercises 'git log -- path' with all the applicable
options. They check that the output is the same with and
without Bloom filters, confirm Bloom filters were used by
checking if trace2 statistics were logged correctly.
Also confirms cases where Bloom filters are not used:
1. Multiple path specs,
2. --walk-reflogs (see patch titled 'revision.c: use Bloom filters...'
for details,
3. If the latest commit graph does not have Bloom filters
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add trace2 statistics around Bloom filter usage and behavior
for 'git log -- path' commands that are hoping to benefit from
the presence of computed changed paths Bloom filters.
These statistics are great for performance analysis work and
for formal testing, which we will see in the commit following
this one.
Helped-by: Derrick Stolee <dstolee@microsoft.com
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Revision walk will now use Bloom filters for commits to speed up
revision walks for a particular path (for computing history for
that path), if they are present in the commit-graph file.
We load the Bloom filters during the prepare_revision_walk step,
currently only when dealing with a single pathspec. Extending
it to work with multiple pathspecs can be explored and built on
top of this series in the future.
While comparing trees in rev_compare_trees(), if the Bloom filter
says that the file is not different between the two trees, we don't
need to compute the expensive diff. This is where we get our
performance gains. The other response of the Bloom filter is '`:maybe',
in which case we fall back to the full diff calculation to determine
if the path was changed in the commit.
We do not try to use Bloom filters when the '--walk-reflogs' option
is specified. The '--walk-reflogs' option does not walk the commit
ancestry chain like the rest of the options. Incorporating the
performance gains when walking reflog entries would add more
complexity, and can be explored in a later series.
Performance Gains:
We tested the performance of `git log -- <path>` on the git repo, the linux
and some internal large repos, with a variety of paths of varying depths.
On the git and linux repos:
- we observed a 2x to 5x speed up.
On a large internal repo with files seated 6-10 levels deep in the tree:
- we observed 10x to 20x speed ups, with some paths going up to 28 times
faster.
Helped-by: Derrick Stolee <dstolee@microsoft.com
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add --changed-paths option to git commit-graph write. This option will
allow users to compute information about the paths that have changed
between a commit and its first parent, and write it into the commit graph
file. If the option is passed to the write subcommand we set the
COMMIT_GRAPH_WRITE_BLOOM_FILTERS flag and pass it down to the
commit-graph logic.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add logic to
a) parse Bloom filter information from the commit graph file and,
b) re-use existing Bloom filters.
See Documentation/technical/commit-graph-format for the format in which
the Bloom filter information is written to the commit graph file.
To read Bloom filter for a given commit with lexicographic position
'i' we need to:
1. Read BIDX[i] which essentially gives us the starting index in BDAT for
filter of commit i+1. It is essentially the index past the end
of the filter of commit i. It is called end_index in the code.
2. For i>0, read BIDX[i-1] which will give us the starting index in BDAT
for filter of commit i. It is called the start_index in the code.
For the first commit, where i = 0, Bloom filter data starts at the
beginning, just past the header in the BDAT chunk. Hence, start_index
will be 0.
3. The length of the filter will be end_index - start_index, because
BIDX[i] gives the cumulative 8-byte words including the ith
commit's filter.
We toggle whether Bloom filters should be recomputed based on the
compute_if_not_present flag.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the technical documentation for commit-graph-format with
the formats for the Bloom filter index (BIDX) and Bloom filter
data (BDAT) chunks. Write the computed Bloom filters information
to the commit graph file using this format.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since f269048754 (fetch: opportunistically update tracking refs,
2013-05-11), the underlying `git fetch` in `git pull <remote> <branch>`
updates the configured remote-tracking branch for <branch>.
However, an example in the 'Examples' section of the `git pull`
documentation still states that this is not the case.
Correct the description of this example.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for the `<refspec>` parameter in the `git fetch`
documentation refers to the section "CONFIGURED REMOTE-TRACKING
BRANCHES" in this same documentation page.
In the `git pull` documentation, let's also refer specifically to this
section instead of just linking to the `git fetch` documentation.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 94a88e2524 (Makefile: avoid running curl-config multiple times,
2020-03-26) put the call to $(CURL_CONFIG) into a "simple" variable
which is expanded immediately, rather than expanding it each time it's
needed. However, that also means that we expand it whenever the Makefile
is parsed, whether we need it or not.
This is wasteful, but also breaks the ci/test-documentation.sh job, as
it does not have curl at all and complains about the extra messages to
stderr. An easy way to see it is just:
$ make CURL_CONFIG=does-not-work check-builtins
make: does-not-work: Command not found
make: does-not-work: Command not found
GIT_VERSION = 2.26.0.108.gb3f3f45f29
make: does-not-work: Command not found
make: does-not-work: Command not found
./check-builtins.sh
We can get the best of both worlds if we're willing to accept a little
Makefile hackery. Courtesy of the article at:
http://make.mad-scientist.net/deferred-simple-variable-expansion/
this patch uses a lazily-evaluated recursive variable which replaces its
contents with an immediately assigned simple one on first use.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In read_populate_opts(), we call read_oneliner() _after_ calling
strbuf_release(). This means that `buf` is leaked at the end of the
function.
Always clean up the strbuf by going to `done_rebase_i` whether or not
we return an error.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge with --gpg-sign option, and clarify that --no-gpg-sign also
override earlier --gpg-sign.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
{cherry-pick,revert} --edit hasn't honoured --no-gpg-sign yet.
Pass this option down to git-commit to honour it.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are 3 callers to promisor_remote_get_direct() that first check if
the number of objects to be fetched is equal to 0. Fold that check into
promisor_remote_get_direct(), and in doing so, be explicit as to what
promisor_remote_get_direct() does if oid_nr is 0 (it returns 0, success,
immediately).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have an environment variable `jobs=16` defined in our CI system, and
this environment makes our build job failed with the following message:
error: pathspec '16' did not match any file(s) known to git
The pathspec '16' for Git command is from the environment variable
"jobs".
This is because "git-submodule" command is implemented in shell script,
and environment variables may change its behavior. Set values for
uninitialized variables, such as "jobs" and "recommend_shallow" will
fix this issue.
Helped-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Li Xuejiang <xuejiang@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-update-ref(1) command can only handle queueing transactions
right now via its "--stdin" parameter, but there is no way for users to
handle the transaction itself in a more explicit way. E.g. in a
replicated scenario, one may imagine a coordinator that spawns
git-update-ref(1) for multiple repositories and only if all agree that
an update is possible will the coordinator send a commit. Such a
transactional session could look like
> start
< start: ok
> update refs/heads/master $OLD $NEW
> prepare
< prepare: ok
# All nodes have returned "ok"
> commit
< commit: ok
or
> start
< start: ok
> create refs/heads/master $OLD $NEW
> prepare
< fatal: cannot lock ref 'refs/heads/master': reference already exists
# On all other nodes:
> abort
< abort: ok
In order to allow for such transactional sessions, this commit
introduces four new commands for git-update-ref(1), which matches those
we have internally already with the exception of "start":
- start: start a new transaction
- prepare: prepare the transaction, that is try to lock all
references and verify their current value matches the
expected one
- commit: explicitly commit a session, that is update references to
match their new expected state
- abort: abort a session and roll back all changes
By design, git-update-ref(1) will commit as soon as standard input is
being closed. While fine in a non-transactional world, it is definitely
unexpected in a transactional world. Because of this, as soon as any of
the new transactional commands is used, the default will change to
aborting without an explicit "commit". To avoid a race between queueing
updates and the first "prepare" that starts a transaction, the "start"
command has been added to start an explicit transaction.
Add some tests to exercise this new functionality.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-update-ref(1) supports a `--stdin` mode that allows it to read
all reference updates from standard input. This is mainly used to allow
for atomic reference updates that are all or nothing, so that either all
references will get updated or none.
Currently, git-update-ref(1) reads all commands as a single block of up
to 1000 characters and only starts processing after stdin gets closed.
This is less flexible than one might wish for, as it doesn't really
allow for longer-lived transactions and doesn't allow any verification
without committing everything. E.g. one may imagine the following
exchange:
> start
< start: ok
> update refs/heads/master $NEWOID1 $OLDOID1
> update refs/heads/branch $NEWOID2 $OLDOID2
> prepare
< prepare: ok
> commit
< commit: ok
When reading all input as a whole block, the above interactive protocol
is obviously impossible to achieve. But by converting the command to
read commands linewise, we can make it more interactive than before.
Obviously, the linewise interface is only a first step in making
git-update-ref(1) work in a more transaction-oriented way. Missing is
most importantly support for transactional commands that manage the
current transaction.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the actual logic to update the transaction is handled in
`update_refs_stdin()`, the transaction itself is started and committed
in `cmd_update_ref()` itself. This makes it hard to handle transaction
abortion and commits as part of `update_refs_stdin()` itself, which is
required in order to introduce transaction handling features to `git
update-refs --stdin`.
Refactor the code to move all transaction handling into
`update_refs_stdin()` to prepare for transaction handling features.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently pass both an `strbuf` containing the current command line
as well as the `next` pointer pointing to the first argument to
commands. This is both confusing and makes code more intertwined.
Convert this to use a simple pointer as well as a pointer pointing to
the end of the input as a preparatory step to line-wise reading of
stdin.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `parse_refname` function accepts a `struct strbuf *input` argument
that isn't used at all. As we're about to convert commands to not use a
strbuf anymore but instead an end pointer, let's drop this argument now
to make the converting commit easier to review.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently manually wire up all commands known to `git-update-ref
--stdin`, making it harder than necessary to preprocess arguments after
the command is determined. To make this more extensible, let's refactor
the code to use an array of known commands instead. While this doesn't
add a lot of value now, it is a preparatory step to implement line-wise
reading of commands.
As we're going to introduce commands without trailing spaces, this
commit also moves whitespace parsing into the respective commands.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time we ran 'make --jobs=2 ...' to build Git, its
documentation, or to apply Coccinelle semantic patches. Then commit
eaa62291ff (ci: inherit --jobs via MAKEFLAGS in run-build-and-tests,
2019-01-27) came along, and started using the MAKEFLAGS environment
variable to centralize setting the number of parallel jobs in
'ci/libs.sh'. Alas, it forgot to update 'ci/run-linux32-docker.sh' to
make MAKEFLAGS available inside the Docker container running the 32
bit Linux job, and, consequently, since then that job builds Git
sequentially, and it ignores any Makefile knobs that we might set in
MAKEFLAGS (though we don't set any for the 32 bit Linux job at the
moment).
So update the 'docker run' invocation in 'ci/run-linux32-docker.sh' to
make MAKEFLAGS available inside the Docker container as well. Set
CC=gcc for the 32 bit Linux job, because that's the compiler installed
in the 32 bit Linux Docker image that we use (Travis CI nowadays sets
CC=clang by default, but clang is not installed in this image).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The use of 'touch -m' to modify a file's mtime is slightly less
portable than using our own 'test-tool chmtime'. The important
thing is that these pack-files are ordered in a special way to
ensure the multi-pack-index selects some as the "newer" pack-files
when resolving duplicate objects.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph builtin has an --expire-time option that takes a
datetime using OPT_EXPIRY_DATE(). However, the implementation inside
expire_commit_graphs() was treating a non-zero value as a number of
seconds to subtract from "now".
Update t5323-split-commit-graph.sh to demonstrate the correct value
of the --expire-time option by actually creating a crud .graph file
with mtime earlier than the expire time. Instead of using a super-
early time (1980) we use an explicit, and recent, time. Using
test-tool chmtime to create two files on either end of an exact
second, we create a test that catches this failure no matter the
current time. Using a fixed date is more portable than trying to
format a relative date string into the --expiry-date input.
I noticed this when inspecting some Scalar repos that had an excess
number of commit-graph files. In Scalar, we were using this second
interpretation by using "--expire-time=3600" to mean "delete graphs
older than one hour ago" to avoid deleting a commit-graph that a
foreground process may be trying to load.
Also I noticed that the help text was copied from the --max-commits
option. Fix that help text.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As reported on the git mailing list, since git-2.25,
git add untracked-dir/
has been tab completing to
git add untracked-dir/./
The cause for this was that with commit b9670c1f5e (dir: fix checks on
common prefix directory, 2019-12-19),
git ls-files -o --directory untracked-dir/
(or the equivalent `git -C untracked-dir ls-files -o --directory`) began
reporting
untracked-dir/
instead of listing paths underneath that directory. It may also be
worth noting that the real command in question was
git -C untracked-dir ls-files -o --directory '*'
which is equivalent to
git ls-files -o --directory 'untracked-dir/*'
which behaves the same for the purposes of this issue (the '*' can match
the empty string), but becomes relevant for the proposed fix.
At first, based on the report, I decided to try to view this as a
regression and tried to find a way to recover the old behavior without
breaking other stuff, or at least breaking as little as possible.
However, in the end, I couldn't figure out a way to do it that wouldn't
just cause lots more problems than it solved. The old behavior was a
bug:
* Although older git would avoid cleaning anything with `git clean -f
.git`, it would wipe out everything under that direcotry with `git
clean -f .git/`. Despite the difference in command used, this is
relevant because the exact same change that fixed clean changed the
behavior of ls-files.
* Older git would report different results based solely on presence or
absence of a trailing slash for $SUBDIR in the command `git ls-files
-o --directory $SUBDIR`.
* Older git violated the documented behavior of not recursing into
directories that matched the pathspec when --directory was
specified.
* And, after all, commit b9670c1f5e (dir: fix checks on common prefix
directory, 2019-12-19) didn't overlook this issue; it explicitly
stated that the behavior of the command was being changed to bring
it inline with the docs.
(Also, if it helps, despite that commit being merged during the 2.25
series, this bug was not reported during the 2.25 cycle, nor even during
most of the 2.26 cycle -- it was reported a day before 2.26 was
released. So the impact of the change is at least somewhat small.)
Instead of relying on a bug of ls-files in reporting the wrong content,
change the invocation of ls-files used by git-completion to make it grab
paths one depth deeper. Do this by changing '$DIR/*' (match $DIR/ plus
0 or more characters) into '$DIR/?*' (match $DIR/ plus 1 or more
characters). Note that the '?' character should not be added when
trying to complete a filename (e.g. 'git ls-files -o --directory
"merge.c?*"' would not correctly return "merge.c" when such a file
exists), so we have to make sure to add the '?' character only in cases
where the path specified so far is a directory.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Traditionally, the expected calling convention for the dir.c API was:
fill_directory(&dir, ..., pathspec)
foreach entry in dir->entries:
if (dir_path_match(entry, pathspec))
process_or_display(entry)
This may have made sense once upon a time, because the fill_directory() call
could use cheap checks to avoid doing full pathspec matching, and an external
caller may have wanted to do other post-processing of the results anyway.
However:
* this structure makes it easy for users of the API to get it wrong
* this structure actually makes it harder to understand
fill_directory() and the functions it uses internally. It has
tripped me up several times while trying to fix bugs and
restructure things.
* relying on post-filtering was already found to produce wrong
results; pathspec matching had to be added internally for multiple
cases in order to get the right results (see commits 404ebceda0
(dir: also check directories for matching pathspecs, 2019-09-17)
and 89a1f4aaf7 (dir: if our pathspec might match files under a
dir, recurse into it, 2019-09-17))
* it's bad for performance: fill_directory() already has to do lots
of checks and knows the subset of cases where it still needs to do
more checks. Forcing external callers to do full pathspec
matching means they must re-check _every_ path.
So, add the pathspec matching within the fill_directory() internals, and
remove it from external callers.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
treat_directory() had a call to both do_match_pathspec() and
match_pathspec(). These calls have migrated through the code somewhat
since their introduction, but we don't actually need both. Replace the
two calls with one, and while at it, move the check earlier in order to
reduce the need for callers of fill_directory() to do post-filtering of
results.
The next patch will address post-filtering more forcefully and provide
more relevant history and context.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Handling DIR_KEEP_UNTRACKED_CONTENTS within treat_directory() instead of
as a post-processing step in read_directory():
* allows us to directly access and remove the relevant entries instead
of needing to calculate which ones need to be removed
* keeps the logic for directory handling in one location (and puts it
closer the the logic for stripping out extra ignored entries, which
seems logical).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir's read_directory_recursive() naturally operates recursively in order
to walk the directory tree. Treating of directories is sometimes weird
because there are so many different permutations about how to handle
directories. Some examples:
* 'git ls-files -o --directory' only needs to know that a directory
itself is untracked; it doesn't need to recurse into it to see what
is underneath.
* 'git status' needs to recurse into an untracked directory, but only
to determine whether or not it is empty. If there are no files
underneath, the directory itself will be omitted from the output.
If it is not empty, only the directory will be listed.
* 'git status --ignored' needs to recurse into untracked directories
and report all the ignored entries and then report the directory as
untracked -- UNLESS all the entries under the directory are
ignored, in which case we don't print any of the entries under the
directory and just report the directory itself as ignored. (Note
that although this forces us to walk all untracked files underneath
the directory as well, we strip them from the output, except for
users like 'git clean' who also set DIR_KEEP_TRACKED_CONTENTS.)
* For 'git clean', we may need to recurse into a directory that
doesn't match any specified pathspecs, if it's possible that there
is an entry underneath the directory that can match one of the
pathspecs. In such a case, we need to be careful to omit the
directory itself from the list of paths (see commit 404ebceda0
("dir: also check directories for matching pathspecs", 2019-09-17))
Part of the tension noted above is that the treatment of a directory can
change based on the files within it, and based on the various settings
in dir->flags. Trying to keep this in mind while reading over the code,
it is easy to think in terms of "treat_directory() tells us what to do
with a directory, and read_directory_recursive() is the thing that
recurses". Since we need to look into a directory to know how to treat
it, though, it is quite easy to decide to (also) recurse into the
directory from treat_directory() by adding a read_directory_recursive()
call. Adding such a call is actually fine, IF we make sure that
read_directory_recursive() does not also recurse into that same
directory.
Unfortunately, commit df5bcdf83a ("dir: recurse into untracked dirs
for ignored files", 2017-05-18), added exactly such a case to the code,
meaning we'd have two calls to read_directory_recursive() for an
untracked directory. So, if we had a file named
one/two/three/four/five/somefile.txt
and nothing in one/ was tracked, then 'git status --ignored' would
call read_directory_recursive() twice on the directory 'one/', and
each of those would call read_directory_recursive() twice on the
directory 'one/two/', and so on until read_directory_recursive() was
called 2^5 times for 'one/two/three/four/five/'.
Avoid calling read_directory_recursive() twice per level by moving a
lot of the special logic into treat_directory().
Since dir.c is somewhat complex, extra cruft built up around this over
time. While trying to unravel it, I noticed several instances where the
first call to read_directory_recursive() would return e.g.
path_untracked for some directory and a later one would return e.g.
path_none, despite the fact that the directory clearly should have been
considered untracked. The code happened to work due to the side-effect
from the first invocation of adding untracked entries to dir->entries;
this allowed it to get the correct output despite the supposed override
in return value by the later call.
I am somewhat concerned that there are still bugs and maybe even
testcases with the wrong expectation. I have tried to carefully
document treat_directory() since it becomes more complex after this
change (though much of this complexity came from elsewhere that probably
deserved better comments to begin with). However, much of my work felt
more like a game of whackamole while attempting to make the code match
the existing regression tests than an attempt to create an
implementation that matched some clear design. That seems wrong to me,
but the rules of existing behavior had so many special cases that I had
a hard time coming up with some overarching rules about what correct
behavior is for all cases, forcing me to hope that the regression tests
are correct and sufficient. Such a hope seems likely to be ill-founded,
given my experience with dir.c-related testcases in the last few months:
Examples where the documentation was hard to parse or even just wrong:
* 3aca58045f (git-clean.txt: do not claim we will delete files with
-n/--dry-run, 2019-09-17)
* 09487f2cba (clean: avoid removing untracked files in a nested git
repository, 2019-09-17)
* e86bbcf987 (clean: disambiguate the definition of -d, 2019-09-17)
Examples where testcases were declared wrong and changed:
* 09487f2cba (clean: avoid removing untracked files in a nested git
repository, 2019-09-17)
* e86bbcf987 (clean: disambiguate the definition of -d, 2019-09-17)
* a2b13367fe (Revert "dir.c: make 'git-status --ignored' work within
leading directories", 2019-12-10)
Examples where testcases were clearly inadequate:
* 502c386ff9 (t7300-clean: demonstrate deleting nested repo with an
ignored file breakage, 2019-08-25)
* 7541cc5302 (t7300: add testcases showing failure to clean specified
pathspecs, 2019-09-17)
* a5e916c745 (dir: fix off-by-one error in match_pathspec_item,
2019-09-17)
* 404ebceda0 (dir: also check directories for matching pathspecs,
2019-09-17)
* 09487f2cba (clean: avoid removing untracked files in a nested git
repository, 2019-09-17)
* e86bbcf987 (clean: disambiguate the definition of -d, 2019-09-17)
* 452efd11fb (t3011: demonstrate directory traversal failures,
2019-12-10)
* b9670c1f5e (dir: fix checks on common prefix directory, 2019-12-19)
Examples where "correct behavior" was unclear to everyone:
https://lore.kernel.org/git/20190905154735.29784-1-newren@gmail.com/
Other commits of note:
* 902b90cf42 (clean: fix theoretical path corruption, 2019-09-17)
However, on the positive side, it does make the code much faster. For
the following simple shell loop in an empty repository:
for depth in $(seq 10 25)
do
dirs=$(for i in $(seq 1 $depth) ; do printf 'dir/' ; done)
rm -rf dir
mkdir -p $dirs
>$dirs/untracked-file
/usr/bin/time --format="$depth: %e" git status --ignored >/dev/null
done
I saw the following timings, in seconds (note that the numbers are a
little noisy from run-to-run, but the trend is very clear with every
run):
10: 0.03
11: 0.05
12: 0.08
13: 0.19
14: 0.29
15: 0.50
16: 1.05
17: 2.11
18: 4.11
19: 8.60
20: 17.55
21: 33.87
22: 68.71
23: 140.05
24: 274.45
25: 551.15
For the above run, using strace I can look for the number of untracked
directories opened and can verify that it matches the expected
2^($depth+1)-2 (the sum of 2^1 + 2^2 + 2^3 + ... + 2^$depth).
After this fix, with strace I can verify that the number of untracked
directories that are opened drops to just $depth, and the timings all
drop to 0.00. In fact, it isn't until a depth of 190 nested directories
that it sometimes starts reporting a time of 0.01 seconds and doesn't
consistently report 0.01 seconds until there are 240 nested directories.
The previous code would have taken
17.55 * 2^220 / (60*60*24*365) = 9.4 * 10^59 YEARS
to have completed the 240 nested directories case. It's not often
that you get to speed something up by a factor of 3*10^69.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The logic in treat_directory() is handled by a multi-case
switch statement, but this switch is very asymmetrical, as
the first two cases are simple but the third is more
complicated than the rest of the method. In fact, the third
case includes a "break" statement that leads to the block
of code outside the switch statement. That is the only way
to reach that block, as the switch handles all possible
values from directory_exists_in_index();
Extract the switch statement into a series of "if" statements.
This simplifies the trivial cases, while clarifying how to
reach the "show_other_directories" case. This is particularly
important as the "show_other_directories" case will expand
in a later change.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Despite having contributed several fixes in this area, I have for months
(years?) assumed that the "exclude" variable was a directive; this
caused me to think of it as a different mode we operate in and left me
confused as I tried to build up a mental model around why we'd need such
a directive. I mostly tried to ignore it while focusing on the pieces I
was trying to understand.
Then I finally traced this variable all back to a call to is_excluded(),
meaning it was actually functioning as an adjective. In particular, it
was a checked property ("Does this path match a rule in .gitignore?"),
rather than a mode passed in from the caller. Change the variable name
to match the part of speech used by the function called to define it,
which will hopefully make these bits of code slightly clearer to the
next reader.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 16e2cfa909 ("read_directory(): further split treat_path()",
2010-01-08) split treat_one_path() out of treat_path(), because
treat_leading_path() would not have access to a dirent but wanted to
re-use as much of treat_path() as possible. Not re-using all of
treat_path() caused other bugs, as noted in commit b9670c1f5e ("dir:
fix checks on common prefix directory", 2019-12-19). Finally, in commit
ad6f2157f9 ("dir: restructure in a way to avoid passing around a
struct dirent", 2020-01-16), dirents were removed from treat_path() and
other functions entirely.
Since the only reason for splitting these functions was the lack of a
dirent -- which no longer applies to either function -- and since the
split caused problems in the past resulting in us not using
treat_one_path() separately anymore, just undo the split.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds seven new ls-files tests. While currently all seven test
pass, my earlier rounds of restructuring dir.c to replace an exponential
algorithm with a linear one passed all the tests in the testsuite but
failed six of these seven new tests. Add these tests to increase our
case coverage.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It turns out the t7063 has some testcases that even without using the
untracked cache cover situations that nothing else in the testsuite
handles. Checking the results of
git status --porcelain
both with and without the untracked cache, and comparing both against
our expected results helped uncover a critical bug in some dir.c
restructuring.
Unfortunately, it's not easy to run status and tell it to ignore the
untracked cache; the only knob we have is core.untrackedCache=false,
which is used to instruct git to *delete* the untracked cache (which
might also ignore the untracked cache when it operates, but that isn't
specified in the docs).
Create a simple helper that will create a clone of the index that is
missing the untracked cache bits, and use it to compare that the results
with the untracked cache match the results we get without the untracked
cache.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning with --single-branch, we implement git-fetch's usual
tag-following behavior, grabbing any tag objects that point to objects
we have locally.
When we're a partial clone, though, our has_object_file() check will
actually lazy-fetch each tag. That not only defeats the purpose of
--single-branch, but it does it incredibly slowly, potentially kicking
off a new fetch for each tag. This is even worse for a shallow clone,
which implies --single-branch, because even tags which are supersets of
each other will be fetched individually.
We can fix this by passing OBJECT_INFO_SKIP_FETCH_OBJECT to the call,
which is what git-fetch does in this case.
Likewise, let's include OBJECT_INFO_QUICK, as that's what git-fetch
does. The rationale is discussed in 5827a03545 (fetch: use "quick"
has_sha1_file for tag following, 2016-10-13), but here the tradeoff
would apply even more so because clone is very unlikely to be racing
with another process repacking our newly-created repository.
This may provide a very small speedup even in the non-partial case case,
as we'd avoid calling reprepare_packed_git() for each tag (though in
practice, we'd only have a single packfile, so that reprepare should be
quite cheap).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the config file we use when we build the user manual with
AsciiDoc. The comment at the top of this chunk that we're removing says
the following:
"unbreak" docbook-xsl v1.68 for manpages (sic!). v1.69 works with or
without this.
This comes from d19fbc3c17 ("Documentation: add git user's manual",
2007-01-07), where it looks like this conf file in general and this
snippet in particular was copy-pasted from asciidoc.conf.
This chunk is very similar to something we just got rid of for the
manpages, and because this appears to be aimed at v1.68 -- which we no
longer support for the manpages as of a few commits ago --, it's
tempting to get rid of this. That reveals an interesting aspect of
"works with or without this": it turns out it actually works /better/
without!
Dropping this makes us render code snippets and shell listings using
<screen> rather than <literallayout>, just like Asciidoctor does. In
user-manual.pdf, this puts the contents into dimmed-background,
easy-to-distinguish-from-the-surrounding-text boxes, as opposed to
white-background (transparent) boxes.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fast forwarding, `git --merge' should act the same whether
`rebase.abbreviateCommands' is set or not, but so far it was not the
case. This duplicates the tests ensuring that `--merge' works when fast
forwarding to check if it also works with abbreviated commands.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the sequencer is requested to abbreviate commands, it will replace
those that do not have a short form (eg. `noop') by a comment mark.
`noop' serves no purpose, except when fast-forwarding (ie. by running
`git rebase'). Removing it will break this command when
`rebase.abbreviateCommands' is set to true.
Teach todo_list_to_strbuf() to check if a command has an actual
short form, and to ignore it if not.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the given example, `commit` cannot be `NULL` (because this is the
loop condition: if it was `NULL`, the loop body would not be entered at
all). It took this developer a moment or two to see that this is
therefore dead code.
Let's remove it, to avoid puzzling future readers.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ths has been oid_array for some time, though the source only recently
moved from sha1-array.[ch] to oid-array.[ch]. In either case, we should
say "oid-array" here.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A comment refers to the "sha1s in the given sha1 array". But this became
an oid_array along with everywhere else in 910650d2f8 (Rename sha1_array
to oid_array, 2017-03-31). Plus there's an extra line of leftover
editing cruft we can drop.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our join_sha1_array_hex() function long ago switched to using an
oid_array; let's change the name to match.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This matches the actual data structure name, as well as the source file
that contains the code we're testing. The test scripts need updating to
use the new name, as well.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We renamed the actual data structure in 910650d2f8 (Rename sha1_array to
oid_array, 2017-03-31), but the file is still called sha1-array. Besides
being slightly confusing, it makes it more annoying to grep for leftover
occurrences of "sha1" in various files, because the header is included
in so many places.
Let's complete the transition by renaming the source and header files
(and fixing up a few comment references).
I kept the "-" in the name, as that seems to be our style; cf.
fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10).
We also have oidmap.h and oidset.h without any punctuation, but those
are "struct oidmap" and "struct oidset" in the code. We _could_ make
this "oidarray" to match, but somehow it looks uglier to me because of
the length of "array" (plus it would be a very invasive patch for little
gain).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit started using size_t for our allocations. There are
some iterations that use int or unsigned, though. These aren't dangerous
with respect to memory, but they could produce incorrect results.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The oid_array object uses an "int" to store the number of items and the
allocated size. It's rather unlikely for somebody to have more than 2^31
objects in a repository (the sha1's alone would be 40GB!), but if they
do, we'd overflow our alloc variable.
You can reproduce this case with something like:
git init repo
cd repo
# make a pack with 2^24 objects
perl -e '
my $nr = 2**24;
for (my $i = 0; $i < $nr; $i++) {
print "blob\n";
print "data 4\n";
print pack("N", $i);
}
' | git fast-import
# now make 256 copies of it; most of these objects will be duplicates,
# but oid_array doesn't de-dup until all values are read and it can
# sort the result.
cd .git/objects/pack/
pack=$(echo *.pack)
idx=$(echo *.idx)
for i in $(seq 0 255); do
# no need to waste disk space
ln "$pack" "pack-extra-$i.pack"
ln "$idx" "pack-extra-$i.idx"
done
# and now force an oid_array to store all of it
git cat-file --batch-all-objects --batch-check
which results in:
fatal: size_t overflow: 32 * 18446744071562067968
So the good news is that st_mult() sees the problem (the large number is
because our int wraps negative, and then that gets cast to a size_t),
doing the job it was meant to: bailing in crazy situations rather than
causing an undersized buffer.
But we should avoid hitting this case at all, and instead limit
ourselves based on what malloc() is willing to give us. We can easily do
that by switching to size_t.
The cat-file process above made it to ~120GB virtual set size before the
integer overflow (our internal hash storage is 32-bytes now in
preparation for sha256, so we'd expect ~128GB total needed, plus
potentially more to copy from one realloc'd block to another)). After
this patch (and about 130GB of RAM+swap), it does eventually read in the
whole set. No test for obvious reasons.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git is an enormously flexible and powerful piece of software. However,
it can be intimidating for many users and there are a set of common
questions that users often ask. While we already have some new user
documentation, it's worth adding a FAQ to address common questions that
users often have. Even though some of this is addressed elsewhere in
the documentation, experience has shown that it is difficult for users
to find, so a centralized location is helpful.
Add such a FAQ and fill it with some common questions and answers.
While there are few entries now, we can expand it in the future to cover
more things as we find new questions that users have. Let's also add
section markers so that people answering questions can directly link
users to the proper answer.
The FAQ also addresses common configuration questions that apply not
only to Git as an independent piece of software but also the ecosystem
of CI tools and hosting providers that people use, since these are the
source of common questions. An attempt has been made to avoid
mentioning any particular provider or tool, but to nevertheless cover
common configurations that apply to a wide variety of such tools.
Note that the long lines for certain questions are required, since
Asciidoctor does not permit broken lines there.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the strbuf interface already provides functions to read a line
into it that completely replaces its current contents, we do not have an
interface that allows for appending lines without discarding current
contents.
Add a new function `strbuf_appendwholeline` that reads a line including
its terminating character into a strbuf non-destructively. This is a
preparatory step for git-update-ref(1) reading standard input line-wise
instead of as a block.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description for the "verify" command is lacking a single word "is",
which this commit corrects.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cleaning up a transaction that has no updates queued, then the
transaction's backend data will not have been allocated. We correctly
handle this for the packed backend, where the cleanup function checks
whether the backend data has been allocated at all -- if not, then there
is nothing to clean up. For the files backend we do not check this and
as a result will hit a segfault due to dereferencing a `NULL` pointer
when cleaning up such a transaction.
Fix the issue by checking whether `backend_data` is set in the files
backend, too.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git commit-graph write --changed-paths', we sort the
commits by pack-order to save time when computing the changed-paths
bloom filters. This does not help when finding the commits via the
'--reachable' flag.
If not using pack-order, then sort by generation number before
examining the diff. Commits with similar generation are more likely
to have many trees in common, making the diff faster.
On the Linux kernel repository, this change reduced the computation
time for 'git commit-graph write --reachable --changed-paths' from
3m00s to 1m37s.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Looking at the diff of commit objects in pack order is much faster than
in sha1 order, as it gives locality to the access of tree deltas
(whereas sha1 order is effectively random). Unfortunately the
commit-graph code sorts the commits (several times, sometimes as an oid
and sometimes a pointer-to-commit), and we ultimately traverse in sha1
order.
Instead, let's remember the position at which we see each commit, and
traverse in that order when looking at bloom filters. This drops my time
for "git commit-graph write --changed-paths" in linux.git from ~4
minutes to ~1.5 minutes.
Probably the "--reachable" code path would want something similar.
Or alternatively, we could use a different data structure (either a
hash, or maybe even just a bit in "struct commit") to keep track of
which oids we've seen, etc instead of sorting. And then we could keep
the original order.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new COMMIT_GRAPH_WRITE_CHANGED_PATHS flag that makes Git compute
Bloom filters for the paths that changed between a commit and it's
first parent, for each commit in the commit-graph. This computation
is done on a commit-by-commit basis.
We will write these Bloom filters to the commit-graph file, to store
this data on disk, in the next change in this series.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When computing the changed-paths bloom filters for the commit-graph,
we limit the size of the filter by restricting the number of paths
in the diff. Instead of computing a large diff and then ignoring the
result, it is better to halt the diff computation early.
Create a new "max_changes" option in struct diff_options. If non-zero,
then halt the diff computation after discovering strictly more changed
paths. This includes paths corresponding to trees that change.
Use this max_changes option in the bloom filter calculations. This
reduces the time taken to compute the filters for the Linux kernel
repo from 2m50s to 2m35s. On a large internal repository with ~500
commits that perform tree-wide changes, the time reduced from
6m15s to 3m48s.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the core implementation for computing Bloom filters for
the paths changed between a commit and it's first parent.
We fill the Bloom filters as (const char *data, int len) pairs
as `struct bloom_filters" within a commit slab.
Filters for commits with no changes and more than 512 changes,
is represented with a filter of length zero. There is no gain
in distinguishing between a computed filter of length zero for
a commit with no changes, and an uncomputed filter for new commits
or for commits with more than 512 changes. The effect on
`git log -- path` is the same in both cases. We will fall back to
the normal diffing algorithm when we can't benefit from the
existence of Bloom filters.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce the constructs for Bloom filters, Bloom filter keys
and Bloom filter settings.
For details on what Bloom filters are and how they work, refer
to Dr. Derrick Stolee's blog post [1]. It provides a concise
explanation of the adoption of Bloom filters as described in
[2] and [3].
Implementation specifics:
1. We currently use 7 and 10 for the number of hashes and the
size of each entry respectively. They served as great starting
values, the mathematical details behind this choice are
described in [1] and [4]. The implementation, while not
completely open to it at the moment, is flexible enough to allow
for tweaking these settings in the future.
Note: The performance gains we have observed with these values
are significant enough that we did not need to tweak these
settings. The performance numbers are included in the cover letter
of this series and in the commit message of the subsequent commit
where we use Bloom filters to speed up `git log -- path`.
2. As described in [1] and [3], we do not need 7 independent hashing
functions. We use the Murmur3 hashing scheme, seed it twice and
then combine those to procure an arbitrary number of hash values.
3. The filters will be sized according to the number of changes in
each commit, in multiples of 8 bit words.
[1] Derrick Stolee
"Supercharging the Git Commit Graph IV: Bloom Filters"
https://devblogs.microsoft.com/devops/super-charging-the-git-commit-graph-iv-Bloom-filters/
[2] Flavio Bonomi, Michael Mitzenmacher, Rina Panigrahy, Sushil Singh, George Varghese
"An Improved Construction for Counting Bloom Filters"
http://theory.stanford.edu/~rinap/papers/esa2006b.pdfhttps://doi.org/10.1007/11841036_61
[3] Peter C. Dillinger and Panagiotis Manolios
"Bloom Filters in Probabilistic Verification"
http://www.ccs.neu.edu/home/pete/pub/Bloom-filters-verification.pdfhttps://doi.org/10.1007/978-3-540-30494-4_26
[4] Thomas Mueller Graf, Daniel Lemire
"Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters"
https://arxiv.org/abs/1912.08258
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a minor cleanup to make it easier to change
the number of chunks being written to the commit
graph.
Reviewed-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With 50033772d5 ("connected: verify promisor-ness of partial clone",
2020-01-30), the fast path (checking promisor packs) in
check_connected() now passes a subset of the slow path (rev-list) - if
all objects to be checked are found in promisor packs, both the fast
path and the slow path will pass; otherwise, the fast path will
definitely not pass. This means that we can always attempt the fast path
whenever we need to do the slow path.
The fast path is currently guarded by a flag; therefore, remove that
flag. Also, make the fast path fallback to the slow path - if the fast
path fails, the failing OID and all remaining OIDs will be passed to
rev-list.
The main user-visible benefit is the performance of fetch from a partial
clone - specifically, the speedup of the connectivity check done before
the fetch. In particular, a no-op fetch into a partial clone on my
computer was sped up from 7 seconds to 0.01 seconds. This is a
complement to the work in 2df1aa239c ("fetch: forgo full
connectivity check if --filter", 2020-01-30), which is the child of the
aforementioned 50033772d5. In that commit, the connectivity check
*after* the fetch was sped up.
The addition of the fast path might cause performance reductions in
these cases:
- If a partial clone or a fetch into a partial clone fails, Git will
fruitlessly run rev-list (it is expected that everything fetched
would go into promisor packs, so if that didn't happen, it is most
likely that rev-list will fail too).
- Any connectivity checks done by receive-pack, in the (in my opinion,
unlikely) event that a partial clone serves receive-pack.
I think that these cases are rare enough, and the performance reduction
in this case minor enough (additional object DB access), that the
benefit of avoiding a flag outweighs these.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'pack.useSparse' configuration variable now defaults to 'true',
enabling an optimization that has been experimental since Git 2.21.
* ds/default-pack-use-sparse-to-true:
pack-objects: flip the use of GIT_TEST_PACK_SPARSE
config: set pack.useSparse=true by default
Several of the previous commits have been bumping the minimum supported
version of docbook-xsl and dropping various workarounds. Most recently,
we made the minimum be 1.73.0.
In INSTALL, we claim that with 1.73, one needs a certain patch in
contrib/patches/. There is no such patch. It was added in 2ec39edad9
("INSTALL: add warning on docbook-xsl 1.72 and 1.73", 2007-08-03) and
dropped in 9721ac9010 ("contrib: remove continuous/ and patches/",
2013-06-03).
Rather than resurrecting version 1.73 and the patch and testing them,
just raise our minimum supported docbook-xsl version to 1.74.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After an earlier commit, we only include manpage-base.xsl from a single
file, manpage-normal.xsl. Fold the former into the latter.
We only ever needed the "base, normal and non-normal" construct to
support a single non-normal case, namely to work around issues with
docbook-xsl 1.72 handling backslashes and dots. If we ever need
something like this again, we can re-introduce manpage-base.xsl and
friends. Whatever issue we'd be trying to work around, it probably
wouldn't involve dots and backslashes like this, anyway.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to assign git.docbook.backslash one of two different values --
one "normal" and one for working around a problem with docbook-xsl 1.72.
After the previous commit, we don't support that version anymore and
always use the "normal" value, a literal backslash.
Just explicitly use a backslash instead of using git.docbook.backslash.
The next commit will drop the definition of git.docbook.backslash
entirely.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Drop the DOCBOOK_XSL_172 config knob, which was needed with docbook-xsl
1.72 (but neither 1.71 nor 1.73). Version 1.73.0 is more than twelve
years old.
Together with the last few commits, we are now at a point where we don't
have any Makefile knobs to cater to old/broken versions of docbook-xsl.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
docbook-xsl 1.72.0 is thirteen years old. Drop the ASCIIDOC_ROFF knob
which was needed to support 1.68.1 - 1.71.1. The next commit will
increase the required/assumed version further.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Drop the DOCBOOK_SUPPRESS_SP mechanism, which needs to be used with
docbook-xsl versions 1.69.1 through 1.71.0.
We probably broke this for Asciidoctor builds in f6461b82b9
("Documentation: fix build with Asciidoctor 2", 2019-09-15). That is, we
should/could fix this similar to 55aca515eb ("manpage-bold-literal.xsl:
match for namespaced "d:literal" in template", 2019-10-31). But rather
than digging out such an old version of docbook-xsl to test that, let's
just use this as an excuse for dropping this decade-old workaround.
DOCBOOK_SUPPRESS_SP was not needed with docbook-xsl 1.69.0 and older.
Maybe such old versions still work fine on our docs, or maybe not. Let's
just refer to everything before 1.71.1 as "not supported". The next
commit will increase the required/assumed version further.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code path in packetize() for reading stdin needs to handle NUL
bytes, so we can't rely on shell variables. However, the current code
takes a whopping 4 processes and uses a temporary file. We can do this
much more simply and efficiently by using a single perl invocation (and
we already rely on perl in the matching depacketize() function).
We'll keep the non-stdin code path as it is, since that uses zero extra
processes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The construct has been in POSIX for the past 10+ years, and we have
used in t9xxx (subversion) series of the tests, so we know it is at
portable across systems that people have run those tests, which is
almost everything we'd care about.
Let's loosen the rule; luckily, the check-non-portable-shell script
does not have any rule to find its use, so the only change needed is
a removal of one paragraph from the documentation.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Over time, we added the support to our test framework to make it
easy to leave a test early with failure, but it was not clearly
documented in t/README to help developers writing new tests.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fetch options --deepen, --negotiation-tip, --server-option,
--shallow-exclude, and --shallow-since are documented for git pull as
well, but are not actually accepted by that command. Pass them on to
make the code match its documentation.
Reported-by: 天几 <muzimuzhi@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git pull' implicitly passes --update-head-ok to 'git fetch', but
doesn't itself accept that option from users. That makes sense, as it
wouldn't work without the possibility to update HEAD. Remove the option
from the command's documentation to match its actual behavior.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codebase uses tabs for indentation. Convert an erroneous space
indent into a tab indent.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When verifying a midx index with 0 objects, the
m->num_objects - 1
underflows and wraps around to 4294967295.
Fix this both by checking that the midx contains at least one oid,
and also that we don't write any midx when there is no packfiles.
Update the tests to check that `git multi-pack-index write` does
not write an midx when there is no objects, and another to check
that `git multi-pack-index verify` warns when it verifies an midx with no
objects. For this last test, use t5319/no-objects.midx which was
generated by an older version of git.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When opt_rebase is true, we still first check if we can fast-forward.
If the branch is fast-forwardable, then we can avoid the rebase and just
use merge to do the fast-forward logic. However, when commit a6d7eb2c7a
("pull: optionally rebase submodules (remote submodule changes only)",
2017-06-23) added the ability to rebase submodules it accidentally
caused us to run BOTH a merge and a rebase. Add a flag to avoid doing
both.
This was found when a user had both pull.rebase and rebase.autosquash
set to true. In such a case, the running of both merge and rebase would
cause ORIG_HEAD to be updated twice (and match HEAD at the end instead
of the commit before the rebase started), against expectation.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We add the result of "curl-config --libs" when linking curl programs,
but we never bother calling "curl-config --cflags". Presumably nobody
noticed because:
- a system libcurl installed into /usr/include/curl wouldn't need any
flags ("/usr/include" is already in the search path, and the
#include lines all look <curl/curl.h>, etc).
- using CURLDIR sets up both the includes and the library path
However, if you prefer CURL_CONFIG to CURLDIR, something simple like:
make CURL_CONFIG=/path/to/curl-config
doesn't work. We'd link against the libcurl specified by that program,
but not find its header files when compiling.
Let's invoke "curl-config --cflags" similar to the way we do for
"--libs". Note that we'll feed the result into BASIC_CFLAGS. The rest of
the Makefile doesn't distinguish which files need curl support during
compilation and which do not. That should be OK, though. At most this
should be adding a "-I" directive, and this is how CURLDIR already
behaves. And since we follow the immediate-variable pattern from
CURL_LDFLAGS, we won't accidentally invoke curl-config once per
compilation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user hasn't set the CURL_LDFLAGS Makefile variable, we invoke
curl-config like this:
CURL_LIBCURL += $(shell $(CURL_CONFIG) --libs)
Because the shell function is run when the value is expanded, we invoke
curl-config each time we need to link something (which generally ends up
being four times for a full build).
Instead, let's use an immediate Makefile variable, which only needs
expanding once. We can't combine that with the existing "+=", but since
we only do this when CURL_LDFLAGS is undefined, we can just set that
variable.
That also allows us to simplify our conditional a bit, since both sides
will then put the result into CURL_LIBCURL. While we're touching it,
let's fix the indentation to match the nearby code (we're inside an
outer conditional, so everything else is indented one level).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 645c432d61 (pack-objects: use reachability bitmap index when
generating non-stdout pack, 2016-09-10) added two timing tests for
packing to an on-disk file, both with and without bitmaps. However, the
non-bitmap one isn't interesting to have as part of p5310's regression
suite. It _could_ be used as a baseline to show off the improvement in
the bitmap case, but:
- the point of the t/perf suite is to find performance regressions,
and it won't help with that. We don't compare the numbers between
two tests (which the perf suite has no idea are even related), and
any change in its numbers would have nothing to do with bitmaps.
- it did show off the improvement in the commit message of 645c432d61,
but it wasn't even necessary there. The bitmap case already shows an
improvement (because before the patch, it behaved the same as the
non-bitmap case), and the perf suite is even able to show the
difference between the before and after measurements.
On top of that, it's one of the most expensive tests in the suite,
clocking in around 60s for linux.git on my machine (as compared to 16s
for the bitmapped version). And by default when using "./run", we'd run
it three times!
So let's just drop it. It's not useful and is adding minutes to perf
runs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When processing the arguments list for a v2 ls-refs or fetch command, we
loop like this:
while (packet_reader_read(request) != PACKET_READ_FLUSH) {
const char *arg = request->line;
...handle arg...
}
to read and handle packets until we see a flush. The hidden assumption
here is that anything except PACKET_READ_FLUSH will give us valid packet
data to read. But that's not true; PACKET_READ_DELIM or PACKET_READ_EOF
will leave packet->line as NULL, and we'll segfault trying to look at
it.
Instead, we should follow the more careful model demonstrated on the
client side (e.g., in process_capabilities_v2): keep looping as long
as we get normal packets, and then make sure that we broke out of the
loop due to a real flush. That fixes the segfault and correctly
diagnoses any unexpected input from the client.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The packetize() function takes its input on stdin, and requires 4
separate sub-processes to format a simple string. We can do much better
by getting the length via the shell's "${#packet}" construct. The one
caveat is that the shell can't put a NUL into a variable, so we'll have
to continue to provide the stdin form for a few calls.
There are a few other cleanups here in the touched code:
- the stdin form of packetize() had an extra stray "%s" when printing
the packet
- the converted calls in t5562 can be made simpler by redirecting
output as a block, rather than repeated appending
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If commands like merge or rebase materialize files as part of their work,
or a previous sparse-checkout command failed to update individual files
due to dirty changes, users may want a command to simply 'reapply' the
sparsity rules. Provide one.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Setting and clearing of the SKIP_WORKTREE bit is not only done when
users run 'sparse-checkout'; other commands such as 'checkout' also run
through unpack_trees() which has logic for handling this special bit.
As such, we need to consider how they handle special cases. A couple
comparison points should help explain the rationale for changing how
unpack_trees() handles these bits:
Ignoring sparse checkouts for a moment, if you are switching
branches and have dirty changes, it is only considered an error that
will prevent the branch switching from being successful if the dirty
file happens to be one of the paths with different contents.
SKIP_WORKTREE has always been considered advisory; for example, if
rebase or merge need or even want to materialize a path as part of
their work, they have always been allowed to do so regardless of the
SKIP_WORKTREE setting. This has been used for unmerged paths, but
it was often used for paths it wasn't needed just because it made
the code simpler. It was a best-effort consideration, and when it
materialized paths contrary to the SKIP_WORKTREE setting, it was
never required to even print a warning message.
In the past if you trying to run e.g. 'git checkout' and:
1) you had a path that was materialized and had some dirty changes
2) the path was listed in $GITDIR/info/sparse-checkout
3) this path did not different between the current and target branches
then despite the comparison points above, the inability to set
SKIP_WORKTREE was treated as a *hard* error that would abort the
checkout operation. This is completely inconsistent with how
SKIP_WORKTREE is handled elsewhere, and rather annoying for users as
leaving the paths materialized in the working copy (with a simple
warning) should present no problem at all.
Downgrade any errors from inability to toggle the SKIP_WORKTREE bit to a
warning and allow the operations to continue.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When sparse-checkout runs to update the list of sparsity patterns, it
gives warnings if it can't remove paths from the working tree because
those files have dirty changes. Add a similar warning for unmerged
paths as well.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The messages for problems with sparse paths are phrased as errors that
cause the operation to abort, even though we are not making the
operation abort. Reword the messages to make sense in their new
context.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
display_error_msgs() is never called to show messages of both ERROR_*
and WARNING_* types at the same time; it is instead called multiple
times, separately for each type. Since we want to display these types
differently, make two slightly different versions of this function.
A subsequent commit will further modify unpack_trees() and how it calls
the new display_warning_msgs().
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to treat issues with setting the SKIP_WORKTREE bit as a warning
rather than an error; rename the enum values to reflect this intent as
a simple step towards that goal.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A minor change, but we want to convert the sparse messages to warnings
and this allows us to group warnings and errors.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
setup_unpack_trees_porcelain() provides much improved error/warning
messages; instead of a message that assumes that there is only one path
with a given problem despite being used by code that intentionally is
grouping and showing errors together, it uses a message designed to be
used with groups of paths. For example, this transforms
error: Entry ' folder1/a
folder2/a
' not uptodate. Cannot update sparse checkout.
into
error: Cannot update sparse checkout: the following entries are not up to date:
folder1/a
folder2/a
In the past the suboptimal messages were never actually triggered
because we would error out if the working directory wasn't clean before
we even called unpack_trees(). The previous commit changed that,
though, so let's use the better error messages.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the equivalent of 'git read-tree -mu HEAD' in the sparse-checkout
codepaths for setting the SKIP_WORKTREE bits and instead use the new
update_sparsity() function.
Note that when an issue is hit, the error message splits 'error' and
'Cannot update sparse checkout' on separate lines. For now, we use two
greps to find both pieces of the error message but subsequent commits
will clean up the messages reported to the user.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, the only way to update the SKIP_WORKTREE bits for various
paths was invoking `git read-tree -mu HEAD` or calling the same code
that this codepath invoked. This however had a number of problems if
the index or working directory were not clean. First, let's consider
the case:
Flipping SKIP_WORKTREE -> !SKIP_WORKTREE (materializing files)
If the working tree was clean this was fine, but if there were files or
directories or symlinks or whatever already present at the given path
then the operation would abort with an error. Let's label this case
for later discussion:
A) There is an untracked path in the way
Now let's consider the opposite case:
Flipping !SKIP_WORKTREE -> SKIP_WORKTREE (removing files)
If the index and working tree was clean this was fine, but if there were
any unclean paths we would run into problems. There are three different
cases to consider:
B) The path is unmerged
C) The path has unstaged changes
D) The path has staged changes (differs from HEAD)
If any path fell into case B or C, then the whole operation would be
aborted with an error. With sparse-checkout, the whole operation would
be aborted for case D as well, but for its predecessor of using `git
read-tree -mu HEAD` directly, any paths that fell into case D would be
removed from the working copy and the index entry for that path would be
reset to match HEAD -- which looks and feels like data loss to users
(only a few are even aware to ask whether it can be recovered, and even
then it requires walking through loose objects trying to match up the
right ones).
Refusing to remove files that have unsaved user changes is good, but
refusing to work on any other paths is very problematic for users. If
the user is in the middle of a rebase or has made modifications to files
that bring in more dependencies, then for their build to work they need
to update the sparse paths. This logic has been preventing them from
doing so. Sometimes in response, the user will stage the files and
re-try, to no avail with sparse-checkout or to the horror of losing
their changes if they are using its predecessor of `git read-tree -mu
HEAD`.
Add a new update_sparsity() function which will not error out in any of
these cases but behaves as follows for the special cases:
A) Leave the file in the working copy alone, clear the SKIP_WORKTREE
bit, and print a warning (thus leaving the path in a state where
status will report the file as modified, which seems logical).
B) Do NOT mark this path as SKIP_WORKTREE, and leave it as unmerged.
C) Do NOT mark this path as SKIP_WORKTREE and print a warning about
the dirty path.
D) Mark the path as SKIP_WORKTREE, but do not revert the version
stored in the index to match HEAD; leave the contents alone.
I tried a different behavior for A (leave the SKIP_WORKTREE bit set),
but found it very surprising and counter-intuitive (e.g. the user sees
it is present along with all the other files in that directory, tries to
stage it, but git add ignores it since the SKIP_WORKTREE bit is set). A
& C seem like optimal behavior to me. B may be as well, though I wonder
if printing a warning would be an improvement. Some might be slightly
surprised by D at first, but given that it does the right thing with
`git commit` and even `git commit -a` (`git add` ignores entries that
are marked SKIP_WORKTREE and thus doesn't delete them, and `commit -a`
is similar), it seems logical to me.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a populate_from_existing_patterns() function for reading the
path_patterns from $GIT_DIR/info/sparse-checkout so that we can re-use
it elsewhere.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a path is dirty, removing from the working tree risks losing data.
As such, we want to make sure any such path is not marked with
SKIP_WORKTREE. While the current callers of this code detect this case
and re-populate with a previous set of sparsity patterns, we want to
allow some paths to be marked with SKIP_WORKTREE while others are left
unmarked without it being considered an error. The reason this
shouldn't be considered an error is that SKIP_WORKTREE has always been
an advisory-only setting; merge and rebase for example were free to
materialize paths and clear the SKIP_WORKTREE bit in order to accomplish
their work even though they kept the SKIP_WORKTREE bit set for other
paths. Leaving dirty working files in the working tree is thus a
natural extension of what we have already been doing.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
check_updates() previously assumed it was working on o->result. We want
to use this function in combination with a different index_state, so
take the intended index_state as a parameter.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit e091228e17 ("sparse-checkout: update working directory
in-process", 2019-11-21) allowed passing a pre-defined set of patterns
to unpack_trees(). However, if o->pl was NULL, it would still read the
existing patterns and use those. If those patterns were read into a
data structure that was allocated, naturally they needed to be free'd.
However, despite the same function being responsible for knowing about
both the allocation and the free'ing, the logic for tracking whether to
free the pattern_list was hoisted to an outer function with an
additional flag in unpack_trees_options. Put the logic back in the
relevant function and discard the now unnecessary flag.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
verify_absent_sparse() was introduced in commit 08402b0409
("merge-recursive: distinguish "removed" and "overwritten" messages",
2010-08-11), and has always had exactly one caller which always passes
error_type == ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN. This function
then checks whether error_type is this value, and if so, sets it instead
to ERROR_WOULD_LOSE_ORPHANED_OVERWRITTEN. It has been nearly a decade
and no other caller has been created, and no other value has ever been
passed, so just pass the expected value to begin with.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit 08402b0409 ("merge-recursive: distinguish "removed" and
"overwritten" messages", 2010-08-11) split
ERROR_WOULD_LOSE_UNTRACKED
into both
ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN
ERROR_WOULD_LOSE_UNTRACKED_REMOVED
and also split
ERROR_WOULD_LOSE_ORPHANED
into both
ERROR_WOULD_LOSE_ORPHANED_OVERWRITTEN
ERROR_WOULD_LOSE_ORPHANED_REMOVED
However, despite the split only three of these four types were used.
ERROR_WOULD_LOSE_ORPHANED_REMOVED was not put into use when it was
introduced and nothing else has used it in the intervening decade
either. Remove it.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no need for shell libraries to have the executable bit. They're
meant to be sourced, and running them stand-alone is pointless. Let's
reduce any possible confusion by making it more clear they're not meant
to be run this way.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The purpose of lib-credential.sh is to be sourced into other test
scripts. It doesn't need a "#!/bin/sh" line, as running it directly
makes no sense. Nor does it serve any real filetype documentation
purpose, as the file is clearly named with a ".sh" extension.
In the spirit of c74c72034f (test: replace shebangs with descriptions in
shell libraries, 2013-11-25), let's replace it with a human-readable
description.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Cygwin, the codepath for POSIX-like systems is taken in
run-command.c::start_command(). The prepare_cmd() helper
function is called to decide if the command needs to be looked
up in the PATH. The logic there is to do the PATH-lookup if
and only if it does not have any slash '/' in it. If this test
passes we end up attempting to run the command by appending the
string after each colon-separated component of PATH.
The Cygwin environment supports both Windows and POSIX style
paths, so both forwardslahes '/' and back slashes '\' can be
used as directory separators for any external program the user
supplies.
Examples for path strings which are being incorrectly searched
for in the PATH instead of being executed as is:
- "C:\Program Files\some-program.exe"
- "a\b\c.exe"
To handle these, the PATH lookup detection logic in prepare_cmd()
is taught to know about this Cygwin quirk, by introducing
has_dir_sep(path) helper function to abstract away the difference
between true POSIX and Cygwin systems.
Signed-off-by: Andras Kucsma <r0maikx02b@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, testing if two refs weren't equal with compare_refs() was done
with `test_must_fail compare_refs`. This was wrong for two reasons.
First, test_must_fail should only be used on git commands. Second,
negating the error code is a little heavy-handed since in the case where
one of the git invocations within compare_refs() fails, we will report
success, even though it failed at an unexpected point.
Teach compare_refs() to accept `!` as the first argument which would
_only_ negate the test_cmp()'s return code.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a pipe, only the return code of the last command is used. Thus, all
other commands will have their return codes masked. Rewrite pipes so
that there are no git commands upstream so that their failure is
reported.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail function should only be used for git commands since
we should assume that external commands work sanely. Since test_cmp() just
wraps an external command, replace `test_must_fail test_cmp` with
`! test_cmp`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we plan on only allowing `test_must_fail` to work on a
restricted subset of commands, including `git`. Reorder the commands so
that `nongit` comes before `test_must_fail`. This way, `test_must_fail`
operates on a git command.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the 'did not use upload-pack service' test, we have a complicated
song-and-dance to ensure that there are no "/git-upload-pack" lines in
"$HTTPD_ROOT_PATH/access.log". Simplify this by just checking that grep
returns a non-zero exit code.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a pipe, only the return code of the last command is used. Thus, all
other commands will have their return codes masked. Rewrite pipes so
that there are no git commands upstream so that their failure is
reported.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The expected references are generated using a here-doc with some inline
command substitutions. If one of the `git rev-parse` invocations within
the command substitutions fails, its return code is swallowed and we
won't know about it. Replace these command substitutions with
generate_references(), which actually reports when `git rev-parse`
fails.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
busybox's sed isn't binary clean.
Thus, triggers false-negative on this test.
We could replace sed with perl on this usecase.
But, we could slightly modify the helper to discard unwanted data in the
beginning.
Fix the false negative by updating this helper.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diff(1) implementation of busybox produces the unified context
format even without being asked, and it cannot produce the default
format, but a test in this script relies on.
We could rewrite the test so that we count the lines in the
postimage out of the unified context format, but the format is not
supported by some implementations of diff (e.g. HP-UX).
Accomodate busybox by adding a fallback code to count postimage
lines in unified context output, when counting in the output in the
default format finds nothing.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
[jc: applied Documentation/CodingGuidelines and tweaked the log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git pull" learned to warn when no pull.rebase configuration
exists, and neither --[no-]rebase nor --ff-only is given (which
would result a merge).
* ah/force-pull-rebase-configuration:
pull: warn if the user didn't say whether to rebase or to merge
"git stash" has kept an escape hatch to use the scripted version
for a few releases, which got stale. It has been removed.
* tg/retire-scripted-stash:
stash: remove the stash.useBuiltin setting
stash: get git_stash_config at the top level
When "git describe C" finds an annotated tag with tagname A to be
the best name to explain commit C, and the tag is stored in a
"wrong" place in the refs/tags hierarchy, e.g. refs/tags/B, the
command gave a warning message but used A (not B) to describe C.
If C is exactly at the tag, the describe output would be "A", but
"git rev-parse A^0" would not be equal as "git rev-parse C^0". The
behavior of the command has been changed to use the "long" form
i.e. A-0-gOBJECTNAME, which is correctly interpreted by rev-parse.
* jc/describe-misnamed-annotated-tag:
describe: force long format for a name based on a mislocated tag
The "--fork-point" mode of "git rebase" regressed when the command
was rewritten in C back in 2.20 era, which has been corrected.
* at/rebase-fork-point-regression-fix:
rebase: --fork-point regression fix
The code to interface with GnuPG has been refactored.
* hi/gpg-prefer-check-signature:
gpg-interface: prefer check_signature() for GPG verification
t: increase test coverage of signature verification output
Provide more information (e.g. the object of the tree-ish in which
the blob being converted appears, in addition to its path, which
has already been given) to smudge/clean conversion filters.
* bc/filter-process:
t0021: test filter metadata for additional cases
builtin/reset: compute checkout metadata for reset
builtin/rebase: compute checkout metadata for rebases
builtin/clone: compute checkout metadata for clones
builtin/checkout: compute checkout metadata for checkouts
convert: provide additional metadata to filters
convert: permit passing additional metadata to filter processes
builtin/checkout: pass branch info down to checkout_worktree
SHA-256 transition continues.
* bc/sha-256-part-1-of-4: (22 commits)
fast-import: add options for rewriting submodules
fast-import: add a generic function to iterate over marks
fast-import: make find_marks work on any mark set
fast-import: add helper function for inserting mark object entries
fast-import: permit reading multiple marks files
commit: use expected signature header for SHA-256
worktree: allow repository version 1
init-db: move writing repo version into a function
builtin/init-db: add environment variable for new repo hash
builtin/init-db: allow specifying hash algorithm on command line
setup: allow check_repository_format to read repository format
t/helper: make repository tests hash independent
t/helper: initialize repository if necessary
t/helper/test-dump-split-index: initialize git repository
t6300: make hash algorithm independent
t6300: abstract away SHA-1-specific constants
t: use hash-specific lookup tables to define test constants
repository: require a build flag to use SHA-256
hex: add functions to parse hex object IDs in any algorithm
hex: introduce parsing variants taking hash algorithms
...
Fix "git checkout --recurse-submodules" of a nested submodule
hierarchy.
* pb/recurse-submodules-fix:
t/lib-submodule-update: add test removing nested submodules
unpack-trees: check for missing submodule directory in merged_entry
unpack-trees: remove outdated description for verify_clean_submodule
t/lib-submodule-update: move a test to the right section
t/lib-submodule-update: remove outdated test description
t7112: remove mention of KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED
Especially when debugging a test failure that can only be reproduced in
the CI build (e.g. when the developer has no access to a macOS machine
other than running the tests on a macOS build agent), output should not
be suppressed.
In the instance of `hi/gpg-prefer-check-signature`, where one
GPG-related test failed for no apparent reason, the entire output of
`gpg` and `gpgsm` was suppressed, even in verbose mode, leaving
interested readers no clue what was going wrong.
Let's fix this by no longer redirecting the output not to `/dev/null`.
This is now possible because the affected prereqs were turned into lazy
ones (and are therefore evaluated via `test_eval_` which respects the
`--verbose` option).
Note that we _still_ redirect `stdout` to `/dev/null` for those commands
that sign their `stdin`, as the output would be binary (and useless
anyway, because the reader would not have anything against which to
compare the output).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to set those prereqs is executed completely outside of any
`test_eval_` block. As a consequence, its output had to be suppressed so
that it does not clutter the output of a regular test script run.
Unfortunately, the output *stays* suppressed even when the `--verbose`
option is in effect.
This hid important output when debugging why the GPG prereq was not
enabled in the Windows part of our CI builds.
In preparation for fixing that, let's move all of this code into lazy
prereqs.
The only slightly tricky part is the global environment variable
`GNUPGHOME`. Originally, it was configured only when we verified that
there is a `gpg` in the `PATH` that we can use. This is now no longer
possible, as lazy prereqs are evaluated in a subshell that changes the
working directory to a temporary one. Therefore, we simply _always_ set
that environment variable: it does not hurt anything because it does not
indicate the presence of a working GPG.
Side note: it was quite tempting to use a hack that is possible because
we do not validate what is passed to `test_lazy_prereq` (and it is
therefore possible to "break out" of the lazy_prereq subshell:
test_lazy_prereq GPG '...) && GNUPGHOME=... && (...'
However, this is rather tricksy hobbitses code, and the current patch is
_much_ easier to understand.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `test_expect_*` functions use `test_eval_` and so does
`test_run_lazy_prereq_`. If tracing is enabled via the `-x` option,
`test_eval_` turns on tracing while evaluating the code block, and turns
it off directly after it.
This is unwanted for nested invocations.
One somewhat surprising example of this is when running a test that
calls `test_i18ngrep`: that function requires the `C_LOCALE_OUTPUT`
prereq, and that prereq is a lazy one, so it is evaluated via
`test_eval_`, the command tracing is turned off, and the test case
continues to run _without tracing the commands_.
Another somewhat surprising example is when one lazy prereq depends on
another lazy prereq: the former will call `test_have_prereq` with the
latter one, which in turn calls `test_eval_` and -- you guessed it --
tracing (if enabled) will be turned off _before_ returning to evaluating
the other lazy prereq.
As we will introduce just such a scenario with the GPG, GPGSM and
RFC1991 prereqs, let's fix that by introducing a variable that keeps
track of the current trace level: nested `test_eval_` calls will
increment and then decrement the level, and only when it reaches 0, the
tracing will _actually_ be turned off.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It makes no sense to call `./lib-gpg.sh`. Therefore the hash-bang line
is unnecessary.
There are other similar instances in `t/`, but they are too far from the
context of the enclosing patch series, so they will be addressed
separately.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The mechanism to prevent "git commit" from making an empty commit
or amending during an interrupted cherry-pick was broken during the
rewrite of "git rebase" in C, which has been corrected.
* pw/advise-rebase-skip:
commit: give correct advice for empty commit during a rebase
commit: encapsulate determine_whence() for sequencer
commit: use enum value for multiple cherry-picks
sequencer: write CHERRY_PICK_HEAD for reword and edit
cherry-pick: check commit error messages
cherry-pick: add test for `--skip` advice in `git commit`
t3404: use test_cmp_rev
Update "git p4" to work with Python 3.
* yz/p4-py3:
ci: use python3 in linux-gcc and osx-gcc and python2 elsewhere
git-p4: use python3's input() everywhere
git-p4: simplify regex pattern generation for parsing diff-tree
git-p4: use dict.items() iteration for python3 compatibility
git-p4: use functools.reduce instead of reduce
git-p4: fix freezing while waiting for fast-import progress
git-p4: use marshal format version 2 when sending to p4
git-p4: open .gitp4-usercache.txt in text mode
git-p4: convert path to unicode before processing them
git-p4: encode/decode communication with git for python3
git-p4: encode/decode communication with p4 for python3
git-p4: remove string type aliasing
git-p4: change the expansion test from basestring to list
git-p4: make python2.7 the oldest supported version
The real_path() convenience function can easily be misused; with a
bit of code refactoring in the callers' side, its use has been
eliminated.
* am/real-path-fix:
get_superproject_working_tree(): return strbuf
real_path_if_valid(): remove unsafe API
real_path: remove unsafe API
set_git_dir: fix crash when used with real_path()
A handful of options to configure SSL when talking to proxies have
been added.
* js/https-proxy-config:
http: add environment variable support for HTTPS proxies
http: add client cert support for HTTPS proxies
Revamping of the advise API to allow more systematic enumeration of
advice knobs in the future.
* hw/advise-ng:
tag: use new advice API to check visibility
advice: revamp advise API
advice: change "setupStreamFailure" to "setUpstreamFailure"
advice: extract vadvise() from advise()
In Git for Windows' SDK, we use the MSYS2 version of OpenSSH, meaning
that the `gpg-agent` will fail horribly when being passed a `--homedir`
that contains colons.
Previously, we did pass the Windows version of the absolute path,
though, which starts in the drive letter followed by, you guessed it, a
colon.
Let's use the same trick found elsewhere in our test suite where `$PWD`
is used to refer to the pseudo-Unix path (which works only within the
MSYS2 Bash/OpenSSH/Perl/etc, as opposed to `$(pwd)` which refers to the
Windows path that `git.exe` understands, too).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When debugging a test (or a set of tests), it's common to execute it
with some combination of short options, such as:
$ ./txxx-testname.sh -d -x -i
In cases like this, CLIs usually allow the short options to be bundled
in a single argument, for convenience and agility. Let's add this
feature to test-lib, allowing the above command to be run as:
$ ./txxx-testname.sh -dxi
(or any other permutation, e.g. '-ixd')
Note: Short options that require an argument can also be used in a
bundle, in any position. So, for example, '-r 5 -x', '-xr 5' and '-rx 5'
are all valid and equivalent. A special case would be having a bundle
with more than one of such options. To keep things simple, this case is
not allowed for now. This shouldn't be a major limitation, though, as
the only short option that requires an argument today is '-r'. And
concatenating '-r's as in '-rr 5 6' would probably not be very
practical: its unbundled format would be '-r 5 -r 6', for which test-lib
currently considers only the last argument. Therefore, if '-rr 5 6' were
to be allowed, it would have the same effect as just typing '-r 6'.
Note: the test-lib currently doesn't support '-r5', as an alternative
for '-r 5', so the former is not supported in bundles as well.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since commit 6b7728db81, (t7063: work around FreeBSD's lazy mtime
update feature, 2016-08-03), we started to use ls as a trick to update
directory's mtime.
However, `-ls` flag isn't required by POSIX's find(1), and
busybox(1) doesn't implement it.
Use "-exec ls -ld {} +" instead.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only HEAD's object_id is necessary, rev-list is an overkill.
Despite POSIX requires grep(1) treat single pattern with <newline>
as multiple patterns.
busybox's grep(1) (as of v1.31.1) haven't implemented it yet.
Use rev-parse to simplify the test and avoid busybox unimplemented
features.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Alpine Linux's default unzip(1) doesn't support `-a`.
Skip those tests on that platform.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
test_lazy_prereq will be evaluated in a throw-away directory.
Drop unnecessary subshell and mkdir.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shell recognises first non-assignment token as command name.
With /bin/sh linked to either /bin/bash or /bin/dash,
`cd t/perf && ./p0000-perf-lib-sanity.sh -d -i -v` reports:
> test_cmp:1: command not found: diff -u
Using `eval` to unquote $GIT_TEST_CMP as same as precedence in `git_editor`.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
BRE interprets `+` literally, and
`\+` is undefined for POSIX BRE, from:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_02
> The interpretation of an ordinary character preceded
> by an unescaped <backslash> ( '\\' ) is undefined, except for:
> - The characters ')', '(', '{', and '}'
> - The digits 1 to 9 inclusive
> - A character inside a bracket expression
This test is failing with busybox sed, the default sed of Alpine Linux
We have 2 options here:
- Using literal `+` because BRE will interpret it as-is, or
- Using character class `[+]` to defend against a sed that expects ERE
ERE-expected sed is theoretical at this point,
but we haven't found it, yet.
And, we may run into other problems with that sed.
Let's go with first option and fix it later if that sed could be found.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail function should only be used for git commands since
we should assume that external commands work sanely. Since test_cmp() just
wraps an external command, replace `test_must_fail test_cmp` with
`! test_cmp`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In builtin.h, there exists the distinctly lib-ish function
prune_packed_objects(). This function can currently only be called by
built-in commands but, unlike all of the other functions in the header,
it does not make sense to impose this restriction as the functionality
can be logically reused in libgit.
Extract this function into prune-packed.c so that related definitions
can exist clearly in their own header file.
While we're at it, clean up #includes that are unused.
This patch is best viewed with --color-moved.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In builtin.h, there exists the distinctly "lib-ish" function
fmt_merge_msg(). This function can currently only be called by built-in
commands but, unlike most of the other functions in the header, it does
not make sense to impose this restriction as the functionality can be
logically reused in libgit.
Extract this function into fmt-merge-msg.c so that related definitions
can exist clearly in their own header file.
While we're at it, clean up #includes that are unused.
This patch is best viewed with --color-moved.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t7600, we were rewriting `printf '%s\n' ...` to create files from
parameters, one per line. However, we already have a function that wraps
this for us: test_write_lines(). Rewrite these instances to use that
function instead of open coding it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many += lists in the Makefile and, over time, they have gotten
slightly out of ASCII order. Sort all += lists to bring them back in
order.
ASCII sorting was chosen over strict alphabetical order even though, if
we omit file prefixes, the lists aren't sorted in strictly alphabetical
order (e.g. archive.o comes after archive-zip.o instead of before
archive-tar.o). This is intentional because the purpose of maintaining
the sorted list is to ensure line insertions are deterministic. By using
ASCII ordering, it is more easily mechanically reproducible in the
future, such as by using :sort in Vim.
This patch is best viewed with `--color-moved`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tar importer in `contrib/fast-import/import-tars.perl` has a very
convenient feature: if _all_ paths stored in the imported `.tar` start
with a common prefix, e.g. `git-2.26.0/` in the tar at
https://github.com/git/git/archive/v2.26.0.tar.gz, then this prefix is
stripped.
This feature makes a ton of sense because it is relatively common to
import two or more revisions of the same project into Git, and obviously
we don't want all files to live in a tree whose name changes from
revision to revision.
Now, the problem with that feature is that it breaks down if there is a
`pax_global_header` "file" located outside of said prefix, at the top of
the tree. This is the case for `.tar` files generated by Git's very own
`git archive` command: it inserts that header, and `git archive` allows
specifying a common prefix (that the header does _not_ share with the
other files contained in the archive) via `--prefix=my-project-1.0.0/`.
Let's just skip any global header when importing `.tar` files into Git.
Note: this global header might contain useful information. For example,
in the output of `git archive`, it lists the original commit, which _is_
useful information. A future improvement to the `import-tars.perl`
script might be to include that information in the commit message, or do
other things with the information (e.g. use `mtime` information
contained in the global header as date of the commit). This patch does
not prevent any future patch from making that happen, it only prevents
the header from being treated as if it was a regular file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Via trace2, Git can already log interesting config parameters (see the
trace2_cmd_list_config() function). However, this can grant an
incomplete picture because many config parameters also allow overrides
via environment variables.
To allow for more complete logs, we add a new trace2_cmd_list_env_vars()
function and supporting implementation, modeled after the pre-existing
config param logging implementation.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a test case is run in a subshell, we finalize the JUnit-style XML
when said subshell exits. But then we continue to write into that XML as
if nothing had happened.
This leads to Azure Pipelines' Publish Test Results task complaining:
Failed to read /home/vsts/work/1/s/t/out/TEST-t0000-basic.xml.
Error : Unexpected end tag. Line 110, position 5.
And indeed, the resulting XML is incorrect.
Let's "re-open" the XML in such a case, i.e. remove the previously added
closing tags.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add missing spaces before '&&' and switch tabs around '&&' to spaces.
Also fix the space after redirection operator in t3701 while we're here.
These issues were found using `git grep '[^ ]&&$'` and
`git grep -P '&&\t' t/`.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For shell scripts, the usual convention is for there to be no space
after redirection operators, (e.g. `>file`, not `> file`). Remove these
spaces wherever they appear.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It turns out that the "--filter=<filter-spec>" option is not
documented anywhere in the "git clone" page, and instead is
detailed carefully in "git rev-list" where it serves a
different purpose.
Add a small bit about this option in the documentation. It
would be worth some time to create a subsection in the "git clone"
documentation about partial clone as a concept and how it can be
a surprising experience. For example, "git checkout" will likely
trigger a pack download.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When t3419 was originally written, it was designed to run a smaller test
for correctness, and then the same test with a larger number of patches
for performance. But it seems unlikely the latter was helping us:
- it was marked with EXPENSIVE, so hardly anybody ran it anyway
- there's no indication that it was more likely to find bugs than the
smaller case (the commit message isn't very helpful, but the original
cover letter describes it as: "The first patch adds correctness and
(optional) performance tests".
- the timing results are shown only via test_debug(). So also not run
unless the user says "-d", and then not provided in any
machine-readable form.
If we're interested in performance regressions, a script in t/perf would
be more appropriate. I didn't add one here, because it's not at all
clear to me that what the script is timing is even all that interesting.
Let's simplify the script by dropping the EXPENSIVE run. That in turn
lets us drop the do_tests() wrapper, which lets us consistently use
single-quotes for our test snippets. And we can drop the useless
test_debug() timings, as well as their run() helper. And finally, while
we're here, we can replace the count() helper with the standard
test_seq().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test runs a function which itself runs several assertions. The
last of these assertions cleans up the .git/rebase-apply directory,
since when run with EXPENSIVE set, the function is invoked a second time
to run the same tests with a larger data set.
However, as of 2ac0d6273f ("rebase: change the default backend from "am"
to "merge"", 2020-02-15), the default backend of rebase has changed, and
cleaning up the rebase-apply directory has no effect: it no longer
exists, since we're using rebase-merge instead.
Since we don't really care which rebase backend is in use, let's just
use the command "git rebase --quit", which will do the right thing
regardless.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The environment variable GIT_TEST_PACK_SPARSE was previously used
to allow testing the --sparse option for "git pack-objects" in
the test suite. This allowed interesting cases of "git push" to
also test this algorithm.
Since pack.useSparse is now true by default, we do not need this
variable to _enable_ the --sparse option, but instead to _disable_
it. This flips how we work with the variable a bit.
When checking for the variable, default to a value of -1 for
"unset". If unset, then take the default from the repo settings,
which is currently 1. Then, the --[no-]sparse command-line option
will override either of these settings.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pack.useSparse config option was introduced by 3d036eb0
(pack-objects: create pack.useSparse setting, 2019-01-19) and was
first available in v2.21.0. When enabled, the pack-objects process
during 'git push' will use a sparse tree walk when deciding which
trees and blobs to send to the remote. The algorithm was introduced
by d5d2e93 (revision: implement sparse algorithm, 2019-01-16) and
has been in production use by VFS for Git since around that time.
The features.experimental config option also enabled pack.useSparse,
so hopefully that has also increased exposure.
It is worth noting that pack.useSparse has a possibility of
sending more objects across a push, but requires a special
arrangement of exact _copies_ across directories. There is a test
in t5322-pack-objects-sparse.sh that demonstrates this possibility.
This test uses the --sparse option to "git pack-objects" but we
can make it implied by the config value to demonstrate that the
default value has changed.
While updating that test, I noticed that the documentation did not
include an option for --no-sparse, which is now more important than
it was before.
Since the downside is unlikely but the upside is significant, set
the default value of pack.useSparse to true. Remove it from the
set of options implied by features.experimental.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of https://github.com/prati0100/git-gui:
git-gui: create a new namespace for chord script evaluation
git-gui: reduce Tcl version requirement from 8.6 to 8.5
git-gui--askpass: coerce answers to UTF-8 on Windows
git-gui: fix error popup when doing blame -> "Show History Context"
git-gui: add missing close bracket
git-gui: update German translation
git-gui: extend translation glossary template with more terms
git-gui: update pot template and German translation to current source code
Reduce the Tcl version requirement to 8.5 to allow git-gui to run on
MacOS distributions like High Sierra. While here, fix a potential
variable name collision.
* py/remove-tcloo:
git-gui: create a new namespace for chord script evaluation
git-gui: reduce Tcl version requirement from 8.6 to 8.5
In 'submodule--helper.c', the structures and macros for callbacks belonging
to any subcommand are named in the format: 'subcommand_cb' and 'SUBCOMMAND_CB_INIT'
respectively.
This was an exception for the subcommand 'foreach' of the command
'submodule'. Rename the aforementioned structures and macros:
'struct cb_foreach' to 'struct foreach_cb' and 'CB_FOREACH_INIT'
to 'FOREACH_CB_INIT'.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though there is only one configuration variable in the
namespace, it is not quite right to have tar.umask described
among the variables for tag.* namespace.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Updates to the CI settings.
* js/ci-windows-update:
Azure Pipeline: switch to the latest agent pools
ci: prevent `perforce` from being quarantined
t/lib-httpd: avoid using macOS' sed
Both "git ls-remote -h" and "git grep -h" give short usage help,
like any other Git subcommand, but it is not unreasonable to expect
that the former would behave the same as "git ls-remote --head"
(there is no other sensible behaviour for the latter). The
documentation has been updated in an attempt to clarify this.
* jc/doc-single-h-is-for-help:
Documentation: clarify that `-h` alone stands for `help`
"git show" and others gave an object name in raw format in its
error output, which has been corrected to give it in hex.
* hd/show-one-mergetag-fix:
show_one_mergetag: print non-parent in hex form.
"git merge signed-tag" while lacking the public key started to say
"No signature", which was utterly wrong. This regression has been
reverted.
* hi/gpg-use-check-signature:
Revert "gpg-interface: prefer check_signature() for GPG verification"
Fix for a bug revealed by a recent change to make the protocol v2
the default.
* ds/partial-clone-fixes:
partial-clone: avoid fetching when looking for objects
partial-clone: demonstrate bugs in partial fetch
The merge-recursive machinery failed to refresh the cache entry for
a merge result in a couple of places, resulting in an unnecessary
merge failure, which has been fixed.
* en/t3433-rebase-stat-dirty-failure:
merge-recursive: fix the refresh logic in update_file_flags
t3433: new rebase testcase documenting a stat-dirty-like failure
"git check-ignore" did not work when the given path is explicitly
marked as not ignored with a negative entry in the .gitignore file.
* en/check-ignore:
check-ignore: fix documentation and implementation to match
The code to automatically shrink the fan-out in the notes tree had
an off-by-one bug, which has been killed.
* jh/notes-fanout-fix:
notes.c: fix off-by-one error when decreasing notes fanout
t3305: check notes fanout more carefully and robustly
The index-pack code now diagnoses a bad input packstream that
records the same object twice when it is used as delta base; the
code used to declare a software bug when encountering such an
input, but it is an input error.
* jk/index-pack-dupfix:
index-pack: downgrade twice-resolved REF_DELTA to die()
"git rebase -i" identifies existing commits in its todo file with
their abbreviated object name, which could become ambigous as it
goes to create new commits, and has a mechanism to avoid ambiguity
in the main part of its execution. A few other cases however were
not covered by the protection against ambiguity, which has been
corrected.
* js/rebase-i-with-colliding-hash:
rebase -i: also avoid SHA-1 collisions with missingCommitsCheck
rebase -i: re-fix short SHA-1 collision
parse_insn_line(): improve error message when parsing failed
Running "git rm" on a submodule failed unnecessarily when
.gitmodules is only cache-dirty, which has been corrected.
* dt/submodule-rm-with-stale-cache:
git rm submodule: succeed if .gitmodules index stat info is zero
The "--recurse-submodules" option of various subcommands did not
work well when run in an alternate worktree, which has been
corrected.
* pb/recurse-submodule-in-worktree-fix:
submodule.c: use get_git_dir() instead of get_git_common_dir()
t2405: clarify test descriptions and simplify test
t2405: use git -C and test_commit -C instead of subshells
t7410: rename to t2405-worktree-submodule.sh
An earlier update to show the location of working tree in the error
message did not consider the possibility that a git command may be
run in a bare repository, which has been corrected.
* es/outside-repo-errmsg-hints:
prefix_path: show gitdir if worktree unavailable
prefix_path: show gitdir when arg is outside repo
Minor bugfixes to "git add -i" that has recently been rewritten in C.
* js/builtin-add-i-cmds:
built-in add -i: accept open-ended ranges again
built-in add -i: do not try to `patch`/`diff` an empty list of files
Evaluating the script in the same namespace as the chord itself creates
potential for variable name collision. And in that case the script would
unknowingly use the chord's variables.
For example, say the script has a variable called 'is_completed', which
also exists in the chord's namespace. The script then calls 'eval' and
sets 'is_completed' to 1 thinking it is setting its own variable,
completely unaware of how the chord works behind the scenes. This leads
to the chord never actually executing because it sees 'is_completed' as
true and thinks it has already completed.
Avoid the potential collision by creating a separate namespace for the
script that is a child of the chord's namespace.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
On some MacOS distributions like High Sierra, Tcl 8.5 is shipped by
default. This makes git-gui error out at startup because of the version
mismatch.
The only part that requires Tcl 8.6 is SimpleChord, which depends on
TclOO. So, don't use it and use our homegrown class.tcl instead.
This means some slight syntax changes. Since class.tcl doesn't have an
"unknown" method like TclOO does, we can't just call '$note', but have
to use '$note activate' instead. The constructor now needs a proper
namespace qualifier. Update the documentation to reflect the new syntax.
As of now, the only part of git-gui that needs Tcl 8.5 is a call to
'apply' in lib/index.tcl::lambda. Keep using it until someone shows up
shouting that their OS ships with 8.4 only. Then we would have to look
into implementing it in pure Tcl.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
The option name "--use-mailmap" looks OK, but it becomes awkward
when you have to negate it, i.e. "--no-use-mailmap". I, perhaps
with many other users, always try "--no-mailmap" and become unhappy
to see it fail.
Add an alias "--[no-]mailmap" to remedy this.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous step made an option that is an alias to another option
identify itself as an alias to the latter. Because it is easier to
scan the list when a pointer goes backward to what a reader already
has seen, mention "recurse-submodules" first with its true short
help string, and then "recurse" with the statement that it is a
synonym to "recurse-submodules".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a long-standing NEEDSWORK comment that complains about
inconsistency between how an aliased option ("git clone --recurse"
which is the only one that currently exists) gives a help text in
a usage-error message vs "git cmd -h"). Get rid of it and then
make sure we say an option is an alias for another, instead of
repeating the same short help text for both, which leads to "they
seem to do the same---is there any subtle difference?" puzzlement
to end-users.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check that we get the expected data when performing a merges or
generating archives. Note that we don't expect a ref for merges,
because we won't be checking out any particular ref, but instead a tree
of the merged data. For archives, however, we expect a ref as normal if
we have one.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass the commit, and if we have it, the ref to the filters when we
perform a checkout. This should only be the case when we invoke git
reset --hard; the metadata will be unused otherwise.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When checking out a commit, provide metadata to the filter process
including the ref we're using.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Provide commit metadata for checkout code paths that use unpack_trees
and friends. When we're checking out a commit, use the commit
information, but don't provide commit information if we're checking out
from the index, since there need not be any particular commit associated
with the index, and even if there is one, we can't know what it is.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have the codebase wired up to pass any additional metadata
to filters, let's collect the additional metadata that we'd like to
pass.
The two main places we pass this metadata are checkouts and archives.
In these two situations, reading HEAD isn't a valid option, since HEAD
isn't updated for checkouts until after the working tree is written and
archives can accept an arbitrary tree. In other situations, HEAD will
usually reflect the refname of the branch in current use.
We pass a smaller amount of data in other cases, such as git cat-file,
where we can really only logically know about the blob.
This commit updates only the parts of the checkout code where we don't
use unpack_trees. That function and callers of it will be handled in a
future commit.
In the archive code, we leak a small amount of memory, since nothing we
pass in the archiver argument structure is freed.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a variety of situations where a filter process can make use of
some additional metadata. For example, some people find the ident
filter too limiting and would like to include the commit or the branch
in their smudged files. This information isn't available during
checkout as HEAD hasn't been updated at that point, and it wouldn't be
available in archives either.
Let's add a way to pass this metadata down to the filter. We pass the
blob we're operating on, the treeish (preferring the commit over the
tree if one exists), and the ref we're operating on. Note that we won't
pass this information in all cases, such as when renormalizing or when
we're performing diffs, since it doesn't make sense in those cases.
The data we currently get from the filter process looks like the
following:
command=smudge
pathname=git.c
0000
With this change, we'll get data more like this:
command=smudge
pathname=git.c
refname=refs/tags/v2.25.1
treeish=c522f061d551c9bb8684a7c3859b2ece4499b56b
blob=7be7ad34bd053884ec48923706e70c81719a8660
0000
There are a couple things to note about this approach. For operations
like checkout, treeish will always be a commit, since we cannot check
out individual trees, but for other operations, like archive, we can end
up operating on only a particular tree, so we'll provide only a tree as
the treeish. Similar comments apply for refname, since there are a
variety of cases in which we won't have a ref.
This commit wires up the code to print this information, but doesn't
pass any of it at this point. In a future commit, we'll have various
code paths pass the actual useful data down.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- context should be translated to ngữ cảnh instead of nội dung
- add missing accents
- switch adjective and secondary objects position:
* The formatted English text will be "To remove '+/-' lines",
it should be translated to "Để bỏ dòng bắt đầu với '+/-'
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
When commit 8b2f8cbcb1 ("oidset: use khash", 2018-10-04) moved from
using oidmap to khash, it replaced the oidmap.h include with both one
for hashmap.h and khash.h. Since the hashmap.h header is unnecessary,
and the point of the patch was to switch from hashmap (used by oidmap)
to khash.h, remove the unneccessary include.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While updating the microsoft/git fork on top of v2.26.0-rc0 and
consuming that build into Scalar, I noticed a corner case bug around
partial clone.
The "scalar clone" command can create a Git repository with the
proper config for using partial clone with the "blob:none" filter.
Instead of calling "git clone", it runs "git init" then sets a few
more config values before running "git fetch".
In our builds on v2.26.0-rc0, we noticed that our "git fetch"
command was failing with
error: https://github.com/microsoft/scalar did not send all necessary objects
This does not happen if you copy the config file from a repository
created by "git clone --filter=blob:none <url>", but it does happen
when adding the config option "core.logAllRefUpdates = true".
By debugging, I was able to see that the loop inside
check_connnected() that checks if all refs are contained in
promisor packs actually did not have any packfiles in the packed_git
list.
I'm not sure what corner-case issues caused this config option to
prevent the reprepare_packed_git() from being called at the proper
spot during the fetch operation. This approach requires a situation
where we use the remote helper process, which makes it difficult to
test.
It is possible to place a reprepare_packed_git() call in the fetch code
closer to where we receive a pack, but that leaves an opening for a
later change to re-introduce this problem. Further, a concurrent repack
operation could replace the pack-file list we already loaded into
memory, causing this issue in an even harder to reproduce scenario.
It is really the responsibility of anyone looping through the list of
pack-files for a certain object to fall back to reprepare_packed_git()
on a fail-to-find. The loop in check_connected() does not have this
fallback, leading to this bug.
We _could_ try looping through the packs and only reprepare the packs
after a miss, but that change is more involved and has little value.
Since this case is isolated to the case when
opt->check_refs_are_promisor_objects_only is true, we are confident that
we are verifying the refs after downloading new data. This implies that
calling reprepare_packed_git() in advance is not a huge cost compared to
the rest of the operations already made.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit refactors the use of verify_signed_buffer() outside of
gpg-interface.c to use check_signature() instead. It also turns
verify_signed_buffer() into a file-local function since it's now only
invoked internally by check_signature().
There were previously two globally scoped functions used in different
parts of Git to perform GPG signature verification:
verify_signed_buffer() and check_signature(). Now only
check_signature() is used.
The verify_signed_buffer() function doesn't guard against duplicate
signatures as described by Michał Górny [1]. Instead it only ensures a
non-erroneous exit code from GPG and the presence of at least one
GOODSIG status field. This stands in contrast with check_signature()
that returns an error if more than one signature is encountered.
The lower degree of verification makes the use of verify_signed_buffer()
problematic if callers don't parse and validate the various parts of the
GPG status message themselves. And processing these messages seems like
a task that should be reserved to gpg-interface.c with the function
check_signature().
Furthermore, the use of verify_signed_buffer() makes it difficult to
introduce new functionality that relies on the content of the GPG status
lines.
Now all operations that does signature verification share a single entry
point to gpg-interface.c. This makes it easier to propagate changed or
additional functionality in GPG signature verification to all parts of
Git, without having odd edge-cases that don't perform the same degree of
verification.
[1] https://dev.gentoo.org/~mgorny/articles/attack-on-git-signature-verification.html
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There weren't any tests for unsuccessful signature verification of
signed merge tags shown in 'git log'. There also weren't any tests for
the GPG output from 'git fmt-merge-msg'. This was noticed while
investigating a buggy refactor that slipped through the test suite; see
commit 72b006f4bf.
This commit adds signature verification tests to the 'log' and
'fmt-merge-msg' builtins.
Thanks to Linus Torvalds for reporting and finding the (now reverted)
commit that introduced the regression.
Note that the "log --show-signature for merged tag with GPG failure"
test case is really hacky. It relies on an implementation detail of
verify_signed_buffer() -- namely, it assumes that the signature is
written to a temporary file whose path is under TMPDIR.
The rationale for that test case is to check whether the code path that
yields the "No signature" message is reachable on failure. The
functionality in log-tree.c that may show this message does some
pre-parsing of a possible signature that prevents the GPG interface from
being invoked if a signature is actually missing. And I haven't been
able to construct a signature that both 1. satisfies that
pre-processing, and 2. causes GPG to fail without any sort of output on
stderr along the lines of "this is a bogus/corrupt/... signature" (the
"No signature" message should only be shown if GPG produce no output).
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
[jc: fixed missing test title noticed by Dscho]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This addresses the issue where Git for Windows asks the user for a
password, no credential helper is available, and then Git fails to pick
up non-ASCII characters from the Git GUI helper.
This can be verified e.g. via
echo host=http://abc.com |
git -c credential.helper= credential fill
and then pasting some umlauts.
The underlying reason is that Git for Windows tries to communicate using
the UTF-8 encoding no matter what the actual current code page is. So
let's indulge Git for Windows and do use that encoding.
This fixes https://github.com/git-for-windows/git/issues/2215
Signed-off-by: Luke Bonanomi <lbonanomi@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Fixes an error popup in blame because of a missing closing bracket.
* py/blame-status-error:
git-gui: fix error popup when doing blame -> "Show History Context"
In the future, we're going to want to use the branch info in
checkout_worktree, so let's pass the whole struct branch_info down, not
just the revision name. We hoist the definition of struct branch_info
so it's in scope.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The credential protocol can't handle values with newlines. We already
detect and block any such URLs from being used with credential helpers,
but let's also add an fsck check to detect and block gitmodules files
with such URLs. That will let us notice the problem earlier when
transfer.fsckObjects is turned on. And in particular it will prevent bad
objects from spreading, which may protect downstream users running older
versions of Git.
We'll file this under the existing gitmodulesUrl flag, which covers URLs
with option injection. There's really no need to distinguish the exact
flaw in the URL in this context. Likewise, I've expanded the description
of t7416 to cover all types of bogus URLs.
The credential protocol can't represent newlines in values, but URLs can
embed percent-encoded newlines in various components. A previous commit
taught the low-level writing routines to die() when encountering this,
but we can be a little friendlier to the user by detecting them earlier
and handling them gracefully.
This patch teaches credential_from_url() to notice such components,
issue a warning, and blank the credential (which will generally result
in prompting the user for a username and password). We blank the whole
credential in this case. Another option would be to blank only the
invalid component. However, we're probably better off not feeding a
partially-parsed URL result to a credential helper. We don't know how a
given helper would handle it, so we're better off to err on the side of
matching nothing rather than something unexpected.
The die() call in credential_write() is _probably_ impossible to reach
after this patch. Values should end up in credential structs only by URL
parsing (which is covered here), or by reading credential protocol input
(which by definition cannot read a newline into a value). But we should
definitely keep the low-level check, as it's our final and most accurate
line of defense against protocol injection attacks. Arguably it could
become a BUG(), but it probably doesn't matter much either way.
Note that the public interface of credential_from_url() grows a little
more than we need here. We'll use the extra flexibility in a future
patch to help fsck catch these cases.
The credential tests have a "check" function which feeds some input to
git-credential and checks the stdout and stderr. We look for exact
matches in the output. For stdout, this makes sense; the output is
the credential protocol. But for stderr, we may be showing various
diagnostic messages, or the prompts fed to the askpass program, which
could be translated. Let's mark them as such.
The credential protocol that we use to speak to helpers can't represent
values with newlines in them. This was an intentional design choice to
keep the protocol simple, since none of the values we pass should
generally have newlines.
However, if we _do_ encounter a newline in a value, we blindly transmit
it in credential_write(). Such values may break the protocol syntax, or
worse, inject new valid lines into the protocol stream.
The most likely way for a newline to end up in a credential struct is by
decoding a URL with a percent-encoded newline. However, since the bug
occurs at the moment we write the value to the protocol, we'll catch it
there. That should leave no possibility of accidentally missing a code
path that can trigger the problem.
At this level of the code we have little choice but to die(). However,
since we'd not ever expect to see this case outside of a malicious URL,
that's an acceptable outcome.
Reported-by: Felix Wilhelm <fwilhelm@google.com>
git pull accepts the options --dry-run, -p/--prune, --refmap, and
-t/--tags since a32975f516 (pull: pass git-fetch's options to git-fetch,
2015-06-18), -j/--jobs since 62104ba14a (submodules: allow parallel
fetching, add tests and documentation, 2015-12-15), and --set-upstream
since 24bc1a1292 (pull, fetch: add --set-upstream option, 2019-08-19).
Update its documentation to match.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of github.com:git/git: (27 commits)
Git 2.26-rc1
remote-curl: show progress for fetches over dumb HTTP
show_one_mergetag: print non-parent in hex form.
config.mak.dev: re-enable -Wformat-zero-length
rebase-interactive.c: silence format-zero-length warnings
mingw: workaround for hangs when sending STDIN
t6020: new test with interleaved lexicographic ordering of directories
t6022, t6046: test expected behavior instead of testing a proxy for it
t3035: prefer test_must_fail to bash negation for git commands
t6020, t6022, t6035: update merge tests to use test helper functions
t602[1236], t6034: modernize test formatting
merge-recursive: apply collision handling unification to recursive case
completion: add diff --color-moved[-ws]
t1050: replace test -f with test_path_is_file
am: support --show-current-patch=diff to retrieve .git/rebase-apply/patch
am: support --show-current-patch=raw as a synonym for--show-current-patch
am: convert "resume" variable to a struct
parse-options: convert "command mode" to a flag
parse-options: add testcases for OPT_CMDMODE()
stash push: support the --pathspec-from-file option
...
Often novice Git users forget to say "pull --rebase" and end up with an
unnecessary merge from upstream. What they usually want is either "pull
--rebase" in the simpler cases, or "pull --ff-only" to update the copy
of main integration branches, and rebase their work separately. The
pull.rebase configuration variable exists to help them in the simpler
cases, but there is no mechanism to make these users aware of it.
Issue a warning message when no --[no-]rebase option from the command
line and no pull.rebase configuration variable is given. This will
inconvenience those who never want to "pull --rebase", who haven't had
to do anything special, but the cost of the inconvenience is paid only
once per user, which should be a reasonable cost to help a number of new
users.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since 862e730ec1 (commit-slab: introduce slabname##_peek()
function, 2015-05-14) the slabname##_peek() function is documented as:
This function is similar to indegree_at(), but it will return NULL
until a call to indegree_at() was made for the commit.
This, however, is usually not the case. If indegree_at() allocates
memory, then it will do so not only for the single commit it got as
parameter, but it will allocate a whole new, ~512kB slab. Later on,
if any other commit's 'index' field happens to point into an already
allocated slab, then indegree_peek() for such a commit will return a
valid non-NULL pointer, pointing to a zero-initialized location in the
slab, even if no indegree_at() call has been made for that commit yet.
Update slabname##_peek()'s documentation to clarify this.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Together with the previous commits, this commit fully fixes the problem
of using shared buffer for `real_path()` in `get_superproject_working_tree()`.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Returning a shared buffer invites very subtle bugs due to reentrancy or
multi-threading, as demonstrated by the previous patch.
There was an unfinished effort to abolish this [1].
Let's finally rid of `real_path()`, using `strbuf_realpath()` instead.
This patch uses a local `strbuf` for most places where `real_path()` was
previously called.
However, two places return the value of `real_path()` to the caller. For
them, a `static` local `strbuf` was added, effectively pushing the
problem one level higher:
read_gitfile_gently()
get_superproject_working_tree()
[1] https://lore.kernel.org/git/1480964316-99305-1-git-send-email-bmwill@google.com/
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python2 reached end of life, and we have been preparing our Python
scripts to work with Python3. 'git p4', the main in-tree user of
Python, has just received a number of compatibility updates. Our
other notable Python script 'contrib/svn-fe/svnrdump_sim.py' is only
used in 't9020-remote-svn.sh', and is apparently already compatible
with both Python2 and 3.
Our CI jobs currently only use Python2. We want to make sure that
these Python scripts do indeed work with Python3, and we also want to
make sure that these scripts keep working with Python2 as well, for
the sake of some older LTS/Enterprise setups.
Therefore, pick two jobs and use Python3 there, while leaving other
jobs to still stick to Python2 for now.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some parts of the workflow described in the document has got a bit
stale with the recent toolchain improvements. Update the procedure
a bit, and also describe the convention used around SQUASH??? fixups.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`real_path()` returns result from a shared buffer, inviting subtle
reentrance bugs. One of these bugs occur when invoked this way:
set_git_dir(real_path(git_dir))
In this case, `real_path()` has reentrance:
real_path
read_gitfile_gently
repo_set_gitdir
setup_git_env
set_git_dir_1
set_git_dir
Later, `set_git_dir()` uses its now-dead parameter:
!is_absolute_path(path)
Fix this by using a dedicated `strbuf` to hold `strbuf_realpath()`.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the stash.useBuiltin setting which was added as an escape hatch
to disable the builtin version of stash first released with Git 2.22.
Carrying the legacy version is a maintenance burden, and has in fact
become out of date failing a test since the 2.23 release, without
anyone noticing until now. So users would be getting a hint to fall
back to a potentially buggy version of the tool.
We used to shell out to git config to get the useBuiltin configuration
to avoid changing any global state before spawning legacy-stash.
However that is no longer necessary, so just use the 'git_config'
function to get the setting instead.
Similar to what we've done in d03ebd411c ("rebase: remove the
rebase.useBuiltin setting", 2019-03-18), where we remove the
corresponding setting for rebase, we leave the documentation in place,
so people can refer back to it when searching for it online, and so we
can refer to it in the commit message.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add 4 environment variables that can be used to configure the proxy
cert, proxy ssl key, the proxy cert password protected flag, and the
CA info for the proxy.
Documentation for the options was also updated.
Signed-off-by: Jorge Lopez Silva <jalopezsilva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git supports performing connections to HTTPS proxies, but we don't
support doing mutual authentication with them (through TLS).
Add the necessary options to be able to send a client certificate to
the HTTPS proxy.
A client certificate can provide an alternative way of authentication
instead of using 'ProxyAuthorization' or other more common methods of
authentication. Libcurl supports this functionality already, so changes
are somewhat minimal. The feature is guarded by the first available
libcurl version that supports these options.
4 configuration options are added and documented, cert, key, cert
password protected and CA info. The CA info should be used to specify a
different CA path to validate the HTTPS proxy cert.
Signed-off-by: Jorge Lopez Silva <jalopezsilva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
change the advise call in tag library from advise() to
advise_if_enabled() to construct an example of the usage of
the new API.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently it's very easy for the advice library's callers to miss
checking the visibility step before printing an advice. Also, it makes
more sense for this step to be handled by the advice library.
Add a new advise_if_enabled function that checks the visibility of
advice messages before printing.
Add a new helper advise_enabled to check the visibility of the advice
if the caller needs to carry out complicated processing based on that
value.
A list of advice_settings is added to cache the config variables names
and values, it's intended to replace advice_config[] and the global
variables once we migrate all the callers to use the new APIs.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next commit we're adding another config variable to be read
from 'git_stash_config', that is valid for the top level command
instead of just a subset. Move the 'git_config' invocation for
'git_stash_config' to the top-level to prepare for that.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fb6fbffbda (advice: keep config name in camelCase in advice_config[],
2018-05-26) changed the config names to camelCase, but one of the names
wasn't changed correctly. Fix it.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for a new advice method, extract a version of advise()
that uses an explict 'va_list' parameter. Call it from advise() for a
functionally equivalent version.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d9c6469 (git-gui: update status bar to track operations, 2019-12-01)
the call to 'ui_status' in 'do_gitk' was updated to create the newly
introduced "status bar operation". This allowed this status text to show
along with other operations happening in parallel, and removed a race
between all these operations.
But in that refactor, the fact that 'ui_status' checks for the existence
of 'main_status' was overlooked. This leads to an error message popping
up when the user selects "Show History Context" from the blame window
context menu on a source line. The error occurs because when running
"blame" 'main_status' is not initialized.
So, add a check for the existence of 'main_status' in 'do_gitk'. This
fix reverts to the original behaviour. In the future, we might want to
look into a better way of telling 'do_gitk' which status bar to use.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
When converting a repository using submodules from one hash algorithm to
another, it is necessary to rewrite the submodules from the old
algorithm to the new algorithm, since only references to submodules, not
their contents, are written to the fast-export stream. Without rewriting
the submodules, fast-import fails with an "Invalid dataref" error when
encountering a submodule in another algorithm.
Add a pair of options, --rewrite-submodules-from and
--rewrite-submodules-to, that take a list of marks produced by
fast-export and fast-import, respectively, when processing the
submodule. Use these marks to map the submodule commits from the old
algorithm to the new algorithm.
We read marks into two corresponding struct mark_set objects and then
perform a mapping from the old to the new using a hash table. This lets
us reuse the same mark parsing code that is used elsewhere and allows us
to efficiently read and match marks based on their ID, since mark files
need not be sorted.
Note that because we're using a khash table for the object IDs, and this
table copies values of struct object_id instead of taking references to
them, it's necessary to zero the struct object_id values that we use to
insert and look up in the table. Otherwise, we would end up with SHA-1
values that don't match because of whatever stack garbage might be left
in the unused area.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, we can iterate over marks only to dump them to a file. In the
future, we'll want to perform an arbitrary operation over the items of a
mark set. Add a function, for_each_mark, that iterates over marks in a
set and performs an arbitrary callback function for each mark. Switch
the mark dumping routine to use this function now that it's available.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we'll use multiple different mark sets with this
function, so make it take an argument that points to the mark set to
operate on.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, everything we want to insert into a mark set is an object
entry. However, in the future, we will want to insert objects of other
types. Teach read_mark_file to take a function pointer which helps us
insert the object we want into our mark set.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we'll want to read marks files for submodules as well.
Refactor the existing code to make it possible to read multiple marks
files, each into their own marks set.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The transition plan anticipates that we will allow signatures using
multiple algorithms in a single commit. In order to do so, we need to
use a different header per algorithm so that it will be obvious over
which data to compute the signature.
The transition plan specifies that we should use "gpgsig-sha256", so
wire up the commit code such that it can write and parse the current
algorithm, and it can remove the headers for any algorithm when creating
a new commit. Add tests to ensure that we write using the right header
and that git fsck doesn't reject these commits.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git supports both repository versions 0 and 1. These formats are
identical except for the presence of extensions. When using an
extension, such as for a different hash algorithm, a check for only
version 0 causes the check to fail. Instead, call
verify_repository_format to verify that we have an appropriate version
and no unknown extensions.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we perform a clone, we won't know the remote side's hash algorithm
until we've read the heads. Consequently, we'll need to rewrite the
repository format version and hash algorithm once we know what the
remote side has. Move the code that does this into its own function so
that we can call it from clone in the future.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For the foreseeable future, SHA-1 will be the default algorithm for Git.
However, when running the testsuite, we want to be able to test an
arbitrary algorithm. It would be quite burdensome and very untidy to
have to specify the algorithm we'd like to test every time we
initialized a new repository somewhere in the testsuite, so add an
environment variable to allow us to specify the default hash algorithm
for Git.
This has the benefit that we can set it once for the entire testsuite
and not have to think about it. In the future, users can also use it to
set the default for their repositories if they would like to do so.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow the user to specify the hash algorithm on the command line by
using the --object-format option to git init. Validate that the user is
not attempting to reinitialize a repository with a different hash
algorithm. Ensure that if we are writing a non-SHA-1 repository that we
set the repository version to 1 and write the objectFormat extension.
Restrict this option to work only when ENABLE_SHA256 is set until the
codebase is in a situation to fully support this.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases, we will want to not only check the repository format, but
extract the information that we've gained. To do so, allow
check_repository_format to take a pointer to struct repository_format.
Allow passing NULL for this argument if we're not interested in the
information, and pass NULL for all existing callers. A future patch
will make use of this information.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test currently hard-codes the hash algorithm as SHA-1 when calling
repo_set_hash_algo so that the_hash_algo is properly initialized.
However, this does not work with SHA-256 repositories. Read the
repository value that repo_init has read into the local repository
variable and set the algorithm based on that value.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repository helper is used in t5318 to read commit graphs whether
we're in a repository or not. However, without a repository, we have no
way to properly initialize the hash algorithm, meaning that the file is
misread.
Fix this by calling setup_git_directory_gently, which will read the
environment variable the testsuite sets to ensure that the correct hash
algorithm is set.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this test helper, we read the index. In order to have the proper
hash algorithm set up, we must call setup_git_directory. Do so, so that
the test works when extensions.objectFormat is set.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the git for-each-ref tests asks to sort by object ID. However,
when sorted, the order of the refs differs between SHA-1 and SHA-256.
Sort the expected output so that the test passes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we'll allow developers to run the testsuite with a hash
algorithm of their choice. To make this easier, compute the fixed
constants using test_oid. Move the constant initialization down below
the point where test-lib-functions.sh is loaded so the functions are
defined.
Note that we don't provide a value for the OID_REGEX value directly
because writing a large number of instances of "[0-9a-f]" in the
oid-info files is unwieldy and there isn't a way to compute it based on
those values. Instead, compute it based on ZERO_OID.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At this point, SHA-256 support is experimental and some behavior may
change. To avoid surprising unsuspecting users, require a build flag,
ENABLE_SHA256, to allow use of a non-SHA-1 algorithm. Enable this flag
by default when the DEVELOPER make flag is set so that contributors can
test this case adequately.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are some places where we need to parse a hex object ID in any
algorithm without knowing beforehand which algorithm is in use. An
example is when parsing fast-import marks.
Add a get_oid_hex_any to parse an object ID and return the algorithm it
belongs to, and additionally add parse_oid_hex_any which is the
equivalent change for parse_oid_hex. If the object is not parseable, we
return GIT_HASH_UNKNOWN.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce variants of get_oid_hex and parse_oid_hex that parse an
arbitrary hash algorithm, implementing internal functions to avoid
duplication. These functions can be used in the transport code to parse
refs properly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For all of our SHA-1 implementations and most of our SHA-256
implementations, the hash context we use is a real struct. For these
implementations, it's possible to copy a hash context by making a copy
of the struct.
However, for our libgcrypt implementation, our hash context is a
pointer. Consequently, copying it does not lead to an independent hash
context like we intended.
Fortunately, however, libgcrypt provides us with a handy function to
copy hash contexts. Let's add a cloning function to the hash algorithm
API, and use it in the one place we need to make a hash context copy.
With this change, our libgcrypt SHA-256 implementation is fully
functional with all of our other hash implementations.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid hard-coding a hash size, instead preferring to use the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An annotated tag has two names: where it sits in the refs/tags
hierarchy and the tagname recorded in the "tag" field in the object
itself. They usually should match.
Since 212945d4 ("Teach git-describe to verify annotated tag names
before output", 2008-02-28), a commit described using an annotated
tag bases its name on the tagname from the object. While this was a
deliberate design decision to make it easier to converse about tags
with others, even if the tags happen to be fetched to a different
name than it was given by its creator, it had one downside.
The output from "git describe", at least in the modern Git, should
be usable as an object name to name the exact commit given to the
"git describe" command. Using the tagname, when two names differ,
breaks this property, when describing a commit that is directly
pointed at by such a tag. An annotated tag Bob made as "v1.0" may
sit at "refs/tags/v1.0-bob" in the ref hierarchy, and output from
"git describe v1.0-bob^0" would say "v1.0", but there may not be
any tag at "refs/tags/v1.0" locally or there may be another tag that
points at a different object.
Note that this won't be a problem if a commit being described is not
directly pointed at by such a mislocated tag. In the example in the
previous paragraph, describing a commit whose parent is v1.0-bob
would result in "v1.0" (i.e. the tagname taken from the tag object)
followed by "-1-gXXXXX" where XXXXX is the abbreviated object name,
and a string that ends with "-g" followed by a hexadecimal string is
an object name for the object whose name begins with hexadecimal
string (as long as it is unique), so it does not matter if the
leading part is "v1.0" or "v1.0-bob".
Show the name in the long format, i.e. with "-0-gXXXXX" suffix, when
the name we give is based on a mislocated annotated tag to ensure
that the output can be used as the object name for the object
originally given to the command to fix the issue.
While at it, remove an overly cautious dead code to protect against
an annotated tag object without the tagname. Such a tag is filtered
out much earlier in the codeflow, and will not reach this part of
the code.
Helped-by: Matheus Tavares <matheus.bernardino@usp.br>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit fixed a bug with the (no submodule) -> (nested
submodules) transition for commands in the unpack-trees machinery.
Let's add a test for the reverse transition (going from nested
submodules to no submodule), as it is not being tested currently.
While at it, uniformize the capitalization in the list of tests.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using `git checkout --recurse-submodules` to switch between a
branch with no submodules and a branch with initialized nested
submodules currently causes a fatal error:
$ git checkout --recurse-submodules branch-with-nested-submodules
fatal: exec '--super-prefix=submodule/nested/': cd to 'nested'
failed: No such file or directory
error: Submodule 'nested' could not be updated.
error: Submodule 'submodule/nested' cannot checkout new HEAD.
error: Submodule 'submodule' could not be updated.
M submodule
Switched to branch 'branch-with-nested-submodules'
The checkout succeeds but the worktree and index of the first level
submodule are left empty:
$ cd submodule
$ git -c status.submoduleSummary=1 status
HEAD detached at b3ce885
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
deleted: .gitmodules
deleted: first.t
deleted: nested
fatal: not a git repository: 'nested/.git'
Submodule changes to be committed:
* nested 1e96f59...0000000:
$ git ls-files -s
$ # empty
$ ls -A
.git
The reason for the fatal error during the checkout is that a child git
process tries to cd into the yet unexisting nested submodule directory.
The sequence is the following:
1. The main git process (the one running in the superproject) eventually
reaches write_entry() in entry.c, which creates the first level
submodule directory and then calls submodule_move_head() in submodule.c,
which spawns `git read-tree` in the submodule directory.
2. The first child git process (the one in the submodule of the
superproject) eventually calls check_submodule_move_head() at
unpack_trees.c:2021, which calls submodule_move_head in dry-run mode,
which spawns `git read-tree` in the nested submodule directory.
3. The second child git process tries to chdir() in the yet unexisting
nested submodule directory in start_command() at run-command.c:829 and
dies before exec'ing.
The reason why check_submodule_move_head() is reached in the first child
and not in the main process is that it is inside an
if(submodule_from_ce()) construct, and submodule_from_ce() returns a
valid struct submodule pointer, whereas it returns a null pointer in the
main git process.
The reason why submodule_from_ce() returns a null pointer in the main
git process is because the call to cache_lookup_path() in config_from()
(called from submodule_from_path() in submodule_from_ce()) returns a
null pointer since the hashmap "for_path" in the submodule_cache of
the_repository is not yet populated. It is not populated because both
repo_get_oid(repo, GITMODULES_INDEX, &oid) and repo_get_oid(repo,
GITMODULES_HEAD, &oid) in config_from_gitmodules() at
submodule-config.c:639-640 return -1, as at this stage of the operation,
neither the HEAD of the superproject nor its index contain any
.gitmodules file.
In contrast, in the first child the hashmap is populated because
repo_get_oid(repo, GITMODULES_HEAD, &oid) returns 0 as the HEAD of the
first level submodule, i.e. .git/modules/submodule/HEAD, points to a
commit where .gitmodules is present and records 'nested' as a submodule.
Fix this bug by checking that the submodule directory exists before
calling check_submodule_move_head() in merged_entry() in the `if(!old)`
branch, i.e. if going from a commit with no submodule to a commit with a
submodule present.
Also protect the other call to check_submodule_move_head() in
merged_entry() the same way as it is safer, even though the `else if
(!(old->ce_flags & CE_CONFLICTED))` branch of the code is not at play in
the present bug.
The other calls to check_submodule_move_head() in other functions in
unpack_trees.c are all already protected by calls to lstat() somewhere
in
the program flow so we don't need additional protection for them.
All commands in the unpack_trees machinery are affected, i.e. checkout,
reset and read-tree when called with the --recurse-submodules flag.
This bug was first reported in [1].
[1]
https://lore.kernel.org/git/7437BB59-4605-48EC-B05E-E2BDB2D9DABC@gmail.com/
Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Reported-by: Damien Robert <damien.olivier.robert@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function verify_clean_submodule() learned to verify if a submodule
working tree is clean in a7bc845a9a (unpack-trees: check if we can
perform the operation for submodules, 2017-03-14), but the commented
description above it was not updated to reflect that, such that this
description has been outdated since then.
Since Git has now learned to optionnally recursively check out
submodules during a superproject checkout, remove this outdated
description.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test "$command: submodule branch is not changed, detach HEAD
instead" is in the "Appearing submodule" section of
test_submodule_recursing_with_args_common(), but this test updates a
submodule; it does not test a transition from a state with no submodule
to a state with a submodule.
As such, for consistency, move it to the "Modified submodule" section of
the same function. While at it, add a comment describing the test.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commands in the unpack_trees machinery (checkout, reset, read-tree)
were fixed in 218c883783 (submodule: properly recurse for read-tree and
checkout, 2017-05-02) to correctly update nested submodules when called
with the `--recurse-submodules` flag.
However, a comment in t/lib-submodule-update.sh mentions that this use
case still doesn't work.
Remove this outdated comment.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The known failure mode KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED was
removed from lib-submodule-update.sh in 218c883783 (submodule: properly
recurse for read-tree and checkout, 2017-05-02) but at that time this
change was not ported over to topic sb/reset-recurse-submodules, such
that when this topic was merged in 5f074ca7e8 (Merge branch
'sb/reset-recurse-submodules', 2017-05-29), t7112-reset-submodules.sh
kept a mention of this removed failure mode.
Remove it now, as it does not mean anything anymore.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d9c6469 (git-gui: update status bar to track operations, 2019-12-01),
the status bar was refactored to allow multiple overlapping operations.
Since the refactor changed the status bar interface, all callsites had
to be refactored to use the new interface. During that refactoring, this
closing bracket was missed. This leads to an error message popping up
when doing 'Branch->Reset...'.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Update the German translation and extend glossary.
* cs/german-translation:
git-gui: update German translation
git-gui: extend translation glossary template with more terms
git-gui: update pot template and German translation to current source code
Update German translation (glossary and final translation) with
recent additions, but also switch several terms from uncommon
translations back to English vocabulary.
This most prominently concerns "commit" (noun, verb), "repository",
"branch", and some more. These uncommon translations have been introduced
long ago and never been changed since. In fact, the whole German
translation here hasn't been touched for a long time. However, in German
literature and magazines, git-gui is regularly noted for its uncommon
choice of translated vocabulary. This somewhat distracts from the actual
benefits of this tool. So it is probably better to abandon the uncommon
translations and rather stick to the common English vocabulary in git
version control.
Signed-off-by: Christian Stimming <christian@cstimming.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
The English glossary template was missing some terms, some of them
not only for git-gui, but also gitk and/or git core. Many such terms
have been added.
Also, the list has been sorted alphabetically so that comparison to
other glossary lists are easier.
Signed-off-by: Christian Stimming <christian@cstimming.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
No content changes so far, only the preparation for subsequent edits.
Signed-off-by: Christian Stimming <christian@cstimming.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
During the p4 submit process, git-p4 will attempt to apply a patch
to the files found in the p4 workspace. However, if P4 uses RCS
keyword expansion, this patch may fail.
When the patch fails, the user is alerted to the failure and that git-p4
will attempt to clear the expanded text from the files and re-apply
the patch. The current version of git-p4 does not tell the user the
result of the re-apply attempt after the RCS expansion has been removed
which can be confusing.
Add a new print statement after the git patch has been successfully
applied when the RCS keywords have been cleansed.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git command "commit" supports a number of hooks that support
changing the behavior of the commit command. The git-p4.py program only
has one existing hook, "p4-pre-submit". This command occurs early in
the process. There are no hooks in the process flow for modifying
the P4 changelist text programmatically.
Adds 3 new hooks to git-p4.py to the submit option.
The new hooks are:
* p4-prepare-changelist - Execute this hook after the changelist file
has been created. The hook will be executed even if the
--prepare-p4-only option is set. This hook ignores the --no-verify
option in keeping with the existing behavior of git commit.
* p4-changelist - Execute this hook after the user has edited the
changelist. Do not execute this hook if the user has selected the
--prepare-p4-only option. This hook will honor the --no-verify,
following the conventions of git commit.
* p4-post-changelist - Execute this hook after the P4 submission process
has completed successfully. This hook takes no parameters and is
executed regardless of the --no-verify option. It's return value will
not be checked.
The calls to the new hooks: p4-prepare-changelist, p4-changelist,
and p4-post-changelist should all be called inside the try-finally
block.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for adding new hooks to the submit method of git-p4,
restructure the applyCommit function in the P4Submit class. Make the
following changes:
* Move all the code after the definition of submitted = False into the
Try-Finally block. This ensures that any error that occurs will
properly recover. This is not necessary with the current code because
none of it should throw an exception, however the next set of changes
will introduce exceptional code.
Existing flow control can remain as defined - the if-block for
prepare-p4-only sets the local variable "submitted" to True and exits
the function. New early exits, leave submitted set to False so the
Finally block will undo changes to the P4 workspace.
* Make the small usability change of adding an empty string to the
print statements displayed to the user when the prepare-p4-only option
is selected. On Windows, the command print() may display a set of
parentheses "()" to the user when the print() function is called with
no parameters. By supplying an empty string, the intended blank line
will print as expected.
* Fix a small bug when the submittedTemplate is edited by the user
and all content in the file is removed. The existing code will throw
an exception if the separateLine is not found in the file. Change this
code to test for the separator line using a find() test first and only
split on the separator if it is found.
* Additionally, add the new behavior that if the changelist file has
been completely emptied that the Submit action for this changelist
will be aborted.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new command line option --no-verify:
Add a new command line option "--no-verify" to the Submit command of
git-p4.py. This option will function in the spirit of the existing
--no-verify command line option found in git commit. It will cause the
P4 Submit function to ignore the existing p4-pre-submit.
Change the execution of the existing trigger p4-pre-submit to honor the
--no-verify option. Before exiting on failure of this hook, display
text to the user explaining which hook has failed and the impact
of using the --no-verify option.
Change the call of the p4-pre-submit hook to use the new run_git_hook
function. This is in preparation of additional hooks to be added.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the p4-pre-submit exits with a non-zero exit code, the application
will abort the process with no additional information presented to the
user. This can be confusing for new users as it may not be clear that
the p4-pre-submit action caused the error.
Add text to explain why the program aborted the submit action.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is in preparation of introducing new p4 submit hooks.
The current code in the python script git-p4.py makes the assumption
that the git hooks can be executed by subprocess.call() function.
However, when git is run on Windows, this may not work as expected.
The subprocess.call() does not cover all the use cases for properly
executing the various types of executable files on Windows.
Prepare for remediation by adding a new function, run_git_hook, that
takes 2 parameters:
* the short filename of an optionally registered git hook
* an optional list of parameters
The run_git_hook function will honor the existing behavior seen in the
current code for executing the p4-pre-submit hook:
* Hooks are looked for in core.hooksPath directory.
* If core.hooksPath is not set, then the current .git/hooks directory
is checked.
* If the hook does not exist, the function returns True.
* If the hook file is not accessible, the function returns True.
* If the hook returns a zero exit code when executed, the function
return True.
* If the hook returns a non-zero exit code, the function returns False.
Add the following additional functionality if git-p4.py is run on
Windows.
* If hook file is not located without an extension, search for
any file in the associated hook directory (from the list above) that
has the same name but with an extension.
* If the file is still not found, return True (the hook is missing)
Add a new function run_hook_command() that wraps the OS dependent
functionality for actually running the subprocess.call() with OS
dependent behavior:
If a hook file exists on Windows:
* If there is no extension, set the launch executable to be SH.EXE
- Look for SH.EXE under the environmental variable EXEPATH in the
bin/ directory.
- If %EXEPATH%/bin/sh.exe exists, use this as the actual executable.
- If %EXEPATH%/bin/sh.exe does not exist, use sh.exe
- Execute subprocess.call() without the shell (shell=False)
* If there is an extension, execute subprocess.call() with teh shell
(shell=True) and consider the file to be the executable.
The return value from run_hook_command() is the subprocess.call()
return value.
These functions are added in this commit, but are only staged and not
yet used.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing function prompt(prompt_text) does not work correctly when
run on Windows 10 bash terminal when launched from the sourcetree
GUI application. The stdout is not flushed properly so the prompt text
is not displayed to the user until the next flush of stdout, which is
quite confusing.
Change this method by:
* Adding flush to stderr, stdout, and stdin
* Use readline from sys.stdin instead of raw_input.
The existing strip().lower() are retained.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase --fork-point master" used to work OK, as it internally
called "git merge-base --fork-point" that knew how to handle short
refname and dwim it to the full refname before calling the
underlying get_fork_point() function.
This is no longer true after the command was rewritten in C, as its
internall call made directly to get_fork_point() does not dwim a
short ref.
Move the "dwim the refname argument to the full refname" logic that
is used in "git merge-base" to the underlying get_fork_point()
function, so that the other caller of the function in the
implementation of "git rebase" behaves the same way to fix this
regression.
Signed-off-by: Alex Torok <alext9@gmail.com>
[jc: revamped the fix and used Alex's tests]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python3 deprecates raw_input() from 2.7 and replaced it with input().
Since we do not need 2.7's input() semantics, `raw_input()` is aliased
to `input()` for easy forward compatability.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not clear why a generator was used to create the regex used to
parse git-diff-tree output; I assume an early implementation required
it, but is not part of the mainline change.
Simply use a lazily initialized global instead.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python3 uses dict.items() instead of .iteritems() to provide
iteratoration over dict. Although items() is technically less efficient
for python2.7 (allocates a new list instead of simply iterating), the
amount of data involved is very small and the penalty negligible.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For python3, reduce() has been moved to functools.reduce(). This is
also available in python2.7.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As part of its importing process, git-p4 sends a `checkpoint` followed
immediately by `progress` to fast-import to force synchronization.
Due to buffering, it is possible for the `progress` command to not be
flushed before git-p4 begins to wait for the corresponding response.
This causes the script to freeze completely, and is consistently
observable at least on python-3.6.9.
Make sure this command sequence is completely flushed before waiting.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
p4 does not appear to understand marshal format version 3 and above.
Version 2 was the latest supported by python-2.7.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Opening .gitp4-usercache.txt in text mode makes python 3 happy without
explicitly adding encoding and decoding.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
P4 allows essentially arbitrary encoding for path data while we would
perfer to be dealing only with unicode strings. Since path data need to
survive round-trip back to p4, this patch implements the general policy
that we store path data as-is, but decode them to unicode before doing
any non-trivial processing.
A new `decode_path()` method is provided that generally does the correct
conversion, taking into account `git-p4.pathEncoding` configuration.
For python2.7, path strings will be left as-is if it only contains ASCII
characters.
For python3, decoding is always done so that we have str objects.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Under python3, calls to write() on the stream to `git fast-import` must
be encoded. This patch wraps the IO object such that this encoding is
done transparently.
Conversely, any text data read from subprocesses must also be decoded
before running through the rest of the pipeline.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The marshalled dict in the response given on STDOUT by p4 uses `str` for
keys and string values. When run using python3, these values are
deserialized as `bytes`, leading to a whole host of problems as the rest
of the code assumes `str` is used throughout.
This patch changes the deserialization behaviour such that, as much as
possible, text output from p4 is decoded to native unicode strings.
Exceptions are made for the field `data` as it is usually arbitrary
binary data. `depotFile[0-9]*`, `path`, and `clientFile` are also exempt
as they contain path strings not encoded with UTF-8, and must survive
round-trip back to p4.
Conversely, text data being piped to p4 must always be encoded when
running under python3.
encode_text_stream() and decode_text_stream() were added to make these
transformations more convenient.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that python2.7 is the minimum required version and we no longer use
the basestring type, it is not necessary to use type aliasing to ensure
python3 compatibility.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python 3 handles strings differently than Python 2.7. Since Python 2
is reaching it's end of life, a series of changes are being submitted to
enable python 3.5 and following support. The current code fails basic
tests under python 3.5.
Some codepaths can represent a command line the program
internally prepares to execute either as a single string
(i.e. each token properly quoted, concatenated with $IFS) or
as a list of argv[] elements, and there are 9 places where
we say "if X is isinstance(_, basestring), then do this
thing to handle X as a command line in a single string; if
not, X is a command line in a list form".
This does not work well with Python 3, as there is no
basestring (everything is Unicode now), and even with Python
2, it was not an ideal way to tell the two cases apart,
because an internally formed command line could have been in
a single Unicode string.
Flip the check to say "if X is not a list, then handle X as
a command line in a single string; otherwise treat it as a
command line in a list form".
This will get rid of references to 'basestring', to migrate
the code ready for Python 3.
Thanks-to: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python2.6 and earlier have been end-of-life'd for many years now, and
we actually already use 2.7-only features in the code. Make the version
check reflect current realities.
This also removes the need to explicitly define CalledProcessError if
it's not available.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In dcb500dc16 (cherry-pick/revert: advise using --skip, 2019-07-02),
`git commit` learned to suggest to run `git cherry-pick --skip` when
trying to cherry-pick an empty patch.
However, it was overlooked that there are more conditions than just a
`git cherry-pick` when this advice is printed (which originally
suggested the neutral `git reset`): the same can happen during a rebase.
Let's suggest the correct command, even during a rebase.
While at it, we adjust more places in `builtin/commit.c` that
incorrectly assumed that the presence of a `CHERRY_PICK_HEAD` meant that
surely this must be a `cherry-pick` in progress.
Note: we take pains to handle the situation when a user runs a `git
cherry-pick` _during_ a rebase. This is quite valid (e.g. in an `exec`
line in an interactive rebase). On the other hand, it is not possible to
run a rebase during a cherry-pick, meaning: if both `rebase-merge/` and
`sequencer/` exist or CHERRY_PICK_HEAD and REBASE_HEAD point to the same
commit , we still want to advise to use `git cherry-pick --skip`.
Original-patch-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Working out which command wants to create a commit requires detailed
knowledge of the sequencer internals and that knowledge is going to
increase in subsequent commits. With that in mind lets encapsulate that
knowledge in sequencer.c rather than spreading it into builtin/commit.c.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add FROM_CHERRY_PICK_MULTI for a sequence of cherry-picks rather than
using a separate variable.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git commit` relies on the presence of CHERRY_PICK_HEAD to show the
correct error message in the case of an empty pick. This fixes a
regression introduced by the conversion from shell to C. In the shell
version everything was a cherry-pick as far as the sequencer code was
concerned so it always wrote CHERRY_PICK_HEAD. The conversion to C
forgot to update the code that creates CHERRY_PICK_HEAD. We do not want
to create CHERRY_PICK_HEAD for fixup and squash commands as that would
prevent `git commit --amend` from running.
Note that the error message shown by `git commit` for an empty pick
during a rebase is currently wrong as it talks about running `git
cherry-pick --skip` rather than `git rebase --skip`. This will be fixed
in a future commit which is why the tests are in t3403-rebase-skip.sh.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We disallow partial commits and amending when CHERRY_PICK_HEAD
exists. Add a couple of tests to check the error message printed in each
case before we refactor the code responsible for this.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In dcb500dc16 (cherry-pick/revert: advise using --skip, 2019-07-02),
`git commit` learned to suggest to run `git cherry-pick --skip` when
trying to cherry-pick an empty patch, but that was never tested for.
Here is a test that verifies that a message is given to the user that
contains the correct invocation.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a number of places where we compare two revisions with
test $(git rev-parse rev1) = $(git rev-parse rev2)
when these fail there's no indication what has gone wrong and you need
to be running with `-x` to see where the test has failed. Lets use
test_cmp_rev instead.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
@ -271,7 +271,8 @@ will not be checked out by default; You can instruct 'clone' to recurse
into submodules. The 'init' and 'update' subcommands of 'git submodule'
will maintain submodules checked out and at an appropriate revision in
your working tree. Alternatively you can set 'submodule.recurse' to have
'checkout' recursing into submodules.
'checkout' recursing into submodules (note that 'submodule.recurse' also
affects other git commands, see linkgit:git-config[1] for a complete list).
SEE ALSO
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.