Partially revert the support for multiple hash functions to regain
hash comparison performance; we'd think of a way to do this better
in the next cycle.
* jk/hashcmp-optim-for-2.19:
hashcmp: assert constant hash size
Test fixes.
* sg/test-must-be-empty:
tests: use 'test_must_be_empty' instead of 'test_cmp <empty> <out>'
tests: use 'test_must_be_empty' instead of 'test_cmp /dev/null <out>'
tests: use 'test_must_be_empty' instead of 'test ! -s'
tests: use 'test_must_be_empty' instead of '! test -s'
"git cmd -h" updates.
* rs/opt-updates:
parseopt: group literal string alternatives in argument help
remote: improve argument help for add --mirror
checkout-index: improve argument help for --stage
"git help --config" (which is used in command line completion)
missed the configuration variables not described in the main
config.txt file but are described in another file that is included
by it, which has been corrected.
* nd/complete-config-vars:
generate-cmdlist.sh: collect config from all config.txt files
"git branch --list" learned to take the default sort order from the
'branch.sort' configuration variable, just like "git tag --list"
pays attention to 'tag.sort'.
* sm/branch-sort-config:
branch: support configuring --sort via .gitconfig
The meaning of the possible values the "core.checkStat"
configuration variable can take were not adequately documented,
which has been fixed.
* nd/config-core-checkstat-doc:
config.txt: clarify core.checkStat
275267937b (range-diff: make dual-color the default mode, 2018-08-13)
replaced --dual-color with --no-dual-color but left the option's
summary untouched. Rewrite the summary to describe --no-dual-color
rather than dual-color.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix typos and convert a question which does not expect to be replied
to a simple advice.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core.commitGraph config setting was accidentally removed from
the config documentation. In that same patch, the config setting
that writes a commit-graph during garbage collection was incorrectly
written to the doc as "gc.commitGraph" instead of "gc.writeCommitGraph".
Reported-by: Szeder Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The verbose output of the test 'reword without issues functions as
intended' in 't3423-rebase-reword.sh', added in a9279c6785 (sequencer:
do not squash 'reword' commits when we hit conflicts, 2018-06-19),
contains the following error output:
sed: -e expression #1, char 2: extra characters after command
This error comes from within the 'fake-editor.sh' script created by
'lib-rebase.sh's set_fake_editor() function, and the root cause is the
FAKE_LINES="pick 1 reword 2" variable in the test in question, in
particular the "pick" word. 'fake-editor.sh' assumes 'pick' to be the
default rebase command and doesn't support an explicit 'pick' command
in FAKE_LINES. As a result, 'pick' will be used instead of a line
number when assembling the following 'sed' script:
sed -n picks/^pick/pick/p
which triggers the aforementioned error.
Luckily, this didn't affect the test's correctness: the erroring 'sed'
command doesn't write anything to the todo script, and processing the
rest of FAKE_LINES generates the desired todo script, as if that
'pick' command were not there at all.
The minimal fix would be to remove the 'pick' word from FAKE_LINES,
but that would leave us susceptible to similar issues in the future.
Instead, teach the fake-editor script to recognize an explicit 'pick'
command, which is still a fairly trivial change.
In the future we might want to consider reinforcing this fake editor
script with an &&-chain and stricter parsing of the FAKE_LINES
variable (e.g. to error out when encountering unknown rebase commands
or commands and line numbers in the wrong order).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to 509f6f62a4 (cache: update object ID functions for
the_hash_algo, 2018-07-16), hashcmp() called memcmp() with a
constant size of 20 bytes. Some compilers were able to turn
that into a few quad-word comparisons, which is faster than
actually calling memcmp().
In 509f6f62a4, we started using the_hash_algo->rawsz
instead. Even though this will always be 20, the compiler
doesn't know that while inlining hashcmp() and ends up just
generating a call to memcmp().
Eventually we'll have to deal with multiple hash sizes, but
for the upcoming v2.19, we can restore some of the original
performance by asserting on the size. That gives the
compiler enough information to know that the memcmp will
always be called with a length of 20, and it performs the
same optimization.
Here are numbers for p0001.2 run against linux.git on a few
versions. This is using -O2 with gcc 8.2.0.
Test v2.18.0 v2.19.0-rc0 HEAD
------------------------------------------------------------------------------
0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) -1.0%
You can see that v2.19 is a little slower than v2.18. This
commit ended up slightly faster than v2.18, but there's a
fair bit of run-to-run noise (the generated code in the two
cases is basically the same). This patch does seem to be
consistently 1-2% faster than v2.19.
I tried changing hashcpy(), which was also touched by
509f6f62a4, in the same way, but couldn't measure any
speedup. Which makes sense, at least for this workload. A
traversal of the whole commit graph requires looking up
every entry of every tree via lookup_object(). That's many
multiples of the numbers of objects in the repository (most
of the lookups just return "yes, we already saw that
object").
[jn: verified using "make object.s" that the memcmp call goes away.]
Reported-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Jeff King <peff@peff.net
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several tests in 't3420-rebase-autostash.sh' start various rebase
processes that are expected to fail because of merge conflicts. These
tests then run '! grep' to ensure that the autostash feature did its
job, and the dirty contents of a file is gone. However, due to the
test repo's history and the choice of upstream branch that file
shouldn't exist in the conflicted state at all. Consequently, this
'grep' doesn't fail as expected, because it can't find the dirty
content, but it fails because it can't open the file.
Tighten this check by using 'test_path_is_missing' instead, thereby
avoiding unexpected errors from 'grep' as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test 'store updates stash ref and reflog' in 't3903-stash.sh'
creates a stash from a new file, runs 'git reset --hard' to throw away
any modifications to the work tree, and then runs '! grep' to ensure
that the staged contents are gone. Since the file didn't exist
before, it shouldn't exist after 'git reset' either. Consequently,
this 'grep' doesn't fail as expected, because it can't find the staged
content, but it fails because it can't open the file.
Tighten this check by using 'test_path_is_missing' instead, thereby
avoiding an unexpected error from 'grep' as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a recent update in 2.18 era, "git pack-objects" started
producing a larger than necessary packfiles by missing
opportunities to use large deltas.
* nd/pack-deltify-regression-fix:
pack-objects: fix performance issues on packing large deltas
Prior to d3c6751b18 (tests: make use of the test_must_be_empty
function, 2018-07-27), in the test 'rev-list should succeed with empty
output on empty stdin' in 't6018-rev-list-glob' the empty 'expect'
file served dual purpose: besides specifying the expected output, as
usual, it also served as empty input for 'git rev-list --stdin'.
Then d3c6751b18 came along, and, as part of the conversion to
'test_must_be_empty', removed this empty 'expect' file, not realizing
its secondary purpose. Redirecting stdin from the now non-existing
file failed the test, but since this test expects failure in the first
place, this issue went unnoticed.
Redirect 'git rev-list's stdin explicitly from /dev/null to provide
empty input. (Strictly speaking we don't need this redirection,
because the test script's stdin is already redirected from /dev/null
anyway, but I think it's better to be explicit about it.)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test ' context does not include preceding empty lines' in the
block of tests 'change with long common tail and no context' in
't4051-diff-function-context.sh' tries to read the file
'long_common_tail.diff.diff', but that file doesn't exist as its name
contains one more '.diff' suffixes than necessary.
Despite this error the test still succeeded without checking what it's
supposed to, because this erroneous read is done on the line:
test "$(first_context_line <long_common_tail.diff.diff)" != " "
which means that:
- the command substitution hides the error, so it won't fail the
test, and
- the result of the command substitution is the empty string, which
is, of course, not equal to a single space character, so the
condition is fulfilled, and the test succeeds.
As a minimal fix, fix the name of the file to be read.
In the future we might want to reorganize this test script (1) to use
'test_cmp' instead of 'test's and command substitutions to catch
failing commands and to provide helpful error messages, and (2) to
specify what the expected result actually _is_ instead of what it
isn't.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the test 'checkout with autocrlf=input' in 't0020-crlf.sh', one of
the 'has_cr' checks looks at the non-existing file 'two' instead of
'dir/two'. The test still succeeds, without actually checking what it
was supposed to, because this check is expected to fail anyway.
As a minimal fix, fix the name of the file to be checked.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test '--dry-run with conflicts fixed from a merge' in
't7501-commit.sh', added in 8dc874b2ee (wt-status.c: set commitable
bit if there is a meaningful merge., 2016-02-15), runs the following
unnecessary and downright bogus command substitution:
! $(git merge --no-commit commit-1) &&
I.e. after 'git merge ...' is executed and expectedly fails, the test
attempts to execute its output:
Merging:
80f2ea2 commit 2
virtual commit-1
found 1 common ancestor:
e60d113 Initial commit
Auto-merging test-file
CONFLICT (content): Merge conflict in test-file
Automatic merge failed; fix conflicts and then commit the result.
as a command, which most likely fails, because there is no such
command as "Merging:". Then '!' negates the failed exit status, the
test continues, and eventually succeeds.
Remove this command substitution and use 'test_must_fail' to ensure
that 'git merge' fails.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The author_date_slab is used to store the author date of a commit
when walking with the --author-date flag in rev-list or log. This
was added as an 'unsigned long' in
81c6b38b "log: --author-date-order"
Since 'unsigned long' is ambiguous in its bit-ness across platforms
(64-bit in Linux, 32-bit in Windows, for example), most references
to the author dates in commit.c were converted to timestamp_t in
dddbad72 "timestamp_t: a new data type for timestamps"
However, the slab definition was missed, leading to a mismatch in
the data types in Windows. This would not reveal itself as a bug
unless someone authors a commit after February 2106, but commits
can store anything as their author date.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test-tool programs include "test-tool.h" as their first
include, which breaks our CodingGuideline of "the first
include must be git-compat-util.h or an equivalent".
Rather than change them all, let's instead make test-tool.h
one of those equivalents, just like we do for builtin.h
(which many of the actual git builtins include first).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using 'test_must_be_empty' is shorter and more idiomatic than
>empty &&
test_cmp empty out
as it saves the creation of an empty file. Furthermore, sometimes the
expected empty file doesn't have such a descriptive name like 'empty',
and its creation is far away from the place where it's finally used
for comparison (e.g. in 't7600-merge.sh', where two expected empty
files are created in the 'setup' test, but are used only about 500
lines later).
These cases were found by instrumenting 'test_cmp' to error out the
test script when it's used to compare empty files, and then converted
manually.
Note that even after this patch there still remain a lot of cases
where we use 'test_cmp' to check empty files:
- Sometimes the expected output is not hard-coded in the test, but
'test_cmp' is used to ensure that two similar git commands produce
the same output, and that output happens to be empty, e.g. the
test 'submodule update --merge - ignores --merge for new
submodules' in 't7406-submodule-update.sh'.
- Repetitive common tasks, including preparing the expected results
and running 'test_cmp', are often extracted into a helper
function, and some of this helper's callsites expect no output.
- For the same reason as above, the whole 'test_expect_success'
block is within a helper function, e.g. in 't3070-wildmatch.sh'.
- Or 'test_cmp' is invoked in a loop, e.g. the test 'cvs update
(-p)' in 't9400-git-cvsserver-server.sh'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using 'test_must_be_empty' is more idiomatic than 'test_cmp /dev/null
out', and its message on error is perhaps a bit more to the point.
This patch was basically created by running:
sed -i -e 's%test_cmp /dev/null%test_must_be_empty%' t[0-9]*.sh
with the exception of the change in 'should not fail in an empty repo'
in 't7401-submodule-summary.sh', where it was 'test_cmp output
/dev/null'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using 'test_must_be_empty' is preferable to 'test ! -s', because it
gives a helpful error message if the given file is unexpectedly no
empty, while the latter remains completely silent. Furthermore, it
also catches cases when the given file unexpectedly does not exist at
all.
This patch was created by:
sed -i -e 's/test ! -s/test_must_be_empty/' t[0-9]*.sh
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using 'test_must_be_empty' is preferable to '! test -s', because it
gives a helpful error message if the given file is unexpectedly not
empty, while the latter remains completely silent. Furthermore, it
also catches cases when the given file unexpectedly does not exist at
all.
This patch was basically created by:
sed -i -e 's/! test -s/test_must_be_empty/' t[0-9]*.sh
with the following notable exceptions:
- The '! test -s' check in '.gitmodules ignore=dirty suppresses
submodules with untracked content' in 't7508-status.sh' is left
as-is, because it's bogus and, therefore, it's subject of a
dedicated patch.
- The '! test -s' checks in 't9131-git-svn-empty-symlink.sh' and
't9135-git-svn-moved-branch-empty-file.sh' are immediately
preceeded by a 'test -f' to ensure that the files exist in the
first place. 'test_must_be_empty' ensures that as well, so those
'test -f' commands are removed as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This formally clarifies that the "--option=" part is the same for all
alternatives.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Group the possible values using a pair of parentheses and don't mark
them for translation, as they are literal strings that have to be used
as-is in any locale.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Spell out all alternatives and avoid using a numerical range operator,
as it is not mentioned in CodingGuidelines and the resulting string is
still concise. Wrap them in parentheses to document clearly that the
"--stage=" part is common among them.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This script uses Documentation/config.txt as input for "git help
--config" and "git config" completion but it misses the fact that
config.txt includes other txt files. Include all *config.txt as input
when scanning for config keys. This could produce false positives, but
as long as we stick to the blah-config.txt naming convention, we
should be ok.
While at there, move diff.* from config.txt to diff-config.txt where
all other diff config keys are.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sideband code learned to optionally paint selected keywords at
the beginning of incoming lines on the receiving end.
* hn/highlight-sideband-keywords:
sideband: do not read beyond the end of input
sideband: highlight keywords in remote sideband output
"git cherry-pick --quit" failed to remove CHERRY_PICK_HEAD even
though we won't be in a cherry-pick session after it returns, which
has been corrected.
* nd/cherry-pick-quit-fix:
cherry-pick: fix --quit not deleting CHERRY_PICK_HEAD
A few preliminary minor clean-ups in the area around submodules.
* sb/submodule-cleanup:
builtin/submodule--helper: remove stray new line
t7410: update to new style
"git rebase -i", when a 'merge <branch>' insn in its todo list
fails, segfaulted, which has been (minimally) corrected.
* pw/rebase-i-merge-segv-fix:
rebase -i: fix SIGSEGV when 'merge <branch>' fails
t3430: add conflicting commit
When "git rebase -i" is told to squash two or more commits into
one, it labeled the log message for each commit with its number.
It correctly called the first one "1st commit", but the next one
was "commit #1", which was off-by-one. This has been corrected.
* pw/rebase-i-squash-number-fix:
rebase -i: fix numbering in squash message
Recent update to "git config" broke updating variable in a
subsection, which has been corrected.
* sb/config-write-fix:
git-config: document accidental multi-line setting in deprecated syntax
config: fix case sensitive subsection names on writing
t1300: document current behavior of setting options
Code hygiene improvement for the header files.
* en/incl-forward-decl:
Remove forward declaration of an enum
compat/precompose_utf8.h: use more common include guard style
urlmatch.h: fix include guard
Move definition of enum branch_track from cache.h to branch.h
alloc: make allocate_alloc_state and clear_alloc_state more consistent
Add missing includes and forward declarations
After a partial clone, repeated fetches from promisor remote would
have accumulated many packfiles marked with .promisor bit without
getting them coalesced into fewer packfiles, hurting performance.
"git repack" now learned to repack them.
* jt/repack-promisor-packs:
repack: repack promisor objects if -a or -A is set
repack: refactor setup of pack-objects cmd
A test prerequisite defined by various test scripts with slightly
different semantics has been consolidated into a single copy and
made into a lazily defined one.
* wc/make-funnynames-shared-lazy-prereq:
t: factor out FUNNYNAMES as shared lazy prereq
"git pull --rebase -v" in a repository with a submodule barfed as
an intermediate process did not understand what "-v(erbose)" flag
meant, which has been fixed.
* sb/pull-rebase-submodule:
git-submodule.sh: accept verbose flag in cmd_update to be non-quiet
"git tbdiff" that lets us compare individual patches in two
iterations of a topic has been rewritten and made into a built-in
command.
* js/range-diff: (21 commits)
range-diff: use dim/bold cues to improve dual color mode
range-diff: make --dual-color the default mode
range-diff: left-pad patch numbers
completion: support `git range-diff`
range-diff: populate the man page
range-diff --dual-color: skip white-space warnings
range-diff: offer to dual-color the diffs
diff: add an internal option to dual-color diffs of diffs
color: add the meta color GIT_COLOR_REVERSE
range-diff: use color for the commit pairs
range-diff: add tests
range-diff: do not show "function names" in hunk headers
range-diff: adjust the output of the commit pairs
range-diff: suppress the diff headers
range-diff: indent the diffs just like tbdiff
range-diff: right-trim commit messages
range-diff: also show the diff between patches
range-diff: improve the order of the shown commits
range-diff: first rudimentary implementation
Introduce `range-diff` to compare iterations of a topic branch
...
The more library-ish parts of the codebase learned to work on the
in-core index-state instance that is passed in by their callers,
instead of always working on the singleton "the_index" instance.
* nd/no-the-index: (24 commits)
blame.c: remove implicit dependency on the_index
apply.c: remove implicit dependency on the_index
apply.c: make init_apply_state() take a struct repository
apply.c: pass struct apply_state to more functions
resolve-undo.c: use the right index instead of the_index
archive-*.c: use the right repository
archive.c: avoid access to the_index
grep: use the right index instead of the_index
attr: remove index from git_attr_set_direction()
entry.c: use the right index instead of the_index
submodule.c: use the right index instead of the_index
pathspec.c: use the right index instead of the_index
unpack-trees: avoid the_index in verify_absent()
unpack-trees: convert clear_ce_flags* to avoid the_index
unpack-trees: don't shadow global var the_index
unpack-trees: add a note about path invalidation
unpack-trees: remove 'extern' on function declaration
ls-files: correct index argument to get_convert_attr_ascii()
preload-index.c: use the right index instead of the_index
dir.c: remove an implicit dependency on the_index in pathspec code
...
Improve built-in facility to catch broken &&-chain in the tests.
* es/chain-lint-more:
chainlint: add test of pathological case which triggered false positive
chainlint: recognize multi-line quoted strings more robustly
chainlint: let here-doc and multi-line string commence on same line
chainlint: recognize multi-line $(...) when command cuddled with "$("
chainlint: match 'quoted' here-doc tags
chainlint: match arbitrary here-docs tags rather than hard-coded names
Among the three codepaths we use O_APPEND to open a file for
appending, one used for writing GIT_TRACE output requires O_APPEND
implementation that behaves sensibly when multiple processes are
writing to the same file. POSIX emulation used in the Windows port
has been updated to improve in this area.
* js/mingw-o-append:
mingw: enable atomic O_APPEND
The API to iterate over all objects learned to optionally list
objects in the order they appear in packfiles, which helps locality
of access if the caller accesses these objects while as objects are
enumerated.
* jk/for-each-object-iteration:
for_each_*_object: move declarations to object-store.h
cat-file: use a single strbuf for all output
cat-file: split batch "buf" into two variables
cat-file: use oidset check-and-insert
cat-file: support "unordered" output for --batch-all-objects
cat-file: rename batch_{loose,packed}_object callbacks
t1006: test cat-file --batch-all-objects with duplicates
for_each_packed_object: support iterating in pack-order
for_each_*_object: give more comprehensive docstrings
for_each_*_object: take flag arguments as enum
for_each_*_object: store flag definitions in a single location
"git mergetool" stopped and gave an extra prompt to continue after
the last path has been handled, which did not make much sense.
* ng/mergetool-lose-final-prompt:
mergetool: don't suggest to continue after last file
"git verify-tag" and "git verify-commit" have been taught to use
the exit status of underlying "gpg --verify" to signal bad or
untrusted signature they found.
* jc/gpg-status:
gpg-interface: propagate exit status from gpg back to the callers
"git instaweb" has been adjusted to run better with newer Apache on
RedHat based distros.
* sk/instaweb-rh-update:
git-instaweb: fix apache2 config with apache >= 2.4
git-instaweb: support Fedora/Red Hat apache module path
Test fixes.
* en/t7406-fixes:
t7406: avoid using test_must_fail for commands other than git
t7406: prefer test_* helper functions to test -[feds]
t7406: avoid having git commands upstream of a pipe
t7406: simplify by using diff --name-only instead of diff --raw
t7406: fix call that was failing for the wrong reason
The "--exec" option to "git rebase --rebase-merges" placed the exec
commands at wrong places, which has been corrected.
* js/rebase-merges-exec-fix:
rebase --exec: make it work with --rebase-merges
t3430: demonstrate what -r, --autosquash & --exec should do
Documentation update.
* ab/newhash-is-sha256:
doc hash-function-transition: pick SHA-256 as NewHash
doc hash-function-transition: note the lack of a changelog
Checkout with the -p switch uses the "add interactive" framework which
is written in Perl.
One test added in 8d7b558bae ("checkout & worktree: introduce
checkout.defaultRemote", 2018-06-05) didn't declare the PERL
prerequisite, breaking the test when built with NO_PERL.
Reported-by: CB Bailey <cb@hashpling.org>
Signed-off-by: CB Bailey <cb@hashpling.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The caller of maybe_colorize_sideband() gives a counted buffer
<src, n>, but the callee checked src[] as if it were a NUL terminated
buffer. If src[] had all isspace() bytes in it, we would have made
n negative, and then
(1) made number of strncasecmp() calls to see if the remaining
bytes in src[] matched keywords, reading beyond the end of the
array (this actually happens even if n does not go negative),
and/or
(2) called strbuf_add() with negative count, most likely triggering
the "you want to use way too much memory" error due to unsigned
integer overflow.
Fix both issues by making sure we do not go beyond &src[n].
In the longer term we may want to accept size_t as parameter for
clarity (even though we know that a sideband message we are painting
typically would fit on a line on a terminal and int is sufficient).
Write it down as a NEEDSWORK comment.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the '--quiet' option to git worktree, as for the other git
commands. 'add' is the only command affected by it since all other
commands, except 'list', are currently silent by default.
[jc: appiled trivial fix-up to keep the tests from touching outside
the scratch area]
Helped-by: Martin Ågren <martin.agren@gmail.com>
Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git pull --rebase=interactive" learned "i" as a short-hand for
"interactive".
* js/pull-rebase-type-shorthand:
pull --rebase=<type>: allow single-letter abbreviations for the type
The end result of documentation update has been made to be
inspected more easily to help developers.
* jk/diff-rendered-docs:
add a script to diff rendered documentation
"git merge --abort" etc. did not clean things up properly when
there were conflicted entries in the index in certain order that
are involved in D/F conflicts. This has been corrected.
* en/abort-df-conflict-fixes:
read-cache: fix directory/file conflict handling in read_index_unmerged()
t1015: demonstrate directory/file conflict recovery failures
The http-backend (used for smart-http transport) used to slurp the
whole input until EOF, without paying attention to CONTENT_LENGTH
that is supplied in the environment and instead expecting the Web
server to close the input stream. This has been fixed.
* mk/http-backend-content-length:
t5562: avoid non-portable "export FOO=bar" construct
http-backend: respect CONTENT_LENGTH for receive-pack
http-backend: respect CONTENT_LENGTH as specified by rfc3875
http-backend: cleanup writing to child process
A few atoms like %(objecttype) and %(objectsize) in the format
specifier of "for-each-ref --format=<format>" can be filled without
getting the full contents of the object, but just with the object
header. These cases have been optimized by calling
oid_object_info() API (instead of reading and inspecting the data).
* ot/ref-filter-object-info:
ref-filter: use oid_object_info() to get object
ref-filter: merge get_obj and get_object
ref-filter: initialize eaten variable
ref-filter: fill empty fields with empty values
ref-filter: add info_source to valid_atom
Noiseword "extern" has been removed from function decls in the
header files.
* nd/no-extern:
submodule.h: drop extern from function declaration
revision.h: drop extern from function declaration
repository.h: drop extern from function declaration
rerere.h: drop extern from function declaration
line-range.h: drop extern from function declaration
diff.h: remove extern from function declaration
diffcore.h: drop extern from function declaration
convert.h: drop 'extern' from function declaration
cache-tree.h: drop extern from function declaration
blame.h: drop extern on func declaration
attr.h: drop extern from function declaration
apply.h: drop extern on func declaration
The parse-options machinery learned to refrain from enclosing
placeholder string inside a "<bra" and "ket>" pair automatically
without PARSE_OPT_LITERAL_ARGHELP. Existing help text for option
arguments that are not formatted correctly have been identified and
fixed.
* rs/parse-opt-lithelp:
parse-options: automatically infer PARSE_OPT_LITERAL_ARGHELP
shortlog: correct option help for -w
send-pack: specify --force-with-lease argument help explicitly
pack-objects: specify --index-version argument help explicitly
difftool: remove angular brackets from argument help
add, update-index: fix --chmod argument help
push: use PARSE_OPT_LITERAL_ARGHELP instead of unbalanced brackets
Update to a few other topics around 'git fetch'.
* ab/fetch-nego:
fetch doc: cross-link two new negotiation options
negotiator: unknown fetch.negotiationAlgorithm should error out
"git fetch $there refs/heads/s" ought to fetch the tip of the
branch 's', but when "refs/heads/refs/heads/s", i.e. a branch whose
name is "refs/heads/s" exists at the same time, fetched that one
instead by mistake. This has been corrected to honor the usual
disambiguation rules for abbreviated refnames.
* jt/refspec-dwim-precedence-fix:
remote: make refspec follow the same disambiguation rule as local refs
The automatic tree-matching in "git merge -s subtree" was broken 5
years ago and nobody has noticed since then, which is now fixed.
* jk/merge-subtree-heuristics:
score_trees(): fix iteration over trees with missing entries
The test performed at the receiving end of "git push" to prevent
bad objects from entering repository can be customized via
receive.fsck.* configuration variables; we now have gained a
counterpart to do the same on the "git fetch" side, with
fetch.fsck.* configuration variables.
* ab/fsck-transfer-updates:
fsck: test and document unknown fsck.<msg-id> values
fsck: add stress tests for fsck.skipList
fsck: test & document {fetch,receive}.fsck.* config fallback
fetch: implement fetch.fsck.*
transfer.fsckObjects tests: untangle confusing setup
config doc: elaborate on fetch.fsckObjects security
config doc: elaborate on what transfer.fsckObjects does
config doc: unify the description of fsck.* and receive.fsck.*
config doc: don't describe *.fetchObjects twice
receive.fsck.<msg-id> tests: remove dead code
The description of this key does not really tell what the 'minimal'
mode checks and does not check. The description for the 'default'
mode is not much better and just says 'all fields', which is unclear
and is not even correct (e.g. we do not look at 'atime').
Spell out what are and what are not checked under the 'minimal' mode
relative to the 'default' mode to help those who want to decide if
they want to use the 'minimal' mode, also taking information about
this mode from the commit message of c08e4d5b5c (Enable minimal stat
checking - 2013-01-22).
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add support for configuring default sort ordering for git branches. Command
line option will override this configured value, using the exact same
syntax.
Signed-off-by: Samuel Maftoul <samuel.maftoul@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While at it fix a typo (s/independed/independent) and
make sure git is not in a chain of pipes.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--quit is supposed to be --abort but without restoring HEAD. Leaving
CHERRY_PICK_HEAD behind could make other commands mistake that
cherry-pick is still ongoing (e.g. "git commit --amend" will refuse to
work). Clean it too.
For --abort, this job of deleting CHERRY_PICK_HEAD is on "git reset"
so we don't need to do anything else. But let's add extra checks in
--abort tests to confirm.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a merge command in the todo list specifies just a branch to merge
with no -C/-c argument then item->commit is NULL. This means that if
there are merge conflicts error_with_patch() is passed a NULL commit
which causes a segmentation fault when make_patch() tries to look it up.
This commit implements a minimal fix which fixes the crash and allows
the user to successfully commit a conflict resolution with 'git rebase
--continue'. It does not write .git/rebase-merge/patch,
.git/rebase-merge/stopped-sha or update REBASE_HEAD. To sensibly get the
hashes of the merge parents would require refactoring do_merge() to
extract the code that parses the merge parents into a separate function
which error_with_patch() could then use to write the parents into the
stopped-sha file. To create meaningful output make_patch() and 'git
rebase --show-current-patch' would also need to be modified to diff the
merge parent and merge base in this case.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the creation of conflicting-G from a test to the setup so that it
can be used in subsequent tests without creating the kind of implicit
dependencies that plague t3404. While we're at it simplify the
arguments to the test_commit() call the creates the conflicting commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch" sometimes failed to update the remote-tracking refs,
which has been corrected.
* jt/connectivity-check-after-unshallow:
fetch-pack: unify ref in and out param
The Travis CI scripts were taught to ship back the test data from
failed tests.
* sg/travis-retrieve-trash-upon-failure:
travis-ci: include the trash directories of failed tests in the trace log
"git p4 submit" learns to ask its own pre-submit hook if it should
continue with submitting.
* cb/p4-pre-submit-hook:
git-p4: add the `p4-pre-submit` hook
Add a script (in contrib/) to help users of VSCode work better with
our codebase.
* js/vscode:
vscode: let cSpell work on commit messages, too
vscode: add a dictionary for cSpell
vscode: use 8-space tabs, no trailing ws, etc for Git's source code
vscode: wrap commit messages at column 72 by default
vscode: only overwrite C/C++ settings
mingw: define WIN32 explicitly
cache.h: extract enum declaration from inside a struct declaration
vscode: hard-code a couple defines
contrib: add a script to initialize VS Code configuration
It is too easy to misuse system API functions such as strcat();
these selected functions are now forbidden in this codebase and
will cause a compilation failure.
* jk/banned-function:
banned.h: mark strncpy() as banned
banned.h: mark sprintf() as banned
banned.h: mark strcat() as banned
automatically ban strcpy()
When the sparse checkout feature is in use, "git cherry-pick" and
other mergy operations lost the skip_worktree bit when a path that
is excluded from checkout requires content level merge, which is
resolved as the same as the HEAD version, without materializing the
merge result in the working tree, which made the path appear as
deleted. This has been corrected by preserving the skip_worktree
bit (and not materializing the file in the working tree).
* en/merge-recursive-skip-fix:
merge-recursive: preserve skip_worktree bit when necessary
t3507: add a testcase showing failure with sparse checkout
The wire-protocol v2 relies on the client to send "ref prefixes" to
limit the bandwidth spent on the initial ref advertisement. "git
fetch $remote branch:branch" that asks tags that point into the
history leading to the "branch" automatically followed sent to
narrow prefix and broke the tag following, which has been fixed.
* jt/tag-following-with-proto-v2-fix:
fetch: send "refs/tags/" prefix upon CLI refspecs
t5702: test fetch with multiple refspecs at a time
Code clean-up to use size_t/ssize_t when they are the right type.
* jk/size-t:
strbuf_humanise: use unsigned variables
pass st.st_size as hint for strbuf_readlink()
strbuf_readlink: use ssize_t
strbuf: use size_t for length in intermediate variables
reencode_string: use size_t for string lengths
reencode_string: use st_add/st_mult helpers
Update the way we use Coccinelle to find out-of-style code that
need to be modernised.
* sg/coccicheck-updates:
coccinelle: extract dedicated make target to clean Coccinelle's results
coccinelle: put sane filenames into output patches
coccinelle: exclude sha1dc source files from static analysis
coccinelle: use $(addsuffix) in 'coccicheck' make target
coccinelle: mark the 'coccicheck' make target as .PHONY
"git diff --histogram" had a bad memory usage pattern, which has
been rearranged to reduce the peak usage.
* sb/histogram-less-memory:
xdiff/histogram: remove tail recursion
xdiff/xhistogram: move index allocation into find_lcs
xdiff/xhistogram: factor out memory cleanup into free_index()
xdiff/xhistogram: pass arguments directly to fall_back_to_classic_diff
Many more strings are prepared for l10n.
* nd/i18n: (23 commits)
transport-helper.c: mark more strings for translation
transport.c: mark more strings for translation
sha1-file.c: mark more strings for translation
sequencer.c: mark more strings for translation
replace-object.c: mark more strings for translation
refspec.c: mark more strings for translation
refs.c: mark more strings for translation
pkt-line.c: mark more strings for translation
object.c: mark more strings for translation
exec-cmd.c: mark more strings for translation
environment.c: mark more strings for translation
dir.c: mark more strings for translation
convert.c: mark more strings for translation
connect.c: mark more strings for translation
config.c: mark more strings for translation
commit-graph.c: mark more strings for translation
builtin/replace.c: mark more strings for translation
builtin/pack-objects.c: mark more strings for translation
builtin/grep.c: mark strings for translation
builtin/config.c: mark more strings for translation
...
Teach "git tag -s" etc. a few configuration variables (gpg.format
that can be set to "openpgp" or "x509", and gpg.<format>.program
that is used to specify what program to use to deal with the format)
to allow x.509 certs with CMS via "gpgsm" to be used instead of
openpgp via "gnupg".
* hs/gpgsm:
gpg-interface t: extend the existing GPG tests with GPGSM
gpg-interface: introduce new signature format "x509" using gpgsm
gpg-interface: introduce new config to select per gpg format program
gpg-interface: do not hardcode the key string len anymore
gpg-interface: introduce an abstraction for multiple gpg formats
t/t7510: check the validation of the new config gpg.format
gpg-interface: add new config to select how to sign a commit
The wire-protocol v2 relies on the client to send "ref prefixes" to
limit the bandwidth spent on the initial ref advertisement. "git
clone" when learned to speak v2 forgot to do so, which has been
corrected.
* bw/clone-ref-prefixes:
clone: send ref-prefixes when using protocol v2
A new configuration variable core.usereplacerefs has been added,
primarily to help server installations that want to ignore the
replace mechanism altogether.
* jk/core-use-replace-refs:
add core.usereplacerefs config option
check_replace_refs: rename to read_replace_refs
check_replace_refs: fix outdated comment
"make DEVELOPER=1 DEVOPTS=pedantic" allows developers to compile
with -pedantic option, which may catch more problematic program
constructs and potential bugs.
* bb/make-developer-pedantic:
Makefile: add a DEVOPTS flag to get pedantic compilation
One of the "diff --color-moved" mode "dimmed_zebra" that was named
in an unusual way has been deprecated and replaced by
"dimmed-zebra".
* es/diff-color-moved-fix:
diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"
Update the way we run static analysis tool at TravisCI to make it
easier to use its findings.
* sg/travis-cocci-diagnose-failure:
travis-ci: fail if Coccinelle static analysis found something to transform
travis-ci: run Coccinelle static analysis with two parallel jobs
According to http://c-faq.com/null/machexamp.html, sizeof(char*) !=
sizeof(int*) on some platforms. Since an enum could be a char or int
(or long or...), knowing the size of the enum thus is important to
knowing the size of a pointer to an enum, so we cannot just forward
declare an enum the way we can a struct. (Also, modern C++ compilers
apparently define forward declarations of an enum to either be useless
because the enum was defined, or require an explicit size specifier, or
be a compilation error.)
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'branch_track' feels more closely related to branching, and it is
needed later in branch.h; rather than #include'ing cache.h in branch.h
for this small enum, just move the enum and the external declaration
for git_branch_track to branch.h.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since both functions are using the same data type, they should either both
refer to it as void *, or both use the real type (struct alloc_state *).
Opt for the latter.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I looped over the toplevel header files, creating a temporary two-line C
program for each consisting of
#include "git-compat-util.h"
#include $HEADER
This patch is the result of manually fixing errors in compiling those
tiny programs.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit e12a7ef597 ("rebase -i: Handle "combination of <n> commits" with
GETTEXT_POISON", 2018-04-27) changed the way that individual commit
messages are labelled when squashing commits together. In doing so a
regression was introduced where the numbering of the messages is off by
one. This commit fixes that and adds a test for the numbering.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `chainlint` target compares actual output to expected output, where
the actual output is generated from files that are specifically checked
out with LF-only line endings. So the expected output needs to be
checked out with LF-only line endings, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rendered documentation can be easier to read than raw text because
headings and emphasized phrases stand out. Add the missing markup and
Makefile rule required to render this design document using asciidoc.
Tested by running
make -C Documentation technical/partial-clone.html
and viewing the output in a browser.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tests added in 63e95beb08 ("submodule: port resolve_relative_url
from shell to C", 2016-04-15) didn't do a good job of testing various
up-path invocations where the up-path would bring us beyond even the
URL in question without emitting an error.
These results look nonsensical, but it's worth exhaustively testing
them before fixing any of this code, so we can see which of these
cases were changed.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a56771a668 (builtin/pull: respect verbosity settings in submodules,
2018-01-25), we made sure to pass on both quiet and verbose flag from
builtin/pull.c to the submodule shell script. However git-submodule doesn't
understand a verbose flag, which results in a bug when invoking
git pull --recurse-submodules -v [...]
There are a few different approaches to fix this bug:
1) rewrite 'argv_push_verbosity' or its caller in builtin/pull.c to
cap opt_verbosity at 0. Then 'argv_push_verbosity' would only add
'-q' if any.
2) Have a flag in 'argv_push_verbosity' that specifies if we allow adding
-q or -v (or both).
3) Add -v to git-submodule.sh and make it a no-op
(1) seems like a maintenance burden: What if we add code after
the submodule operations or move submodule operations higher up,
then we have altered the opt_verbosity setting further down the line
in builtin/pull.c.
(2) seems like it could work reasonably well without more regressions
(3) seems easiest to implement as well as actually is a feature with the
last-one-wins rule of passing flags to Git commands.
Reported-by: Jochen Kühner
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The for_each_loose_object() and for_each_packed_object()
functions are meant to be part of a unified interface: they
use the same set of for_each_object_flags, and it's not
inconceivable that we might one day add a single
for_each_object() wrapper around them.
Let's put them together in a single file, so we can avoid
awkwardness like saying "the flags for this function are
over in cache.h". Moving the loose functions to packfile.h
is silly. Moving the packed functions to cache.h works, but
makes the "cache.h is a kitchen sink" problem worse. The
best place is the recently-created object-store.h, since
these are quite obviously related to object storage.
The for_each_*_in_objdir() functions do not use the same
flags, but they are logically part of the same interface as
for_each_loose_object(), and share callback signatures. So
we'll move those, as well, as they also make sense in
object-store.h.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're in batch mode, we end up in batch_object_write()
for each object, which allocates its own strbuf for each
call. Instead, we can provide a single "scratch" buffer that
gets reused for each output. When running:
git cat-file --batch-all-objects --batch-check='%(objectname)'
on git.git, my best-of-five time drops from:
real 0m0.171s
user 0m0.159s
sys 0m0.012s
to:
real 0m0.133s
user 0m0.121s
sys 0m0.012s
Note that we could do this just by putting the "scratch"
pointer into "struct expand_data", but I chose instead to
add an extra parameter to the callstack. That's more
verbose, but it makes it a bit more obvious what is going
on, which in turn makes it easy to see where we need to be
releasing the string in the caller (right after the loop
which uses it in each case).
Based-on-a-patch-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use the "buf" strbuf for two things: to read incoming
lines, and as a scratch space for test-expanding the
user-provided format. Let's split this into two variables
with descriptive names, which makes their purpose and
lifetime more clear.
It will also help in a future patch when we start using the
"output" buffer for more expansions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't need to check if the oidset has our object before
we insert it; that's done as part of the insertion. We can
just rely on the return value from oidset_insert(), which
saves one hash lookup per object.
This measurable speedup is tiny and within the run-to-run
noise, but the result is simpler to read, too.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test 'pack-objects to file can use bitmap' added in 645c432d61
(pack-objects: use reachability bitmap index when generating
non-stdout pack, 2016-09-10) is silently buggy and doesn't check what
it's supposed to.
In 't5310-pack-bitmaps.sh', the 'list_packed_objects' helper function
does what its name implies by running:
git show-index <"$1" | cut -d' ' -f2
The test in question invokes this function like this:
list_packed_objects <packa-$packasha1.idx >packa.objects &&
list_packed_objects <packb-$packbsha1.idx >packb.objects &&
test_cmp packa.objects packb.objects
Note how these two callsites don't specify the name of the pack index
file as the function's parameter, but redirect the function's standard
input from it. This triggers an error message from the shell, as it
has no filename to redirect from in the function, but this error is
ignored, because it happens upstream of a pipe. Consequently, both
invocations produce empty 'pack{a,b}.objects' files, and the
subsequent 'test_cmp' happily finds those two empty files identical.
Fix these two 'list_packed_objects' invocations by specifying the pack
index files as parameters. Furthermore, eliminate the pipe in that
function by replacing it with an &&-chained pair of commands using an
intermediate file, so a failure of 'git show-index' or the shell
redirection will fail the test.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Windows CRT implements O_APPEND "manually": on write() calls, the
file pointer is set to EOF before the data is written. Clearly, this is
not atomic. And in fact, this is the root cause of failures observed in
t5552-skipping-fetch-negotiator.sh and t5503-tagfollow.sh, where
different processes write to the same trace file simultanously; it also
occurred in t5400-send-pack.sh, but there it was worked around in
71406ed4d6 ("t5400: avoid concurrent writes into a trace file",
2017-05-18).
Fortunately, Windows does support atomic O_APPEND semantics using the
file access mode FILE_APPEND_DATA. Provide an implementation that does.
This implementation is minimal in such a way that it only implements
the open modes that are actually used in the Git code base. Emulation
for other modes can be added as necessary later. To become aware of
the necessity early, the unusal error ENOSYS is reported if an
unsupported mode is encountered.
Diagnosed-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Side note, since we gain access to the right repository, we can stop
rely on the_repository in this code as well.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use apply_state->repo->index instead of the_index (in most cases,
unless we need to use a temporary index in some functions). Let the
callers (am and apply) tell us what to use, instead of always assuming
to operate on the_index.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're moving away from the_index in this code. "struct index_state *"
could be added to struct apply_state. But let's aim long term and put
struct repository here instead so that we could even avoid more global
states in the future. The index will be available via
apply_state->repo->index.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
we're going to remove the dependency on the_index by moving 'struct
index_state *' to somewhere inside struct apply_state. Let's make sure
relevant functions have access to this struct now and reduce the diff
noise when the actual conversion happens.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With 'struct archive_args' gaining new repository pointer, we don't
have to assume the_repository in the archive backends anymore.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since attr checking API now take the index, there's no need to set an
index in advance with this call. Most call sites are straightforward
because they either pass the_index or NULL (which defaults back to
the_index previously). There's only one suspicious call site in
unpack-trees.c where it sets a different index.
This code in unpack-trees is about to check out entries from the
new/temporary index after merging is done in it. The attributes will
be used by entry.c code to do crlf conversion if needed. entry.c now
respects struct checkout's istate field, and this field is correctly
set in unpack-trees.c, there should be no regression from this change.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
checkout-index.c needs update because if checkout->istate is NULL,
ie_match_stat() will crash. Previously this is ie_match_stat(&the_index, ..)
so it will not crash, but it is not technically correct either.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both functions that are updated in this commit are called by
verify_absent(), which is part of the "unpack-trees" operation that is
supposed to work on any index file specified by the caller. Thanks to
Brandon [1] [2], an implicit dependency on the_index is exposed. This
commit fixes it.
In both functions, it makes sense to use src_index to check for
exclusion because it's almost unchanged and should give us the same
outcome as if running the exclude check before the unpack.
It's "almost unchanged" because we do invalidate cache-tree and
untracked cache in the source index. But this should not affect how
exclude machinery uses the index: to see if a file is tracked, and to
read a blob from the index instead of worktree if it's marked
skip-worktree (i.e. it's not available in worktree)
[1] a0bba65b10 (dir: convert is_excluded to take an index - 2017-05-05
[2] 2c1eb10454 (dir: convert read_directory to take an index - 2017-05-05)
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to fba92be8f7, this code implicitly (and incorrectly) assumes
the_index when running the exclude machinery. fba92be8f7 helps show
this problem clearer because unpack-trees operation is supposed to
work on whatever index the caller specifies... not specifically
the_index.
Update the code to use "istate" argument that's originally from
mark_new_skip_worktree(). From the call sites, both in unpack_trees(),
you can see that this function works on two separate indexes:
o->src_index and o->result. The second mark_new_skip_worktree() so far
has incorecctly applied exclude rules on o->src_index instead of
o->result. It's unclear what is the consequences of this, but it's
definitely wrong.
[1] fba92be8f7 (dir: convert is_excluded_from_list to take an index -
2017-05-05)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function mark_new_skip_worktree() has an argument named the_index
which is also the name of a global variable. While they have different
types (the global the_index is not a pointer) mistakes can easily
happen and it's also confusing for readers. Rename the function
argument to something other than the_index.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
write_eolinfo() does take an istate as function argument and it should
be used instead of the_index.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the match_patchspec API and friends take an index_state instead
of assuming the_index in dir.c. All external call sites are converted
blindly to keep the patch simple and retain current behavior.
Individual call sites may receive further updates to use the right
index instead of the_index.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the convert API take an index_state instead of assuming the_index
in convert.c. All external call sites are converted blindly to keep
the patch simple and retain current behavior. Individual call sites
may receive further updates to use the right index instead of
the_index.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the attr API take an index_state instead of assuming the_index in
attr code. All call sites are converted blindly to keep the patch
simple and retain current behavior. Individual call sites may receive
further updates to use the right index instead of the_index.
There is one ugly temporary workaround added in attr.c that needs some
more explanation.
Commit c24f3abace (apply: file commited with CRLF should roundtrip
diff and apply - 2017-08-19) forces one convert_to_git() call to NOT
read the index at all. But what do you know, we read it anyway by
falling back to the_index. When "istate" from convert_to_git is now
propagated down to read_attr_from_array() we will hit segfault
somewhere inside read_blob_data_from_index.
The right way of dealing with this is to kill "use_index" variable and
only follow "istate" but at this stage we are not ready for that:
while most git_attr_set_direction() calls just passes the_index to be
assigned to use_index, unpack-trees passes a different one which is
used by entry.c code, which has no way to know what index to use if we
delete use_index. So this has to be done later.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This puts update_main_cache_tree() and write_cache_as_tree() in the
same group of "index compat" functions that assume the_index
implicitly, which should only be used within builtin/ or t/helper.
sequencer.c is also updated to not use these functions. As of now, no
files outside builtin/ use these functions anymore.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code is only needed for diff-tree (since f0c6b2a2fd ([PATCH]
Optimize diff-tree -[CM] --stdin - 2005-05-27)). Let the caller do the
preparation instead and avoid read_index() in diff.c code.
read_index() should be avoided (in addition to the_index) because it
uses get_index_file() underneath to get the path $GIT_DIR/index. This
effectively pulls the_repository in and may become the only reason to
pull a 'struct repository *' in diff.c. Let's keep the dependencies as
few as possible and kick it back to diff-tree.c
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you're going to access the contents of every object in a
packfile, it's generally much more efficient to do so in
pack order, rather than in hash order. That increases the
locality of access within the packfile, which in turn is
friendlier to the delta base cache, since the packfile puts
related deltas next to each other. By contrast, hash order
is effectively random, since the sha1 has no discernible
relationship to the content.
This patch introduces an "--unordered" option to cat-file
which iterates over packs in pack-order under the hood. You
can see the results when dumping all of the file content:
$ time ./git cat-file --batch-all-objects --buffer --batch | wc -c
6883195596
real 0m44.491s
user 0m42.902s
sys 0m5.230s
$ time ./git cat-file --unordered \
--batch-all-objects --buffer --batch | wc -c
6883195596
real 0m6.075s
user 0m4.774s
sys 0m3.548s
Same output, different order, way faster. The same speed-up
applies even if you end up accessing the object content in a
different process, like:
git cat-file --batch-all-objects --buffer --batch-check |
grep blob |
git cat-file --batch='%(objectname) %(rest)' |
wc -c
Adding "--unordered" to the first command drops the runtime
in git.git from 24s to 3.5s.
Side note: there are actually further speedups available
for doing it all in-process now. Since we are outputting
the object content during the actual pack iteration, we
know where to find the object and could skip the extra
lookup done by oid_object_info(). This patch stops short
of that optimization since the underlying API isn't ready
for us to make those sorts of direct requests.
So if --unordered is so much better, why not make it the
default? Two reasons:
1. We've promised in the documentation that --batch-all-objects
outputs in hash order. Since cat-file is plumbing,
people may be relying on that default, and we can't
change it.
2. It's actually _slower_ for some cases. We have to
compute the pack revindex to walk in pack order. And
our de-duplication step uses an oidset, rather than a
sort-and-dedup, which can end up being more expensive.
If we're just accessing the type and size of each
object, for example, like:
git cat-file --batch-all-objects --buffer --batch-check
my best-of-five warm cache timings go from 900ms to
1100ms using --unordered. Though it's possible in a
cold-cache or under memory pressure that we could do
better, since we'd have better locality within the
packfile.
And one final question: why is it "--unordered" and not
"--pack-order"? The answer is again two-fold:
1. "pack order" isn't a well-defined thing across the
whole set of objects. We're hitting loose objects, as
well as objects in multiple packs, and the only
ordering we're promising is _within_ a single pack. The
rest is apparently random.
2. The point here is optimization. So we don't want to
promise any particular ordering, but only to say that
we will choose an ordering which is likely to be
efficient for accessing the object content. That leaves
the door open for further changes in the future without
having to add another compatibility option.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're not really doing the batch-show operation in these
callbacks, but just collecting the set of objects. That
distinction will become more important in a future patch, so
let's rename them now to avoid cluttering that diff.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test for --batch-all-objects in t1006 covers a variety
of object storage situations, but one thing it doesn't cover
is that we avoid mentioning duplicate objects. We won't have
any because running "git repack -ad" will have packed them
all and deleted the loose ones.
This does work (because we sort and de-dup the output list),
but it's good to include it in our test. And doubly so for
when we add an unordered mode which has to de-dup in a
different way.
Note that we cannot just re-create one of the objects, as
Git will omit the write of an object that is already
present. However, we can create a new pack with one of the
objects, which forces the duplication.
One alternative would be to just use "git repack -a" instead
of "-ad". But then _every_ object would be duplicated as
loose and packed, and we might miss a bug that omits packed
objects (because we'd show their loose counterparts).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently iterate over objects within a pack in .idx
order, which uses the object hashes. That means that it
is effectively random with respect to the location of the
object within the pack. If you're going to access the actual
object data, there are two reasons to move linearly through
the pack itself:
1. It improves the locality of access in the packfile. In
the cold-cache case, this may mean fewer disk seeks, or
better usage of disk cache.
2. We store related deltas together in the packfile. Which
means that the delta base cache can operate much more
efficiently if we visit all of those related deltas in
sequence, as the earlier items are likely to still be
in the cache. Whereas if we visit the objects in
random order, our cache entries are much more likely to
have been evicted by unrelated deltas in the meantime.
So in general, if you're going to access the object contents
pack order is generally going to end up more efficient.
But if you're simply generating a list of object names, or
if you're going to end up sorting the result anyway, you're
better off just using the .idx order, as finding the pack
order means generating the in-memory pack-revindex.
According to the numbers in 8b8dfd5132 (pack-revindex:
radix-sort the revindex, 2013-07-11), that takes about 200ms
for linux.git, and 20ms for git.git (those numbers are a few
years old but are still a good ballpark).
That makes it a good optimization for some cases (we can
save tens of seconds in git.git by having good locality of
delta access, for a 20ms cost), but a bad one for others
(e.g., right now "cat-file --batch-all-objects
--batch-check="%(objectname)" is 170ms in git.git, so adding
20ms to that is noticeable).
Hence this patch makes it an optional flag. You can't
actually do any interesting timings yet, as it's not plumbed
through to any user-facing tools like cat-file. That will
come in a later patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already mention the local/alternate behavior of these
functions, but we can help clarify a few other behaviors:
- there's no need to mention LOCAL_ONLY specifically, since
we already reference the flags by type (and as we add
more flags, we don't want to have to mention each)
- clarify that reachability doesn't matter here; this is
all accessible objects
- what ordering/uniqueness guarantees we give
- how pack-specific flags are handled for the loose case
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's not wrong to pass our flags in an "unsigned", as we
know it will be at least as large as the enum. However,
using the enum in the declaration makes it more obvious
where to find the list of flags.
While we're here, let's also drop the "extern" noise-words
from the declarations, per our modern coding style.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These flags were split between cache.h and packfile.h,
because some of the flags apply only to packs. However, they
share a single numeric namespace, since both are respected
for the packed variant. Let's make sure they're defined
together so that nobody accidentally adds a new flag in one
location that duplicates the other.
While we're here, let's also put them in an enum (which
helps debugger visibility) and use "(1<<n)" rather than
counting powers of 2 manually.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It should be "is not an empty string" not "is not empty string". This
fixes wording originally introduced in ab9b31386b ("Documentation:
multi-head fetch.", 2005-08-24).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct a comment referring to the removal of just the branch to also
refer to the tag. This should have been changed in my
ca3065e7e7 ("fetch tests: add a tag to be deleted to the pruning
tests", 2018-02-09) when the tag deletion was added, but I missed it
at the time.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This extract from contrib/subtree/t7900 triggered a false positive due
to three chainlint limitations:
* recognizing only a "blessed" set of here-doc tag names in a subshell
("EOF", "EOT", "INPUT_END"), of which "TXT" is not a member
* inability to recognize multi-line $(...) when the first statement of
the body is cuddled with the opening "$("
* inability to recognize multiple constructs on a single line, such as
opening a multi-line $(...) and starting a here-doc
Now that all of these shortcomings have been addressed, turn this rather
pathological bit of shell coding into a chainlint test case.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
chainlint.sed recognizes multi-line quoted strings within subshells:
echo "abc
def" >out &&
so it can avoid incorrectly classifying lines internal to the string as
breaking the &&-chain. To identify the first line of a multi-line
string, it checks if the line contains a single quote. However, this is
fragile and can be easily fooled by a line containing multiple strings:
echo "xyz" "abc
def" >out &&
Make detection more robust by checking for an odd number of quotes
rather than only a single one.
(Escaped quotes are not handled, but support may be added later.)
The original multi-line string recognizer rather cavalierly threw away
all but the final quote, whereas the new one is careful to retain all
quotes, so the "expected" output of a couple existing chainlint tests is
updated to account for this new behavior.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After swallowing a here-doc, chainlint.sed assumes that no other
processing needs to be done on the line aside from checking for &&-chain
breakage; likewise, after folding a multi-line quoted string. However,
it's conceivable (even if unlikely in practice) that both a here-doc and
a multi-line quoted string might commence on the same line:
cat <<\EOF && echo "foo
bar"
data
EOF
Support this case by sending the line (after swallowing and folding)
through the normal processing sequence rather than jumping directly to
the check for broken &&-chain.
This change also allows other somewhat pathological cases to be handled,
such as closing a subshell on the same line starting a here-doc:
(
cat <<-\INPUT)
data
INPUT
or, for instance, opening a multi-line $(...) expression on the same
line starting a here-doc:
x=$(cat <<-\END &&
data
END
echo "x")
among others.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For multi-line $(...) expressions nested within subshells, chainlint.sed
only recognizes:
x=$(
echo foo &&
...
but it is not unlikely that test authors may also cuddle the command
with the opening "$(", so support that style, as well:
x=$(echo foo &&
...
The closing ")" is already correctly recognized when cuddled or not.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A here-doc tag can be quoted ('EOF') or escaped (\EOF) to suppress
interpolation within the body. Although, chainlint recognizes escaped
tags, it does not know about quoted tags. For completeness, teach it to
recognize quoted tags, as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
chainlint.sed swallows top-level here-docs to avoid being fooled by
content which might look like start-of-subshell. It likewise swallows
here-docs in subshells to avoid marking content lines as breaking the
&&-chain, and to avoid being fooled by content which might look like
end-of-subshell, start-of-nested-subshell, or other specially-recognized
constructs.
At the time of implementation, it was believed that it was not possible
to support arbitrary here-doc tag names since 'sed' provides no way to
stash the opening tag name in a variable for later comparison against a
line signaling end-of-here-doc. Consequently, tag names are hard-coded,
with "EOF" being the only tag recognized at the top-level, and only
"EOF", "EOT", and "INPUT_END" being recognized within subshells. Also,
special care was taken to avoid being confused by here-docs nested
within other here-docs.
In practice, this limited number of hard-coded tag names has been "good
enough" for the 13000+ existing Git test, despite many of those tests
using tags other than the recognized ones, since the bodies of those
here-docs do not contain content which would fool the linter.
Nevertheless, the situation is not ideal since someone writing new
tests, and choosing a name not in the "blessed" set could potentially
trigger a false-positive.
To address this shortcoming, upgrade chainlint.sed to handle arbitrary
here-doc tag names, both at the top-level and within subshells.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eliminate an unnecessary prompt to continue after failed merger, by
not calling the prompt_after_failed_merge function when only one
iteration remains.
Uses positional parameters to count files in the list to make it
easier to see if we have any more paths to process from within the
loop.
Signed-off-by: Nicholas Guriev <guriev-ns@ya.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two tests added in dade47c06c (commit-graph: add repo arg to graph
readers, 2018-07-11) prepare the contents of 'expect' files by
'echo'ing the results of command substitutions. That's unncessary,
avoid them by directly saving the output of the commands executed in
those command substitutions.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph files are binary files, so they should not be
compared with 'test_cmp', because that might cause issues like
crashing[1] or infinite loop[2] on Windows, where 'test_cmp' is a
shell function to deal with random LF-CRLF conversions[3].
Use 'test_cmp_bin' instead.
1 - b93e6e3663 (t5000, t5003: do not use test_cmp to compare binary
files, 2014-06-04)
2 - f9f3851b4d (t9300: use test_cmp_bin instead of test_cmp to compare
binary files, 2014-09-12)
3 - 4d715ac05c (Windows: a test_cmp that is agnostic to random LF <>
CRLF conversions, 2013-10-26)
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It *is* a confusing thing to look at a diff of diffs. All too easy is it
to mix up whether the -/+ markers refer to the "inner" or the "outer"
diff, i.e. whether a `+` indicates that a line was added by either the
old or the new diff (or both), or whether the new diff does something
different than the old diff.
To make things easier to process for normal developers, we introduced
the dual color mode which colors the lines according to the commit diff,
i.e. lines that are added by a commit (whether old, new, or both) are
colored in green. In non-dual color mode, the lines would be colored
according to the outer diff: if the old commit added a line, it would be
colored red (because that line addition is only present in the first
commit range that was specified on the command-line, i.e. the "old"
commit, but not in the second commit range, i.e. the "new" commit).
However, this dual color mode is still not making things clear enough,
as we are looking at two levels of diffs, and we still only pick a color
according to *one* of them (the outer diff marker is colored
differently, of course, but in particular with deep indentation, it is
easy to lose track of that outer diff marker's background color).
Therefore, let's add another dimension to the mix. Still use
green/red/normal according to the commit diffs, but now also dim the
lines that were only in the old commit, and use bold face for the lines
that are only in the new commit.
That way, it is much easier not to lose track of, say, when we are
looking at a line that was added in the previous iteration of a patch
series but the new iteration adds a slightly different version: the
obsolete change will be dimmed, the current version of the patch will be
bold.
At least this developer has a much easier time reading the range-diffs
that way.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After using this command extensively for the last two months, this
developer came to the conclusion that even if the dual color mode still
leaves a lot of room for confusion about what was actually changed, the
non-dual color mode is substantially worse in that regard.
Therefore, we really want to make the dual color mode the default.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As pointed out by Elijah Newren, tbdiff has this neat little alignment
trick where it outputs the commit pairs with patch numbers that are
padded to the maximal patch number's width:
1: cafedead = 1: acefade first patch
[...]
314: beefeada < 314: facecab up to PI!
Let's do the same in range-diff, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tab completion of `git range-diff` is very convenient, especially
given that the revision arguments to specify the commit ranges to
compare are typically more complex than, say, what is normally passed
to `git log`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When displaying a diff of diffs, it is possible that there is an outer
`+` before a context line. That happens when the context changed between
old and new commit. When that context line starts with a tab (after the
space that marks it as context line), our diff machinery spits out a
white-space error (space before tab), but in this case, that is
incorrect.
Rather than adding a specific whitespace flag that specifically ignores
the first space in the output (and might miss other problems with the
white-space warnings), let's just skip handling white-space errors in
dual color mode to begin with.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When showing what changed between old and new commits, we show a diff of
the patches. This diff is a diff between diffs, therefore there are
nested +/- signs, and it can be relatively hard to understand what is
going on.
With the --dual-color option, the preimage and the postimage are colored
like the diffs they are, and the *outer* +/- sign is inverted for
clarity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When diffing diffs, it can be quite daunting to figure out what the heck
is going on, as there are nested +/- signs.
Let's make this easier by adding a flag in diff_options that allows
color-coding the outer diff sign with inverted colors, so that the
preimage and postimage is colored like the diff it is.
Of course, this really only makes sense when the preimage and postimage
*are* diffs. So let's not expose this flag via a command-line option for
now.
This is a feature that was invented by git-tbdiff, and it will be used
by `git range-diff` in the next commit, by offering it via a new option:
`--dual-color`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This "color" simply reverts background and foreground. It will be used
in the upcoming "dual color" mode of `git range-diff`, where we will
reverse colors for the -/+ markers and the fragment headers of the
"outer" diff.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Arguably the most important part of `git range-diff`'s output is the
list of commits in the two branches, together with their relationships.
For that reason, tbdiff introduced color-coding that is pretty
intuitive, especially for unchanged patches (all dim yellow, like the
first line in `git show`'s output) vs modified patches (old commit is
red, new commit is green). Let's imitate that color scheme.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are essentially lifted from https://github.com/trast/tbdiff, with
light touch-ups to account for the command now being named `git
range-diff`.
Apart from renaming `tbdiff` to `range-diff`, only one test case needed
to be adjusted: 11 - 'changed message'.
The underlying reason it had to be adjusted is that diff generation is
sometimes ambiguous. In this case, a comment line and an empty line are
added, but it is ambiguous whether they were added after the existing
empty line, or whether an empty line and the comment line are added
*before* the existing empty line. And apparently xdiff picks a different
option here than Python's difflib.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are comparing complete, formatted commit messages with patches. There
are no function names here, so stop looking for them.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This not only uses "dashed stand-ins" for "pairs" where one side is
missing (i.e. unmatched commits that are present only in one of the two
commit ranges), but also adds onelines for the reader's pleasure.
This change brings `git range-diff` yet another step closer to
feature parity with tbdiff: it now shows the oneline, too, and indicates
with `=` when the commits have identical diffs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When showing the diff between corresponding patches of the two branch
versions, we have to make up a fake filename to run the diff machinery.
That filename does not carry any meaningful information, hence tbdiff
suppresses it. So we should, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The main information in the `range-diff` view comes from the list of
matching and non-matching commits, the diffs are additional information.
Indenting them helps with the reading flow.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When comparing commit messages, we need to keep in mind that they are
indented by four spaces. That is, empty lines are no longer empty, but
have "trailing whitespace". When displaying them in color, that results
in those nagging red lines.
Let's just right-trim the lines in the commit message, it's not like
trailing white-space in the commit messages are important enough to care
about in `git range-diff`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like tbdiff, we now show the diff between matching patches. This is
a "diff of two diffs", so it can be a bit daunting to read for the
beginner.
An alternative would be to display an interdiff, i.e. the hypothetical
diff which is the result of first reverting the old diff and then
applying the new diff.
Especially when rebasing frequently, an interdiff is often not feasible,
though: if the old diff cannot be applied in reverse (due to a moving
upstream), an interdiff can simply not be inferred.
This commit brings `range-diff` closer to feature parity with regard
to tbdiff.
To make `git range-diff` respect e.g. color.diff.* settings, we have
to adjust git_branch_config() accordingly.
Note: while we now parse diff options such as --color, the effect is not
yet the same as in tbdiff, where also the commit pairs would be colored.
This is left for a later commit.
Note also: while tbdiff accepts the `--no-patches` option to suppress
these diffs between patches, we prefer the `-s` (or `--no-patch`) option
that is automatically supported via our use of diff_opt_parse().
And finally note: to support diff options, we have to call
`parse_options()` such that it keeps unknown options, and then loop over
those and let `diff_opt_parse()` handle them. After that loop, we have
to call `parse_options()` again, to make sure that no unknown options
are left.
Helped-by: Thomas Gummerer <t.gummerer@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch lets `git range-diff` use the same order as tbdiff.
The idea is simple: for left-to-right readers, it is natural to assume
that the `git range-diff` is performed between an older vs a newer
version of the branch. As such, the user is probably more interested in
the question "where did this come from?" rather than "where did that one
go?".
To that end, we list the commits in the order of the second commit range
("the newer version"), inserting the unmatched commits of the first
commit range as soon as all their predecessors have been shown.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At this stage, `git range-diff` can determine corresponding commits
of two related commit ranges. This makes use of the recently introduced
implementation of the linear assignment algorithm.
The core of this patch is a straight port of the ideas of tbdiff, the
apparently dormant project at https://github.com/trast/tbdiff.
The output does not at all match `tbdiff`'s output yet, as this patch
really concentrates on getting the patch matching part right.
Note: due to differences in the diff algorithm (`tbdiff` uses the Python
module `difflib`, Git uses its xdiff fork), the cost matrix calculated
by `range-diff` is different (but very similar) to the one calculated
by `tbdiff`. Therefore, it is possible that they find different matching
commits in corner cases (e.g. when a patch was split into two patches of
roughly equal length).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This command does not do a whole lot so far, apart from showing a usage
that is oddly similar to that of `git tbdiff`. And for a good reason:
the next commits will turn `range-branch` into a full-blown replacement
for `tbdiff`.
At this point, we ignore tbdiff's color options, as they will all be
implemented later using diff_options.
Since f318d73915 (generate-cmds.sh: export all commands to
command-list.h, 2018-05-10), every new command *requires* a man page to
build right away, so let's also add a blank man page, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The problem solved by the code introduced in this commit goes like this:
given two sets of items, and a cost matrix which says how much it
"costs" to assign any given item of the first set to any given item of
the second, assign all items (except when the sets have different size)
in the cheapest way.
We use the Jonker-Volgenant algorithm to solve the assignment problem to
answer questions such as: given two different versions of a topic branch
(or iterations of a patch series), what is the best pairing of
commits/patches between the different versions?
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The t5552 test script uses GIT_TRACE_PACKET to monitor what
git-fetch sends and receives. However, because we're
accessing a local repository, the child upload-pack also
sends trace output to the same file.
On Linux, this works out OK. We open the trace file with
O_APPEND, so all writes are atomically positioned at the end
of the file. No data can be overwritten or omitted. And
since we prepare our small writes in a strbuf and write them
with a single write(), we should see each line as an atomic
unit. The order of lines between the two processes is
undefined, but the test script greps only for "fetch>" or
"fetch<" lines. So under Linux, the test results are
deterministic.
The test fails intermittently on Windows, however,
reportedly even overwriting bits of the output file (i.e.,
O_APPEND does not seem to give us an atomic position+write).
Since the test only cares about the trace output from fetch,
we can just disable the output from upload-pack. That
doesn't solve the greater question of O_APPEND/trace issues
under Windows, but it easily fixes the flakiness from this
test.
Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When gpg-interface API unified support for signature verification
codepaths for signed tags and signed commits in mid 2015 at around
v2.6.0-rc0~114, we accidentally loosened the GPG signature
verification.
Before that change, signed commits were verified by looking for
"G"ood signature from GPG, while ignoring the exit status of "gpg
--verify" process, while signed tags were verified by simply passing
the exit status of "gpg --verify" through. The unified code we
currently have ignores the exit status of "gpg --verify" and returns
successful verification when the signature matches an unexpired key
regardless of the trust placed on the key (i.e. in addition to "G"ood
ones, we accept "U"ntrusted ones).
Make these commands signal failure with their exit status when
underlying "gpg --verify" (or the custom command specified by
"gpg.program" configuration variable) does so. This essentially
changes their behaviour in a backward incompatible way to reject
signatures that have been made with untrusted keys even if they
correctly verify, as that is how "gpg --verify" behaves.
Note that the code still overrides a zero exit status obtained from
"gpg" (or gpg.program) if the output does not say the signature is
good or computes correctly but made with untrusted keys, to catch
a poorly written wrapper around "gpg" the user may give us.
We could exclude "U"ntrusted support from this fallback code, but
that would be making two backward incompatible changes in a single
commit, so let's avoid that for now. A follow-up change could do so
if desired.
Helped-by: Vojtech Myslivec <vojtech.myslivec@nic.cz>
Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, repack does not touch promisor packfiles at all, potentially
causing the performance of repositories that have many such packfiles to
drop. Therefore, repack all promisor objects if invoked with -a or -A.
This is done by an additional invocation of pack-objects on all promisor
objects individually given, which takes care of deduplication and allows
the resulting packfiles to respect flags such as --max-pack-size.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A subsequent patch will teach repack to run pack-objects with some same
and some different arguments if repacking of promisor objects is
required. Refactor the setup of the pack-objects cmd so that setting up
the arguments common to both is done in a function.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The idea of `--exec` is to append an `exec` call after each `pick`.
Since the introduction of fixup!/squash! commits, this idea was extended
to apply to "pick, possibly followed by a fixup/squash chain", i.e. an
exec would not be inserted between a `pick` and any of its corresponding
`fixup` or `squash` lines.
The current implementation uses a dirty trick to achieve that: it
assumes that there are only pick/fixup/squash commands, and then
*inserts* the `exec` lines before any `pick` but the first, and appends
a final one.
With the todo lists generated by `git rebase --rebase-merges`, this
simple implementation shows its problems: it produces the exact wrong
thing when there are `label`, `reset` and `merge` commands.
Let's change the implementation to do exactly what we want: look for
`pick` lines, skip any fixup/squash chains, and then insert the `exec`
line. Lather, rinse, repeat.
Note: we take pains to insert *before* comment lines whenever possible,
as empty commits are represented by commented-out pick lines (and we
want to insert a preceding pick's exec line *before* such a line, not
afterward).
While at it, also add `exec` lines after `merge` commands, because they
are similar in spirit to `pick` commands: they add new commits.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The colorization is controlled with the config setting "color.remote".
Supported keywords are "error", "warning", "hint" and "success". They
are highlighted if they appear at the start of the line, which is
common in error messages, eg.
ERROR: commit is missing Change-Id
The Git push process itself prints lots of non-actionable messages
(eg. bandwidth statistics, object counters for different phases of the
process). This obscures actionable error messages that servers may
send back. Highlighting keywords in the sideband draws more attention
to those messages.
The background for this change is that Gerrit does server-side
processing to create or update code reviews, and actionable error
messages (eg. missing Change-Id) must be communicated back to the user
during the push. User research has shown that new users have trouble
seeing these messages.
The highlighting is done on the client rather than server side, so
servers don't have to grow capabilities to understand terminal escape
codes and terminal state. It also consistent with the current state
where Git is control of the local display (eg. prefixing messages with
"remote: ").
The highlighting can be configured using color.remote.<KEYWORD>
configuration settings. Since the keys are matched case insensitively,
we match the keywords case insensitively too.
Finally, this solution is backwards compatible: many servers already
prefix their messages with "error", and they will benefit from this
change without requiring a server update. By contrast, a server-side
solution would likely require plumbing the TERM variable through the
git protocol, so it would require changes to both server and client.
Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back when we removed `git apply --index-info` in 2007, we forgot to
adjust the documentation for update-index that reads its output.
Let's reorder the description of three formats to present the other
two formats that are still generated by git commands before this
format, and stop mentioning `git apply --index-info`.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following phrase could be interpreted multiple ways:
"To pretend you have a file with mode and sha1 at path"
In particular, I can think of two:
1. Pretend we have some new file, which happens to have a given mode
and sha1
2. Pretend one of the files we are already tracking has a different
mode and sha1 than what it really does
I think people could easily assume either case while reading, but the
example command provided doesn't actually handle the first case, which
caused some minor frustration to at least one user. Modify the example
command so that it correctly handles both cases, and re-order the
wording in a way that makes it more likely folks will assume the first
interpretation. I believe the new example shouldn't pose any obstacles
to those wanting the second interpretation (at worst, they pass an
unnecessary extra flag).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bug was noticed when writing the previous patch; a fix for this bug
is not easy though: If we choose to ignore the case of the subsection
(and revert most of the code of the previous patch, just keeping
s/strncasecmp/strcmp/), then we'd introduce new sections using the
new syntax, such that
--------
[section.subsection]
key = value1
--------
git config section.Subsection.key value2
would result in
--------
[section.subsection]
key = value1
[section.Subsection]
key = value2
--------
which is even more confusing. A proper fix would replace the first
occurrence of 'key'. As the syntax is deprecated, let's prefer to not
spend time on fixing the behavior and just document it instead.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A user reported a submodule issue regarding a section mix-up,
but it could be boiled down to the following test case:
$ git init test && cd test
$ git config foo."Bar".key test
$ git config foo."bar".key test
$ tail -n 3 .git/config
[foo "Bar"]
key = test
key = test
Sub sections are case sensitive and we have a test for correctly reading
them. However we do not have a test for writing out config correctly with
case sensitive subsection names, which is why this went unnoticed in
6ae996f2ac (git_config_set: make use of the config parser's event
stream, 2018-04-09)
Unfortunately we have to make a distinction between old style configuration
that looks like
[foo.Bar]
key = test
and the new quoted style as seen above. The old style is documented as
case-agnostic, hence we need to keep 'strncasecmp'; although the
resulting setting for the old style config differs from the configuration.
That will be fixed in a follow up patch.
Reported-by: JP Sugarbroad <jpsugar@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
test -e, test -s, etc. do not provide nice error messages when we hit
test failures, so use the test_* helper functions from
test-lib-functions.sh.
Also, add test_path_exists() to test-lib-function.sh while at it, so
that we don't need to worry whether submodule/.git is a file or a
directory. It currently is a file with contents of the form
gitdir: ../.git/modules/submodule
but it could be changed in the future to be a directory; this test
only really cares that it exists.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a git command is on the left side of a pipe, the pipe will swallow
its exit status, preventing us from detecting failures in said commands.
Restructure the tests to put the output in a temporary file to avoid
this problem.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can get rid of some quoted tabs and make a few tests slightly easier
to read and edit by just asking for the names of the files modified,
since that's all these tests were interested in anyway.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A test making use of test_must_fail was failing like this:
fatal: ambiguous argument '|': unknown revision or path not in the working tree.
when the intent was to verify that a specific string was not found
in the output of the git diff command, i.e. that grep returned
non-zero. Fix the test to do that.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The words "save" and "safe" are both very wonderful words, each with
their own set of meanings. Let's not confuse them with one another save
on occasion of a pun.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The generated apache2 config fails with apache >= 2.4. The error log
states:
AH00136: Server MUST relinquish startup privileges before accepting
connections. Please ensure mod_unixd or other system security
module is loaded.
AH00016: Configuration Failed
Fix this by loading the unixd module. This works with older httpd as
well, so no IfVersion conditional is needed. (Tested with httpd-2.2.15
on CentOS-6.)
Written with assistance of Todd Zullinger <tmz@pobox.com>
Signed-off-by: Sebastian Kisela <skisela@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Fedora-derived systems, the apache httpd package installs modules
under /usr/lib{,64}/httpd/modules, depending on whether the system is
32- or 64-bit. A symlink from /etc/httpd/modules is created which
points to the proper module path. Use it to support apache on Fedora,
CentOS, and Red Hat systems.
Written with assistance of Todd Zullinger <tmz@pobox.com> and
Junio C Hamano <gitster@pobox.com>.
Signed-off-by: Sebastian Kisela <skisela@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From a security perspective, it seems that SHA-256, BLAKE2, SHA3-256,
K12, and so on are all believed to have similar security properties.
All are good options from a security point of view.
SHA-256 has a number of advantages:
* It has been around for a while, is widely used, and is supported by
just about every single crypto library (OpenSSL, mbedTLS, CryptoNG,
SecureTransport, etc).
* When you compare against SHA1DC, most vectorized SHA-256
implementations are indeed faster, even without acceleration.
* If we're doing signatures with OpenPGP (or even, I suppose, CMS),
we're going to be using SHA-2, so it doesn't make sense to have our
security depend on two separate algorithms when either one of them
alone could break the security when we could just depend on one.
So SHA-256 it is. Update the hash-function-transition design doc to
say so.
After this patch, there are no remaining instances of the string
"NewHash", except for an unrelated use from 2008 as a variable name in
t/t9700/test.pl.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Acked-by: Dan Shumow <danshu@microsoft.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A fair number of tests need to check that the filesystem supports file
names including "funny" characters, like newline, tab, and double-quote.
Jonathan Nieder suggested that this be extracted into a lazy prereq in
the top-level `test-lib.sh`. This patch effects that change.
The FUNNYNAMES prereq now uniformly requires support for newlines, tabs,
and double-quotes in filenames. This very slightly decreases the power
of some tests, which might have run previously on a system that supports
(e.g.) newlines and tabs but not double-quotes, but now will not. This
seems to me like an acceptable tradeoff for consistency.
One test (`t/t9902-completion.sh`) defined FUNNYNAMES to further require
the separators \034 through \037, the test for which was implemented
using the Bash-specific $'\034' syntax. I've elected to leave this one
as is, renaming it to FUNNIERNAMES.
After this patch, `git grep 'test_\(set\|lazy\)_prereq.*FUNNYNAMES'` has
only one result.
Signed-off-by: William Chargin <wchargin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 3ac68a93fd (help: add --config to list all available config -
2018-05-26) makes generate-cmdlist.sh adds a new input source
config.txt but it's not a Makefile dependency. Any changes in
config.txt will not trigger command-list.h regeneration and the config
list in this file becomes outdated. Correct the dependency.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --exec option's implementation is not really well-prepared for
--rebase-merges. Demonstrate this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests for "git am --[no-]scissors" [1] work in the following way:
1. Create files with commit messages
2. Use these files to create expected commits
3. Generate eml file with patch from expected commits
4. Create commits using git am with these eml files
5. Compare these commits with expected
The test for "git am --scissors" is supposed to take an e-mail with a
scissors line and in-body "Subject:" header and demonstrate that the
subject line from the e-mail itself is overridden by the in-body header
and that only text below the scissors line is included in the commit
message of the commit created by the invocation of "git am --scissors".
However, the setup of the test incorrectly uses a commit without the
scissors line and without the in-body header in the commit message,
producing eml file not suitable for testing of "git am --scissors".
This can be checked by intentionally breaking is_scissors_line function
in mailinfo.c, for example, by changing string ">8", which is used by
the test. With such change the test should fail, but does not.
Fix broken test by generating eml file with scissors line and in-body
header "Subject:". Since the two tests for --scissors and --no-scissors
options are there to test cutting or keeping the commit message, update
both tests to change the test file in the same way, which allows us to
generate only one eml file to be passed to git am. To clarify the
intention of the test, give files and tags more explicit names.
[1]: introduced in bf72ac17d (t4150: tests for am --[no-]scissors,
2015-07-19)
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Paul Tan <pyokagan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows' original 4aa8b8c8283 (Teach 'git pull' to handle
--rebase=interactive, 2011-10-21) had support for the very convenient
abbreviation
git pull --rebase=i
which was later lost when it was ported to the builtin `git pull`, and
it was not introduced before the patch eventually made it into Git as
f5eb87b98d (pull: allow interactive rebase with --rebase=interactive,
2016-01-13).
However, it is *really* a useful short hand for the occasional rebasing
pull on branches that do not usually want to be rebased.
So let's reintroduce this convenience, at long last.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After making a change to the documentation, it's easy to
forget to check the rendered version to make sure it was
formatted as you intended. And simply doing a diff between
the two built versions is less trivial than you might hope:
- diffing the roff or html output isn't particularly
readable; what we really care about is what the end user
will see
- you have to tweak a few build variables to avoid
spurious differences (e.g., version numbers, build
times)
Let's provide a script that builds and installs the manpages
for two commits, renders the results using "man", and diffs
the result. Since this is time-consuming, we'll also do our
best to avoid repeated work, keeping intermediate results
between runs.
Some of this could probably be made a little less ugly if we
built support into Documentation/Makefile. But by relying
only on "make install-man" working, this script should work
for generating a diff between any two versions, whether they
include this script or not.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The color group in config.txt is actually sorted but changes in
sb/blame-color broke this. Reorder color.blame.* and move
blame.coloring back to the rest of blame.* (and reorder that group too
while we're there)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test description looks like it was written with the originally
observed behavior ("causes segfault") rather than the desired and now
current behavior ("does not cause segfault"). Fix it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
want_color_fd() is designed to work only with standard output and
error file descriptors and stores information about each descriptor in
an array. However, it doesn't verify that the passed-in descriptor
lives within that set, which, with a buggy caller, could lead to
access or assignment outside the array bounds.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parseopt wraps argument help strings in a pair of angular brackets by
default, to tell users that they need to replace it with an actual
value. This is useful in most cases, because most option arguments
are indeed single values of a certain type. The option
PARSE_OPT_LITERAL_ARGHELP needs to be used in option definitions with
arguments that have multiple parts or are literal strings.
Stop adding these angular brackets if special characters are present,
as they indicate that we don't deal with a simple placeholder. This
simplifies the code a bit and makes defining special options slightly
easier.
Remove the flag PARSE_OPT_LITERAL_ARGHELP in the cases where the new
and more cautious handling suffices.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wrap the placeholders in the option help string for -w in pairs of
angular brackets to document that users need to replace them with actual
numbers. Use the flag PARSE_OPT_LITERAL_ARGHELP to prevent parseopt
from adding another pair.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wrap each part of the argument help string in angular brackets to show
that users need to replace them with actual values. Do that explicitly
to balance the pairs nicely in the code and avoid confusing casual
readers. Add the flag PARSE_OPT_LITERAL_ARGHELP to keep parseopt from
adding another pair.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Wrap both placeholders in the argument help string in angular brackets
to signal that users needs replace them with some actual value. Use the
flag PARSE_OPT_LITERAL_ARGHELP to prevent parseopt from adding another
pair.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parseopt wraps arguments in a pair of angular brackets by default,
signifying that the user needs to replace it with a value of the
documented type. Remove the pairs from the option definitions to
duplication and confusion.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Don't translate the argument specification for --chmod; "+x" and "-x"
are the literal strings that the commands accept.
Separate alternatives using a pipe character instead of a slash, for
consistency.
Use the flag PARSE_OPT_LITERAL_ARGHELP to prevent parseopt from adding a
pair of angular brackets around the argument help string, as that would
wrongly indicate that users need to replace the literal strings with
some kind of value.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option help text for the force-with-lease option to "git push"
reads like this:
$ git push -h 2>&1 | grep -e force-with-lease
--force-with-lease[=<refname>:<expect>]
which comes from having N_("refname>:<expect") as the argument help
text in the source code, with an aparent lack of "<" and ">" at both
ends.
It turns out that parse-options machinery takes the whole string and
encloses it inside a pair of "<>", to make it easier for majority
cases that uses a single token placeholder.
The help string was written in a funnily unbalanced way knowing that
the end result would balance out, by somebody who forgot the
presence of PARSE_OPT_LITERAL_ARGHELP, which is the escape hatch
mechanism designed to help such a case. We just should use the
official escape hatch instead.
Because ":<expect>" part can be omitted to ask Git to guess, it may
be more correct to spell it as "<refname>[:<expect>]", but that is
not the focus of this topic.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Look for broken "&&" chains that are hidden in subshell, many of
which have been found and corrected.
* es/chain-lint-in-subshell:
t/chainlint.sed: drop extra spaces from regex character class
t/chainlint: add chainlint "specialized" test cases
t/chainlint: add chainlint "complex" test cases
t/chainlint: add chainlint "cuddled" test cases
t/chainlint: add chainlint "loop" and "conditional" test cases
t/chainlint: add chainlint "nested subshell" test cases
t/chainlint: add chainlint "one-liner" test cases
t/chainlint: add chainlint "whitespace" test cases
t/chainlint: add chainlint "basic" test cases
t/Makefile: add machinery to check correctness of chainlint.sed
t/test-lib: teach --chain-lint to detect broken &&-chains in subshells
The lazy clone support had a few places where missing but promised
objects were not correctly tolerated, which have been fixed.
* jt/tags-to-promised-blobs-fix:
tag: don't warn if target is missing but promised
revision: tolerate promised targets of tags
Add a server-side knob to skip commits in exponential/fibbonacci
stride in an attempt to cover wider swath of history with a smaller
number of iterations, potentially accepting a larger packfile
transfer, instead of going back one commit a time during common
ancestor discovery during the "git fetch" transaction.
* jt/fetch-negotiator-skipping:
negotiator/skipping: skip commits during fetch
"git send-email" when using in a batched mode that limits the
number of messages sent in a single SMTP session lost the contents
of the variable used to choose between tls/ssl, unable to send the
second and later batches, which has been fixed.
* jm/send-email-tls-auth-on-batch:
send-email: fix tls AUTH when sending batch
"git rebase" started exporting GIT_DIR environment variable and
exposing it to hook scripts when part of it got rewritten in C.
Instead of matching the old scripted Porcelains' behaviour,
compensate by also exporting GIT_WORK_TREE environment as well to
lessen the damage. This can harm existing hooks that want to
operate on different repository, but the current behaviour is
already broken for them anyway.
* bc/sequencer-export-work-tree-as-well:
sequencer: pass absolute GIT_WORK_TREE to exec commands
Tests to cover conflict cases that involve submodules have been
added for merge-recursive.
* en/t7405-recursive-submodule-conflicts:
t7405: verify 'merge --abort' works after submodule/path conflicts
t7405: add a directory/submodule conflict
t7405: add a file/submodule conflict
Tests to cover various conflicting cases have been added for
merge-recursive.
* en/t6036-merge-recursive-tests:
t6036: add a failed conflict detection case: regular files, different modes
t6036: add a failed conflict detection case with conflicting types
t6036: add a failed conflict detection case with submodule add/add
t6036: add a failed conflict detection case with submodule modify/modify
t6036: add a failed conflict detection case with symlink add/add
t6036: add a failed conflict detection case with symlink modify/modify
The recursive merge strategy did not properly ensure there was no
change between HEAD and the index before performing its operation,
which has been corrected.
* en/dirty-merge-fixes:
merge: fix misleading pre-merge check documentation
merge-recursive: enforce rule that index matches head before merging
t6044: add more testcases with staged changes before a merge is invoked
merge-recursive: fix assumption that head tree being merged is HEAD
merge-recursive: make sure when we say we abort that we actually abort
t6044: add a testcase for index matching head, when head doesn't match HEAD
t6044: verify that merges expected to abort actually abort
index_has_changes(): avoid assuming operating on the_index
read-cache.c: move index_has_changes() from merge.c
"git rebase --rebase-merges" mode now handles octopus merges as
well.
* js/rebase-merge-octopus:
rebase --rebase-merges: adjust man page for octopus support
rebase --rebase-merges: add support for octopus merges
merge: allow reading the merge commit message from a file
"git gc --auto" opens file descriptors for the packfiles before
spawning "git repack/prune", which would upset Windows that does
not want a process to work on a file that is open by another
process. The issue has been worked around.
* kg/gc-auto-windows-workaround:
gc --auto: release pack files before auto packing
For a large tree, the index needs to hold many cache entries
allocated on heap. These cache entries are now allocated out of a
dedicated memory pool to amortize malloc(3) overhead.
* jm/cache-entry-from-mem-pool:
block alloc: add validations around cache_entry lifecyle
block alloc: allocate cache entries from mem_pool
mem-pool: fill out functionality
mem-pool: add life cycle management functions
mem-pool: only search head block for available space
block alloc: add lifecycle APIs for cache_entry structs
read-cache: teach make_cache_entry to take object_id
read-cache: teach refresh_cache_entry to take istate
"git fetch" learned a new option "--negotiation-tip" to limit the
set of commits it tells the other end as "have", to reduce wasted
bandwidth and cycles, which would be helpful when the receiving
repository has a lot of refs that have little to do with the
history at the remote it is fetching from.
* jt/fetch-nego-tip:
fetch-pack: support negotiation tip whitelist
Various glitches in the heuristics of merge-recursive strategy have
been documented in new tests.
* en/t6042-insane-merge-rename-testcases:
t6042: add testcase covering long chains of rename conflicts
t6042: add testcase covering rename/rename(2to1)/delete/delete conflict
t6042: add testcase covering rename/add/delete conflict type
Parsing of -L[<N>][,[<M>]] parameters "git blame" and "git log"
take has been tweaked.
* is/parsing-line-range:
log: prevent error if line range ends past end of file
blame: prevent error if range ends past end of file
Code restructuring and a small fix to transport protocol v2 during
fetching.
* jt/fetch-pack-negotiator:
fetch-pack: introduce negotiator API
fetch-pack: move common check and marking together
fetch-pack: make negotiation-related vars local
fetch-pack: use ref adv. to prune "have" sent
fetch-pack: directly end negotiation if ACK ready
fetch-pack: clear marks before re-marking
fetch-pack: split up everything_local()
"git checkout" and "git worktree add" learned to honor
checkout.defaultRemote when auto-vivifying a local branch out of a
remote tracking branch in a repository with multiple remotes that
have tracking branches that share the same names.
* ab/checkout-default-remote:
checkout & worktree: introduce checkout.defaultRemote
checkout: add advice for ambiguous "checkout <branch>"
builtin/checkout.c: use "ret" variable for return
checkout: pass the "num_matches" up to callers
checkout.c: change "unique" member to "num_matches"
checkout.c: introduce an *_INIT macro
checkout.h: wrap the arguments to unique_tracking_name()
checkout tests: index should be clean after dwim checkout
"git diff --color-moved" feature has further been tweaked.
* sb/diff-color-move-more:
diff.c: offer config option to control ws handling in move detection
diff.c: add white space mode to move detection that allows indent changes
diff.c: factor advance_or_nullify out of mark_color_as_moved
diff.c: decouple white space treatment from move detection algorithm
diff.c: add a blocks mode for moved code detection
diff.c: adjust hash function signature to match hashmap expectation
diff.c: do not pass diff options as keydata to hashmap
t4015: avoid git as a pipe input
xdiff/xdiffi.c: remove unneeded function declarations
xdiff/xdiff.h: remove unused flags
"git fsck" learns to make sure the optional commit-graph file is in
a sane state.
* ds/commit-graph-fsck: (23 commits)
coccinelle: update commit.cocci
commit-graph: update design document
gc: automatically write commit-graph files
commit-graph: add '--reachable' option
commit-graph: use string-list API for input
fsck: verify commit-graph
commit-graph: verify contents match checksum
commit-graph: test for corrupted octopus edge
commit-graph: verify commit date
commit-graph: verify generation number
commit-graph: verify parent list
commit-graph: verify root tree OIDs
commit-graph: verify objects exist
commit-graph: verify corrupt OID fanout and lookup
commit-graph: verify required chunks are present
commit-graph: verify catches corrupt signature
commit-graph: add 'verify' subcommand
commit-graph: load a root tree from specific graph
commit: force commit to parse from object database
commit-graph: parse commit from chosen graph
...
Recent "security fix" to pay attention to contents of ".gitmodules"
while accepting "git push" was a bit overly strict than necessary,
which has been adjusted.
* jk/fsck-gitmodules-gently:
fsck: downgrade gitmodulesParse default to "info"
fsck: split ".gitmodules too large" error from parse failure
fsck: silence stderr when parsing .gitmodules
config: add options parameter to git_config_from_mem
config: add CONFIG_ERROR_SILENT handler
config: turn die_on_error into caller-facing enum
Conversion from uchar[40] to struct object_id continues.
* bc/object-id:
pretty: switch hard-coded constants to the_hash_algo
sha1-file: convert constants to uses of the_hash_algo
log-tree: switch GIT_SHA1_HEXSZ to the_hash_algo->hexsz
diff: switch GIT_SHA1_HEXSZ to use the_hash_algo
builtin/merge-recursive: make hash independent
builtin/merge: switch to use the_hash_algo
builtin/fmt-merge-msg: make hash independent
builtin/update-index: simplify parsing of cacheinfo
builtin/update-index: convert to using the_hash_algo
refs/files-backend: use the_hash_algo for writing refs
sha1-name: use the_hash_algo when parsing object names
strbuf: allocate space with GIT_MAX_HEXSZ
commit: express tree entry constants in terms of the_hash_algo
hex: switch to using the_hash_algo
tree-walk: replace hard-coded constants with the_hash_algo
cache: update object ID functions for the_hash_algo
Tests to cover more D/F conflict cases have been added for
merge-recursive.
* en/t6036-recursive-corner-cases:
t6036: fix broken && chain in sub-shell
t6036: add lots of detail for directory/file conflicts in recursive case
httpd tests saw occasional breakage due to the way its access log
gets inspected by the tests, which has been updated to make them
less flaky.
* sg/httpd-test-unflake:
t/lib-httpd: avoid occasional failures when checking access.log
t/lib-httpd: add the strip_access_log() helper function
t5541: clean up truncating access log
"git pull --rebase" on a corrupt HEAD caused a segfault. In
general we substitute an empty tree object when running the in-core
equivalent of the diff-index command, and the codepath has been
corrected to do so as well to fix this issue.
* jk/has-uncommitted-changes-fix:
has_uncommitted_changes(): fall back to empty tree
In score_trees(), we walk over two sorted trees to find
which entries are missing or have different content between
the two. So if we have two trees with these entries:
one two
--- ---
a a
b c
c d
we'd expect the loop to:
- compare "a" to "a"
- compare "b" to "c"; because these are sorted lists, we
know that the second tree does not have "b"
- compare "c" to "c"
- compare "d" to end-of-list; we know that the first tree
does not have "d"
And prior to d8febde370 (match-trees: simplify score_trees()
using tree_entry(), 2013-03-24) that worked. But after that
commit, we mistakenly increment the tree pointers for every
loop iteration, even when we've processed the entry for only
one side. As a result, we end up doing this:
- compare "a" to "a"
- compare "b" to "c"; we know that we do not have "b", but
we still increment both tree pointers; at this point
we're out of sync and all further comparisons are wrong
- compare "c" to "d" and mistakenly claim that the second
tree does not have "c"
- exit the loop, mistakenly not realizing that the first
tree does not have "d"
So contrary to the claim in d8febde370, we really do need to
manually use update_tree_entry(), because advancing the tree
pointer depends on the entry comparison.
That means we must stop using tree_entry() to access each
entry, since it auto-advances the pointer. Instead:
- we'll use tree_desc.size directly to know if there's
anything left to look at (which is what tree_entry() was
doing under the hood)
- rather than do an extra struct assignment to "e1" and
"e2", we can just access the "entry" field of tree_desc
directly
That makes us a little more intimate with the tree_desc
code, but that's not uncommon for its callers.
The included test shows off the bug by adding a new entry
"bar.t", which sorts early in the tree and de-syncs the
comparison for "foo.t", which comes after.
Reported-by: George Shammas <georgyo@gmail.com>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When matching a non-wildcard LHS of a refspec against a list of
refs, find_ref_by_name_abbrev() returns the first ref that matches
using any DWIM rules used by refname_match() in refs.c, even if a
better match occurs later in the list of refs.
This causes unexpected behavior when (for example) fetching using
the refspec "refs/heads/s:<something>" from a remote with both
"refs/heads/refs/heads/s" and "refs/heads/s"; even if the former was
inadvertently created, one would still expect the latter to be
fetched. Similarly, when both a tag T and a branch T exist,
fetching T should favor the tag, just like how local refname
disambiguation rule works. But because the code walks over
ls-remote output from the remote, which happens to be sorted in
alphabetical order and has refs/heads/T before refs/tags/T, a
request to fetch T is (mis)interpreted as fetching refs/heads/T.
Update refname_match(), all of whose current callers care only if it
returns non-zero (i.e. matches) to see if an abbreviated name can
mean the full name being tested, so that it returns a positive
integer whose magnitude can be used to tell the precedence, and fix
the find_ref_by_name_abbrev() function not to stop at the first
match but find the match with the highest precedence.
This is based on an earlier work, which special cased only the exact
matches, by Jonathan Tan.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a user fetches:
- at least one up-to-date ref and at least one non-up-to-date ref,
- using HTTP with protocol v0 (or something else that uses the fetch
command of a remote helper)
some refs might not be updated after the fetch.
This bug was introduced in commit 989b8c4452 ("fetch-pack: put shallow
info in output parameter", 2018-06-28) which allowed transports to
report the refs that they have fetched in a new out-parameter
"fetched_refs". If they do so, transport_fetch_refs() makes this
information available to its caller.
Users of "fetched_refs" rely on the following 3 properties:
(1) it is the complete list of refs that was passed to
transport_fetch_refs(),
(2) it has shallow information (REF_STATUS_REJECT_SHALLOW set if
relevant), and
(3) it has updated OIDs if ref-in-want was used (introduced after
989b8c4452).
In an effort to satisfy (1), whenever transport_fetch_refs()
filters the refs sent to the transport, it re-adds the filtered refs to
whatever the transport supplies before returning it to the user.
However, the implementation in 989b8c4452 unconditionally re-adds the
filtered refs without checking if the transport refrained from reporting
anything in "fetched_refs" (which it is allowed to do), resulting in an
incomplete list, no longer satisfying (1).
An earlier effort to resolve this [1] solved the issue by readding the
filtered refs only if the transport did not refrain from reporting in
"fetched_refs", but after further discussion, it seems that the better
solution is to revert the API change that introduced "fetched_refs".
This API change was first suggested as part of a ref-in-want
implementation that allowed for ref patterns and, thus, there could be
drastic differences between the input refs and the refs actually fetched
[2]; we eventually decided to only allow exact ref names, but this API
change remained even though its necessity was decreased.
Therefore, revert this API change by reverting commit 989b8c4452, and
make receive_wanted_refs() update the OIDs in the sought array (like how
update_shallow() updates shallow information in the sought array)
instead. A test is also included to show that the user-visible bug
discussed at the beginning of this commit message no longer exists.
[1] https://public-inbox.org/git/20180801171806.GA122458@google.com/
[2] https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathantanmy@google.com/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `p4-pre-submit` hook is executed before git-p4 submits code.
If the hook exits with non-zero value, submit process not start.
Signed-off-by: Chen Bin <chenbin.sh@gmail.com>
Reviewed-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Skip searching for better indentation heuristics if we'd slide a hunk more
than its size. This is the easiest fix proposed in the analysis[1] in
response to a patch that mercurial took for xdiff to limit searching
by a constant. Using a performance test as:
#!python
open('a', 'w').write(" \n" * 1000000)
open('b', 'w').write(" \n" * 1000001)
This patch reduces the execution of "git diff --no-index a b" from
0.70s to 0.31s. However limiting the sliding to the size of the diff hunk,
which was proposed as a solution (that I found easiest to implement for
now) is not optimal for cases like
open('a', 'w').write(" \n" * 1000000)
open('b', 'w').write(" \n" * 2000000)
as then we'd still slide 1000000 times.
In addition to limiting the sliding to size of the hunk, also limit by a
constant. Choose 100 lines as the constant as that fits more than a screen,
which really means that the diff sliding is probably not providing a lot
of benefit anyway.
[1] https://public-inbox.org/git/72ac1ac2-f567-f241-41d6-d0f83072e0b3@alum.mit.edu/
Reported-by: Jun Wu <quark@fb.com>
Analysis-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This documents current behavior of the config machinery, when changing
the value of some settings. This patch just serves to provide a baseline
for the follow up that will fix some issues with the current behavior.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users interested in the fetch.negotiationAlgorithm variable added in
42cc7485a2 ("negotiator/skipping: skip commits during fetch",
2018-07-16) are probably interested in the related --negotiation-tip
option added in 3390e42adb ("fetch-pack: support negotiation tip
whitelist", 2018-07-02).
Change the documentation for those two to reference one another to
point readers in the right direction.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the handling of fetch.negotiationAlgorithm=<str> to error out
on unknown strings, i.e. everything except "default" or "skipping".
This changes the behavior added in 42cc7485a2 ("negotiator/skipping:
skip commits during fetch", 2018-07-16) which would ignore all unknown
values and silently fall back to the "default" value.
For a feature like this it's much better to produce an error than
proceed. We don't want users to debug some amazingly slow fetch that
should benefit from "skipping", only to find that they'd forgotten to
deploy the new git version on that particular machine.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The trash directory of a failed test might contain invaluable
information about the cause of the failure, but we have no access to
the trash directories of Travis CI build jobs. The only feedback we
get from there is the build job's trace log, so...
Modify 'ci/print-test-failures.sh' to create a tar.gz archive of the
trash directory of each failed test, encode that archive with base64,
and print the resulting block of ASCII text, so it gets embedded in
the trace log. Furthermore, run tests with '--immediate' to
faithfully preserve the failed state.
Extracting the trash directories from the trace log turned out to be a
bit of a hassle, partly because of the size of these logs (usually
resulting in several hundreds or even thousands of lines of
base64-encoded text), and partly because these logs have CRLF, CRCRLF
and occasionally even CRCRCRLF line endings, which cause 'base64 -d'
from coreutils to complain about "invalid input". For convenience add
a small script 'ci/util/extract-trash-dirs.sh', which will extract and
unpack all base64-encoded trash directories embedded in the log fed to
its standard input, and include an example command to be copy-pasted
into a terminal to do it all at the end of the failure report.
A few of our tests create sizeable trash directories, so limit the
size of each included base64-encoded block, let's say, to 1MB. And
just in case something fundamental gets broken and a lot of tests fail
at once, don't include trash directories when the combined size of the
included base64-encoded blocks would exceed 1MB.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch to the _DUP variant of string_list for remote_branches to allow
string_list_clear() to release the allocated memory at the end, and
actually call that function. Free the util pointer as well; it is
allocated in read_remote_branches().
NB: This string_list is empty until read_remote_branches() is called
via for_each_ref(), so there is no need to clean it up when returning
before that point.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_index_unmerged() has two intended purposes:
* return 1 if there are any unmerged entries, 0 otherwise
* drops any higher-stage entries down to stage #0
There are several callers of read_index_unmerged() that check the return
value to see if it is non-zero, all of which then die() if that condition
is met. For these callers, dropping higher-stage entries down to stage #0
is a waste of resources, and returning immediately on first unmerged entry
would be better. But it's probably only a very minor difference and isn't
the focus of this series.
The remaining callers ignore the return value and call this function for
the side effect of dropping higher-stage entries down to stage #0. As
mentioned in commit e11d7b5969 ("'reset --merge': fix unmerged case",
2009-12-31),
The _only_ reason we want to keep a previously unmerged entry in the
index at stage #0 is so that we don't forget the fact that we have
corresponding file in the work tree in order to be able to remove it
when the tree we are resetting to does not have the path.
In fact, prior to commit d1a43f2aa4 ("reset --hard/read-tree --reset -u:
remove unmerged new paths", 2008-10-15), read_index_unmerged() did just
remove unmerged entries from the cache immediately but that had the
unwanted effect of leaving around new untracked files in the tree from
aborted merges.
So, that's the intended purpose of this function. The problem is that
when directory/files conflicts are present, trying to add the file to the
index at stage 0 fails (because there is still a directory in the way),
and the function returns early with a -1 return code to signify the error.
As noted above, none of the callers who want the drop-to-stage-0 behavior
check the return status, though, so this means all remaining unmerged
entries remain in the index and the callers proceed assuming otherwise.
Users then see errors of the form:
error: 'DIR-OR-FILE' appears as both a file and as a directory
error: DIR-OR-FILE: cannot drop to stage #0
and potentially also messages about other unmerged entries which came
lexicographically later than whatever pathname was both a file and a
directory. Google finds a few hits searching for those messages,
suggesting there were probably a couple people who hit this besides me.
Luckily, calling `git reset --hard` multiple times would workaround
this bug.
Since the whole purpose here is to just put the entry *temporarily* into
the index so that any associated file in the working copy can be removed,
we can just skip the DFCHECK and allow both the file and directory to
appear in the index. The temporary simultaneous appearance of the
directory and file entries in the index will be removed by the callers
by calling unpack_trees(), which excludes these unmerged entries marked
with CE_CONFLICTED flag from the resulting index, before they attempt to
write the index anywhere.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several "recovery" commands outright fail or do not fully recover
when directory-file conflicts are present. This includes:
* git read-tree --reset HEAD
* git am --skip
* git am --abort
* git merge --abort
* git reset --hard
Add testcases documenting these shortcomings.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_author_ident() is careful to handle errors "gently" when parsing
"rebase-merge/author-script" by printing a suitable warning and
returning NULL; it never die()'s. One possible reason that parsing might
fail is that "rebase-merge/author-script" has been hand-edited in such a
way which corrupts it or the information it contains.
However, read_author_ident() invokes fmt_ident() which is not so careful
about failing "gently". It will die() if it encounters a malformed
timestamp. Since read_author_ident() doesn't want to die() and since
it's dealing with possibly hand-edited data, take care to avoid passing
a bogus timestamp to fmt_ident().
A more "correctly engineered" fix would be to add a "gentle" version of
fmt_ident(), however, such a change it outside the scope of the bug-fix
series. If fmt_ident() ever does grow a "gentle" cousin, then the manual
timestamp check added here can be retired.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git rebase -i --root" creates a new root commit, it corrupts the
"author" header's timestamp by prepending a "@":
author A U Thor <author@example.com> @1112912773 -0700
The commit parser is very strict about the format of the "author"
header, and does not allow a "@" in that position.
The "@" comes from GIT_AUTHOR_DATE in "rebase-merge/author-script",
signifying a Unix epoch-based timestamp, however, read_author_ident()
incorrectly allows it to slip into the commit's "author" header, thus
corrupting it.
One possible fix would be simply to filter out the "@" when constructing
the "author" header timestamp, however, a more correct fix is to parse
the GIT_AUTHOR_DATE date (via parse_date()) and format the parsed result
into the "author" header. Since "rebase-merge/author-script" may be
edited by the user, this approach has the extra benefit of catching
other potential timestamp corruption due to hand-editing.
We can do better than calling parse_date() ourselves and constructing
the "author" header manually, however, by instead taking advantage of
fmt_ident() which does this work for us.
The benefits of using fmt_ident() are twofold. First, it simplifies the
logic considerably by allowing us to avoid the complexity of building
the "author" header in parallel with and in the same buffer from which
"rebase-merge/author-script" is being parsed. Instead, fmt_ident() is
invoked to compose the header after parsing is complete.
Second, fmt_ident() is careful to prevent "crud" from polluting the
composed ident. As with validating GIT_AUTHOR_DATE, this "crud"
avoidance prevents other (possibly hand-edited) bogus author information
from "rebase-merge/author-script" from corrupting the commit object.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git rebase -i --root" creates a new root commit, it corrupts the
"author" header's timezone by repeating the last digit:
author A U Thor <author@example.com> @1112912773 -07000
This is due to two bugs.
First, write_author_script() neglects to add the closing quote to the
value of GIT_AUTHOR_DATE when generating "rebase-merge/author-script".
Second, although sq_dequote() correctly diagnoses the missing closing
quote, read_author_ident() ignores sq_dequote()'s return value and
blindly uses the result of the aborted dequote.
sq_dequote() performs dequoting in-place by removing quoting and
shifting content downward. When it detects misquoting (lack of closing
quote, in this case), it gives up and returns an error without inserting
a NUL-terminator at the end of the shifted content, which explains the
duplicated last digit in the timezone.
(Note that the "@" preceding the timestamp is a separate bug which
will be fixed subsequently.)
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git rebase -i --root" creates a new root commit (say, by swapping
in a different commit for the root), it corrupts the commit's "author"
header with trailing garbage:
author A U Thor <author@example.com> @1112912773 -07000or@example.com
This is a result of read_author_ident() neglecting to NUL-terminate the
buffer into which it composes the "author" header.
(Note that the "@" preceding the timestamp and the extra "0" in the
timezone are separate bugs which will be fixed subsequently.)
Security considerations: Construction of the "author" header by
read_author_ident() happens in-place and in parallel with parsing the
content of "rebase-merge/author-script" which occupies the same buffer.
This is possible because the constructed "author" header is always
smaller than the content of "rebase-merge/author-script". Despite
neglecting to NUL-terminate the constructed "author" header, memory is
never accessed (either by read_author_ident() or its caller) beyond the
allocated buffer since a NUL-terminator is present at the end of the
loaded "rebase-merge/author-script" content, and additional NUL's are
inserted as part of the parsing process.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This character class, like many others in this script, matches
horizontal whitespace consisting of spaces and tabs, however, a few
extra, entirely harmless, spaces somehow slipped into the expression.
Removing them is purely a cosmetic fix.
While at it, re-indent three lines with a single TAB each which were
incorrectly indented with six spaces. Also, a purely cosmetic fix.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the test that asserts that lightweight tags can only be
clobbered by a force-push to check do the same tests for annotated
tags.
There used to be less exhaustive tests for this with the code added in
40eff17999 ("push: require force for annotated tags", 2012-11-29), but
Junio removed them in 256b9d70a4 ("push: fix "refs/tags/ hierarchy
cannot be updated without --force"", 2013-01-16) while fixing some of
the behavior around tag pushing.
That change left us without any coverage asserting that pushing and
clobbering annotated tags worked as intended. There was no reason to
suspect that the receive machinery wouldn't behave the same way with
annotated tags, but now we know for sure.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the tests added in dbfeddb12e ("push: require force for refs
under refs/tags/", 2012-11-29) to assert that the same behavior
applies various other combinations of command-line option and
refspecs.
Supplying either "+" in refspec or "--force" is sufficient to clobber
the reference. With --no-force we still pay attention to "+" in the
refspec, and vice-versa with clobbering kicking in if there's no "+"
in the refspec but "+" is given.
This is consistent with how refspecs work for branches, where either
"+" or "--force" will enable clobbering, with neither taking priority
over the other.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a logic error that's been here since this test was added in
dbfeddb12e ("push: require force for refs under refs/tags/",
2012-11-29).
The intent of this test is to force-create a new tag pointing to
HEAD~, and then assert that pushing it doesn't work without --force.
Instead, the code was not creating a new tag at all, and then failing
to push the previous tag for the unrelated reason of providing a
refspec that doesn't make any sense.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove an invocation of 'git push' that's exactly the same as the one
on the preceding line. This was seemingly added by mistake in
dbfeddb12e ("push: require force for refs under refs/tags/",
2012-11-29) and doesn't affect the result of the test, the second
"push" was a no-op as there was nothing new to push.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Calling the test tag "Tag" will make for confusing reading later in
this series when making use of the "git push tag <name>"
feature. Let's call the tag testTag instead.
Changes code initially added in dbfeddb12e ("push: require force for
refs under refs/tags/", 2012-11-29).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This mixture of quoting, pipes, and here-docs to produce expected
results in shell variables is difficult to follow. Simplify by using
simpler constructs that write output to files instead.
Noticed because without this patch, t/chainlint is not able to
understand the script in order to validate that its subshells use an
unbroken &&-chain, causing "make -C contrib/subtree test" to fail with
error: bug in the test script: broken &&-chain or run-away HERE-DOC:
in t7900.21.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, the cSpell extension ignores all files under .git/. That
includes, unfortunately, COMMIT_EDITMSG, i.e. commit messages. However,
spell checking is *quite* useful when writing commit messages... And
since the user hardly ever opens any file inside .git (apart from commit
messages, the config, and sometimes interactive rebase's todo lists),
there is really not much harm in *not* ignoring .git/.
The default also ignores `node_modules/`, but that does not apply to
Git, so let's skip ignoring that, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The quite useful cSpell extension allows VS Code to have "squiggly"
lines under spelling mistakes. By default, this would add too much
clutter, though, because so much of Git's source code uses words that
would trigger cSpell.
Let's add a few words to make the spell checking more useful by reducing
the number of false positives.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds a couple settings for the .c/.h files so that it is easier to
conform to Git's conventions while editing the source code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When configuring VS Code as core.editor (via `code --wait`), we really
want to adhere to the Git conventions of wrapping commit messages.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The C/C++ settings are special, as they are the only generated VS Code
configurations that *will* change over the course of Git's development,
e.g. when a new constant is defined.
Therefore, let's only update the C/C++ settings, also to prevent user
modifications from being overwritten.
Ideally, we would keep user modifications in the C/C++ settings, but
that would require parsing JSON, a task for which a Unix shell script is
distinctly unsuited. So we write out .new files instead, and warn the
user if they may want to reconcile their changes.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helps VS Code's intellisense to figure out that we want to include
windows.h, and that we want to define the minimum target Windows version
as Windows Vista/2008R2.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While it is technically possible, it is confusing. Not only the user,
but also VS Code's intellisense.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sadly, we do not get all of the definitions via ALL_CFLAGS. Some defines
are passed to GCC *only* when compiling specific files, such as git.o.
Let's just hard-code them into the script for the time being.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
VS Code is a lightweight but powerful source code editor which runs on
your desktop and is available for Windows, macOS and Linux. Among other
languages, it has support for C/C++ via an extension, which offers to
not only build and debug the code, but also Intellisense, i.e.
code-aware completion and similar niceties.
This patch adds a script that helps set up the environment to work
effectively with VS Code: simply run the Unix shell script
contrib/vscode/init.sh, which creates the relevant files, and open the
top level folder of Git's source code in VS Code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These extra comments should be make it easier to understand how to use
locks in pack-objects delta search code. For reference, see
8ecce684a3 (basic threaded delta search - 2007-09-06)
384b32c09b (pack-objects: fix threaded load balancing - 2007-12-08)
50f22ada52 (threaded pack-objects: Use condition... - 2007-12-16)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6c213e863a ("http-backend: respect CONTENT_LENGTH for
receive-pack", 2018-07-27) adds a test which uses the non-portable
export construct. Replace it with "FOO=bar && export FOO" instead.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unlike ref advertisement, client capabilities and the first want are
separated by SP, not NUL, in the implementation. Fix the documentation
to align with the implementation. pack-protocol.txt is already fixed.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change various tests that use an idiom of the form:
>expect &&
test_cmp expect actual
To instead use:
test_must_be_empty actual
The test_must_be_empty() wrapper was introduced in ca8d148daf ("test:
test_must_be_empty helper", 2013-06-09). Many of these tests have been
added after that time. This was mostly found with, and manually pruned
from:
git grep '^\s+>.*expect.* &&$' t
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change various tests that use an idiom of the form:
>expect &&
test_cmp expect actual
To instead use:
test_must_be_empty actual
The test_must_be_empty() wrapper was introduced in ca8d148daf ("test:
test_must_be_empty helper", 2013-06-09). Many of these tests have been
added after that time. This was mostly found with, and manually pruned
from:
git grep '^\s+>.*expect.* &&$' t
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fsck.<msg-id> is set to an unknown value it'll cause "fsck" to
die, but the same is not true of the "fetch" and "receive"
variants. Document this and test for it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stress test the parsing logic shared by fsck.skipList and
{fetch,receive}.fsck.skipList added in cd94c6f91e ("fsck: git
receive-pack: support excluding objects from fsck'ing",
2015-06-22). There were no tests for the work done by the
init_skiplist() routine, e.g. how it dies on invalid input.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test and document that the {fetch,receive}.fsck.* family of variables
doesn't fall back on the corresponding .fsck.* variables.
This was alluded to in the existing documentation by saying that
"receive" looks at receive.fsck.* and "fsck" looks at fsck.* etc., but
it wasn't explicitly stated that there was no fallback, and if you'd
e.g. like to configure the skipList you need to do that for all three.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement support for fetch.fsck.* corresponding with the existing
receive.fsck.*. This allows for pedantically cloning repositories with
specific issues without turning off fetch.fsckObjects.
One such repository is https://github.com/robbyrussell/oh-my-zsh.git
which before this change will emit this error when cloned with
fetch.fsckObjects:
error: object 2b7227859263b6aabcc28355b0b994995b7148b6: zeroPaddedFilemode: contains zero-padded file modes
fatal: Error in object
fatal: index-pack failed
Now with fetch.fsck.zeroPaddedFilemode=warn we'll warn about that
issue, but the clone will succeed:
warning: object 2b7227859263b6aabcc28355b0b994995b7148b6: zeroPaddedFilemode: contains zero-padded file modes
warning: object a18c4d13c2a5fa2d4ecd5346c50e119b999b807d: zeroPaddedFilemode: contains zero-padded file modes
warning: object 84df066176c8da3fd59b13731a86d90f4f1e5c9d: zeroPaddedFilemode: contains zero-padded file modes
The motivation for this is to be able to turn on fetch.fsckObjects
globally across a fleet of computers but still be able to manually
clone various legacy repositories by either white-listing specific
issues, or better yet whitelist specific objects.
The use of --git-dir=* instead of -C in the tests could be considered
somewhat archaic, but the tests I'm adding here are duplicating the
corresponding receive.* tests with as few changes as possible.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tests for transfer.fsckObjects have grown organically over time to
not make much sense.
Initially when these were added in b10a53583f ("test: fetch/receive
with fsckobjects", 2011-09-04) they were only testing the "corrupt or
missing object" case, but later on in 70a4ae73d8 ("fsck: add a simple
test for receive.fsck.<msg-id>", 2015-06-22) they were expanded to
check for the fsck.<msg-id> feature.
The problem was that we still kept the same corrupt test repo, making
it harder to add new tests that check the entirety of the repository
between operations via "git fsck" to see whether only known issues
that can be ignored with fsck.<msg-id> have occurred.
The tests only did the right thing because such a full "git fsck" was
never done after a certain point, and instead we were only
manipulating specific refs. This makes it harder to add new tests, and
none of the fsck.<msg-id> tests relied on this.
So let's not confuse the two and repair the corrupt repository before
we run the fsck.<msg-id> tests.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the transfer.fsckObjects documentation to explicitly note the
unique security and/or corruption issues fetch.fsckObjects suffers
from, since it doesn't have a quarantine environment.
This was already alluded to in the existing documentation, but let's
spell it out so there's no confusion here, and give a concrete example
of how to work around this limitation.
Let's also prominently note that this is considered to be a limitation
of the current implementation, rather than something that's intended
and by design, since we might change this in the future.
See
https://public-inbox.org/git/20180531060259.GE17344@sigill.intra.peff.net/
for further details.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing documentation led the user to believe that all we were
doing were basic reachability sanity checks, but that hasn't been true
for a very long time. Update the description to match reality, and
note the caveat that there's a quarantine for accepting pushes, but
not for fetching.
Also mention that the fsck checks for security issues, which was my
initial motivation for writing this fetch.fsck.* series.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for the fsck.<msg-id> and receive.fsck.<msg-id>
variables was mostly duplicated in two places, with fsck.<msg-id>
making no mention of the corresponding receive.fsck.<msg-id>, and the
same for fsck.skipList.
I spent quite a lot of time today wondering why setting the
fsck.<msg-id> variant wasn't working to clone a legacy repository (not
that that would have worked anyway, but a subsequent patch implements
fetch.fsck.<msg-id>).
Rectify this situation by describing the feature in general terms
under the fsck.* documentation, and make the receive.fsck.*
documentation refer to those variables instead.
This documentation was initially added in 2becf00ff7 ("fsck: support
demoting errors to warnings", 2015-06-22) and 4b55b9b479 ("fsck:
document the new receive.fsck.<msg-id> options", 2015-06-22).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refer readers of fetch.fsckObjects and receive.fsckObjects to
transfer.fsckObjects instead of repeating the description at each
location.
I don't think this description of them makes much sense, but for now
I'm just moving the existing documentation around. Making it better
will be done in a later patch.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the setting of a receive.fsck.badDate config variable to
"ignore". This was added in efaba7cc77 ("fsck: optionally ignore
specific fsck issues completely", 2015-06-22) but never did anything,
presumably it was part of some work-in-progress code that never made
it into git.git.
None of these tests will emit the "invalid author/committer line - bad
date" warning. The dates on the commit objects we're setting up are
not invalid.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge-recursive takes any files marked as unmerged by unpack_trees,
tries to figure out whether they can be resolved (e.g. using renames
or a file-level merge), and then if they can be it will delete the old
cache entries and writes new ones. This means that any ce_flags for
those cache entries are essentially cleared when merging.
Unfortunately, if a file was marked as skip_worktree and it needs a
file-level merge but the merge results in the same version of the file
that was found in HEAD, we skip updating the worktree (because the
file was unchanged) but clear the skip_worktree bit (because of the
delete-cache-entry-and-write-new-one). This makes git treat the file
as having a local change in the working copy, namely a delete, when it
should appear as unchanged despite not being present. Avoid this
problem by copying the skip_worktree flag in this case.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent changes in merge_content() induced a bug when merging files that are
not present in the local working directory due to sparse-checkout. Add a
test case to demonstrate the bug so that we can ensure the fix resolves
it and to prevent future regressions.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Push passes to another commands, as described in
https://public-inbox.org/git/20171129032214.GB32345@sigill.intra.peff.net/
As it gets complicated to correctly track the data length, instead transfer
the data through parent process and cut the pipe as the specified length is
reached. Do it only when CONTENT_LENGTH is set, otherwise pass the input
directly to the forked commands.
Add tests for cases:
* CONTENT_LENGTH is set, script's stdin has more data, with all combinations
of variations: fetch or push, plain or compressed body, correct or truncated
input.
* CONTENT_LENGTH is specified to a value which does not fit into ssize_t.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When compiling under Apple LLVM version 9.1.0 (clang-902.0.39.2) with
"make DEVELOPER=1 DEVOPTS=pedantic", the compiler says
error: redeclaration of already-defined enum 'object_type' is a GNU
extension [-Werror,-Wgnu-redeclared-enum]
According to https://en.cppreference.com/w/c/language/declarations
(section "Redeclaration"), a repeated declaration after the definition
is only legal for structs and unions, but not for enums.
Drop the belated declaration of enum object_type and include cache.h
instead to make sure the enum is defined.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strncpy() function is less horrible than strcpy(), but
is still pretty easy to misuse because of its funny
termination semantics. Namely, that if it truncates it omits
the NUL terminator, and you must remember to add it
yourself. Even if you use it correctly, it's sometimes hard
for a reader to verify this without hunting through the
code. If you're thinking about using it, consider instead:
- strlcpy() if you really just need a truncated but
NUL-terminated string (we provide a compat version, so
it's always available)
- xsnprintf() if you're sure that what you're copying
should fit
- strbuf or xstrfmt() if you need to handle
arbitrary-length heap-allocated strings
Note that there is one instance of strncpy in
compat/regex/regcomp.c, which is fine (it allocates a
sufficiently large string before copying). But this doesn't
trigger the ban-list even when compiling with NO_REGEX=1,
because:
1. we don't use git-compat-util.h when compiling it
(instead we rely on the system includes from the
upstream library); and
2. It's in an "#ifdef DEBUG" block
Since it's doesn't trigger the banned.h code, we're better
off leaving it to keep our divergence from upstream minimal.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sprintf() function (and its variadic form vsprintf) make
it easy to accidentally introduce a buffer overflow. If
you're thinking of using them, you're better off either
using a dynamic string (strbuf or xstrfmt), or xsnprintf if
you really know that you won't overflow. The last sprintf()
call went away quite a while ago in f0766bf94e (fsck: use
for_each_loose_file_in_objdir, 2015-09-24).
Note that we respect HAVE_VARIADIC_MACROS here, which some
ancient platforms lack. As a fallback, we can just "guess"
that the caller will provide 3 arguments. If they do, then
the macro will work as usual. If not, then they'll get a
slightly less useful error, like:
git.c:718:24: error: macro "sprintf" passed 3 arguments, but takes just 2
That's not ideal, but it at least alerts them to the problem
area. And anyway, we're primarily targeting people adding
new code. Most developers should be on modern enough
platforms to see the normal "good" error message.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strcat() function has all of the same overflow problems
as strcpy(). And as a bonus, it's easy to end up
accidentally quadratic, as each subsequent call has to walk
through the existing string.
The last strcat() call went away in f063d38b80 (daemon: use
cld->env_array when re-spawning, 2015-09-24). In general,
strcat() can be replaced either with a dynamic string
(strbuf or xstrfmt), or with xsnprintf if you know the
length is bounded.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few standard C functions (like strcpy) which are
easy to misuse. E.g.:
char path[PATH_MAX];
strcpy(path, arg);
may overflow the "path" buffer. Sometimes there's an earlier
constraint on the size of "arg", but even in such a case
it's hard to verify that the code is correct. If the size
really is unbounded, you're better off using a dynamic
helper like strbuf:
struct strbuf path = STRBUF_INIT;
strbuf_addstr(path, arg);
or if it really is bounded, then use xsnprintf to show your
expectation (and get a run-time assertion):
char path[PATH_MAX];
xsnprintf(path, sizeof(path), "%s", arg);
which makes further auditing easier.
We'd usually catch undesirable code like this in a review,
but there's no automated enforcement. Adding that
enforcement can help us be more consistent and save effort
(and a round-trip) during review.
This patch teaches the compiler to report an error when it
sees strcpy (and will become a model for banning a few other
functions). This has a few advantages over a separate
linting tool:
1. We know it's run as part of a build cycle, so it's
hard to ignore. Whereas an external linter is an extra
step the developer needs to remember to do.
2. Likewise, it's basically free since the compiler is
parsing the code anyway.
3. We know it's robust against false positives (unlike a
grep-based linter).
The two big disadvantages are:
1. We'll only check code that is actually compiled, so it
may miss code that isn't triggered on your particular
system. But since presumably people don't add new code
without compiling it (and if they do, the banned
function list is the least of their worries), we really
only care about failing to clean up old code when
adding new functions to the list. And that's easy
enough to address with a manual audit when adding a new
function (which is what I did for the functions here).
2. If this ends up generating false positives, it's going
to be harder to disable (as opposed to a separate
linter, which may have mechanisms for overriding a
particular case).
But the intent is to only ban functions which are
obviously bad, and for which we accept using an
alternative even when this particular use isn't buggy
(e.g., the xsnprintf alternative above).
The implementation here is simple: we'll define a macro for
the banned function which replaces it with a reference to a
descriptively named but undeclared identifier. Replacing it
with any invalid code would work (since we just want to
break compilation). But ideally we'd meet these goals:
- it should be portable; ideally this would trigger
everywhere, and does not need to be part of a DEVELOPER=1
setup (because unlike warnings which may depend on the
compiler or system, this is a clear indicator of
something wrong in the code).
- it should generate a readable error that gives the
developer a clue what happened
- it should avoid generating too much other cruft that
makes it hard to see the actual error
- it should mention the original callsite in the error
The output with this patch looks like this (using gcc 7, on
a checkout with 022d2ac1f3 reverted, which removed the final
strcpy from blame.c):
CC builtin/blame.o
In file included from ./git-compat-util.h:1246,
from ./cache.h:4,
from builtin/blame.c:8:
builtin/blame.c: In function ‘cmd_blame’:
./banned.h:11:22: error: ‘sorry_strcpy_is_a_banned_function’ undeclared (first use in this function)
#define BANNED(func) sorry_##func##_is_a_banned_function
^~~~~~
./banned.h:14:21: note: in expansion of macro ‘BANNED’
#define strcpy(x,y) BANNED(strcpy)
^~~~~~
builtin/blame.c:1074:4: note: in expansion of macro ‘strcpy’
strcpy(repeated_meta_color, GIT_COLOR_CYAN);
^~~~~~
./banned.h:11:22: note: each undeclared identifier is reported only once for each function it appears in
#define BANNED(func) sorry_##func##_is_a_banned_function
^~~~~~
./banned.h:14:21: note: in expansion of macro ‘BANNED’
#define strcpy(x,y) BANNED(strcpy)
^~~~~~
builtin/blame.c:1074:4: note: in expansion of macro ‘strcpy’
strcpy(repeated_meta_color, GIT_COLOR_CYAN);
^~~~~~
This prominently shows the phrase "strcpy is a banned
function", along with the original callsite in blame.c and
the location of the ban code in banned.h. Which should be
enough to get even a developer seeing this for the first
time pointed in the right direction.
This doesn't match our ideals perfectly, but it's a pretty
good balance. A few alternatives I tried:
1. Instead of using an undeclared variable, using an
undeclared function. This shortens the message, because
the "each undeclared identifier" message is not needed
(and as you can see above, it triggers a separate
mention of each of the expansion points).
But it doesn't actually stop compilation unless you use
-Werror=implicit-function-declaration in your CFLAGS.
This is the case for DEVELOPER=1, but not for a default
build (on the other hand, we'd eventually produce a
link error pointing to the correct source line with the
descriptive name).
2. The linux kernel uses a similar mechanism in its
BUILD_BUG_ON_MSG(), where they actually declare the
function but do so with gcc's error attribute. But
that's not portable to other compilers (and it also
runs afoul of our error() macro).
We could make a gcc-specific technique and fallback on
other compilers, but it's probably not worth the
complexity. It also isn't significantly shorter than
the error message shown above.
3. We could drop the BANNED() macro, which would shorten
the number of lines in the error. But curiously,
removing it (and just expanding strcpy directly to the
bogus identifier) causes gcc _not_ to report the
original line of code.
So this strategy seems to be an acceptable mix of
information, portability, simplicity, and robustness,
without _too_ much extra clutter. I also tested it with
clang, and it looks as good (actually, slightly less
cluttered than with gcc).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changelog embedded in the document pre-dates the addition of the
document to git.git (it used to be a Google Doc), so it only goes up
to 752414ae43 ("technical doc: add a design doc for hash function
transition", 2017-09-27).
Since then I made some small edits to it, which would have been worthy
of including in this changelog (but weren't). Instead of amending it
to include these, just note that future changes will be noted in the
log.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --color-moved "dimmed_zebra" mode (with an underscore) is an
anachronism. Most options and modes are hyphenated. It is more difficult
to type and somewhat more difficult to read than those which are
hyphenated. Therefore, rename it to "dimmed-zebra", and nominally
deprecate "dimmed_zebra".
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the interest of code hygiene, make it easier to compile Git with the
flag -pedantic.
Pure pedantic compilation with GCC 7.3 results in one warning per use of
the translation macro `N_`:
warning: array initialized from parenthesized string constant [-Wpedantic]
Therefore also disable the parenthesising of i18n strings with
-DUSE_PARENS_AROUND_GETTEXT_N=0.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Honor core.commentchar when preparing the list of commits to replay
in "rebase -i".
* as/sequencer-customizable-comment-char:
sequencer: use configured comment character
Look for broken use of "VAR=VAL shell_func" in test scripts as part
of test-lint.
* es/test-lint-one-shot-export:
t/check-non-portable-shell: detect "FOO=bar shell_func"
t/check-non-portable-shell: make error messages more compact
t/check-non-portable-shell: stop being so polite
t6046/t9833: fix use of "VAR=VAL cmd" with a shell function
"git rev-parse ':/substring'" did not consider the history leading
only to HEAD when looking for a commit with the given substring,
when the HEAD is detached. This has been fixed.
* wc/find-commit-with-pattern-on-detached-head:
sha1-name.c: for ":/", find detached HEAD commits
"git reset --merge" (hence "git merge ---abort") and "git reset --hard"
had trouble working correctly in a sparsely checked out working
tree after a conflict, which has been corrected.
* mk/merge-in-sparse-checkout:
unpack-trees: do not fail reset because of unmerged skipped entry
Code clean-up.
* hs/push-cert-check-cleanup:
gpg-interface: make parse_gpg_output static and remove from interface header
builtin/receive-pack: use check_signature from gpg-interface
Handling of an empty range by "git cherry-pick" was inconsistent
depending on how the range ended up to be empty, which has been
corrected.
* jk/empty-pick-fix:
sequencer: don't say BUG on bogus input
sequencer: handle empty-set cases consistently
Partial clone support of "git clone" has been updated to correctly
validate the objects it receives from the other side. The server
side has been corrected to send objects that are directly
requested, even if they may match the filtering criteria (e.g. when
doing a "lazy blob" partial clone).
* jt/partial-clone-fsck-connectivity:
clone: check connectivity even if clone is partial
upload-pack: send refs' objects despite "filter"
The content-transfer-encoding of the message "git send-email" sends
out by default was 8bit, which can cause trouble when there is an
overlong line to bust RFC 5322/2822 limit. A new option 'auto' to
automatically switch to quoted-printable when there is such a line
in the payload has been introduced and is made the default.
* bc/send-email-auto-cte:
docs: correct RFC specifying email line length
send-email: automatically determine transfer-encoding
send-email: accept long lines with suitable transfer encoding
send-email: add an auto option for transfer encoding
The character display width table has been updated to match the
latest Unicode standard.
* bb/unicode-11-width:
unicode: update the width tables to Unicode 11
The codebase has been updated to compile cleanly with -pedantic
option.
* bb/pedantic:
utf8.c: avoid char overflow
string-list.c: avoid conversion from void * to function pointer
sequencer.c: avoid empty statements at top level
convert.c: replace "\e" escapes with "\033".
fixup! refs/refs-internal.h: avoid forward declaration of an enum
refs/refs-internal.h: avoid forward declaration of an enum
fixup! connect.h: avoid forward declaration of an enum
connect.h: avoid forward declaration of an enum
"git fast-import" has been updated to avoid attempting to create
delta against a zero-byte-long string, which is pointless.
* mh/fast-import-no-diff-delta-empty:
fast-import: do not call diff_delta() with empty buffer
The userdiff pattern for .php has been updated.
* kn/userdiff-php:
userdiff: support new keywords in PHP hunk header
t4018: add missing test cases for PHP
The help message shown in the editor to edit todo list in "rebase -p"
has regressed recently, which has been corrected.
* ag/rebase-p:
git-rebase--preserve-merges: fix formatting of todo help message
"git fetch" failed to correctly validate the set of objects it
received when making a shallow history deeper, which has been
corrected.
* jt/connectivity-check-after-unshallow:
fetch-pack: write shallow, then check connectivity
fetch-pack: implement ref-in-want
fetch-pack: put shallow info in output parameter
fetch: refactor to make function args narrower
fetch: refactor fetch_refs into two functions
fetch: refactor the population of peer ref OIDs
upload-pack: test negotiation with changing repository
upload-pack: implement ref-in-want
test-pkt-line: add unpack-sideband subcommand
The "--ignore-case" option of "git for-each-ref" (and its friends)
did not work correctly, which has been fixed.
* jk/for-each-ref-icase:
ref-filter: avoid backend filtering with --ignore-case
for-each-ref: consistently pass WM_IGNORECASE flag
t6300: add a test for --ignore-case
"git rebase" behaved slightly differently depending on which one of
the three backends gets used; this has been documented and an
effort to make them more uniform has begun.
* en/rebase-consistency:
git-rebase: make --allow-empty-message the default
t3401: add directory rename testcases for rebase and am
git-rebase.txt: document behavioral differences between modes
directory-rename-detection.txt: technical docs on abilities and limitations
git-rebase.txt: address confusion between --no-ff vs --force-rebase
git-rebase: error out when incompatible options passed
t3422: new testcases for checking when incompatible options passed
git-rebase.sh: update help messages a bit
git-rebase.txt: document incompatible options
"git checkout --recurse-submodules another-branch" did not report
in which submodule it failed to update the working tree, which
resulted in an unhelpful error message.
* sb/submodule-move-head-error-msg:
submodule.c: report the submodule that an error occurs in
"fsck.skipList" did not prevent a blob object listed there from
being inspected for is contents (e.g. we recently started to
inspect the contents of ".gitmodules" for certain malicious
patterns), which has been corrected.
* rj/submodule-fsck-skip:
fsck: check skiplist for object in fsck_blob()
All of the numeric formatting done by this function uses
"%u", but we pass in a signed "int". The actual range
doesn't matter here, since the conditional makes sure we're
always showing reasonably small numbers. And even gcc's
format-checker does not seem to mind. But it's potentially
confusing to a reader of the code to see the mismatch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we initially added the strbuf_readlink() function in
b11b7e13f4 (Add generic 'strbuf_readlink()' helper function,
2008-12-17), the point was that we generally have a _guess_
as to the correct size based on the stat information, but we
can't necessarily trust it.
Over the years, a few callers have grown up that simply pass
in 0, even though they have the stat information. Let's have
them pass in their hint for consistency (and in theory
efficiency, since it may avoid an extra resize/syscall loop,
but neither location is probably performance critical).
Note that st.st_size is actually an off_t, so in theory we
need xsize_t() here. But none of the other callsites use it,
and since this is just a hint, it doesn't matter either way
(if we wrap we'll simply start with a too-small hint and
then eventually complain when we cannot allocate the
memory).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return type of readlink() is ssize_t, not int. This
probably doesn't matter in practice, as it would require a
2GB symlink destination, but it doesn't hurt to be careful.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few strbuf functions store the length of a strbuf in a
temporary variable. We should always use size_t for this, as
it's possible for a strbuf to exceed an "int" (e.g., a 2GB
string on a 64-bit system). This is unlikely in practice,
but we should try to behave sensibly on silly or malicious
input.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The iconv interface takes a size_t, which is the appropriate
type for an in-memory buffer. But our reencode_string_*
functions use integers, meaning we may get confusing results
when the sizes exceed INT_MAX. Let's use size_t
consistently.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When converting a string with iconv, if the output buffer
isn't big enough, we grow it. But our growth is done without
any concern for integer overflow. So when we add:
outalloc = sofar + insz * 2 + 32;
we may end up wrapping outalloc (which is a size_t), and
allocating a too-small buffer. We then manipulate it
further:
outsz = outalloc - sofar - 1;
and feed outsz back to iconv. If outalloc is wrapped and
smaller than sofar, we'll end up with a small allocation but
feed a very large outsz to iconv, which could result in it
overflowing the buffer.
Can we use this to construct an attack wherein the victim
clones a repository with a very large commit object with an
encoding header, and running "git log" reencodes it into
utf8, causing an overflow?
An attack of this sort is likely impossible in practice.
"sofar" is how many output bytes we've written total, and
"insz" is the number of input bytes remaining. Imagine our
input doubles in size as we output it (which is easy to do
by converting latin1 to utf8, for example), and that we
start with N input bytes. Our initial output buffer also
starts at N bytes, so after the first call we'd have N/2
input bytes remaining (insz), and have written N bytes
(sofar). That means our next allocation will be
(N + N/2 * 2 + 32) bytes, or (2N + 32).
We can therefore overflow a 32-bit size_t with a commit
message that's just under 2^31 bytes, assuming it consists
mostly of "doubling" sequences (e.g., latin1 0xe1 which
becomes utf8 0xc3 0xa1).
But we'll never make it that far with such a message. We'll
be spending 2^31 bytes on the original string. And our
initial output buffer will also be 2^31 bytes. Which is not
going to succeed on a system with a 32-bit size_t, since
there will be other things using the address space, too. The
initial malloc will fail.
If we imagine instead that we can triple the size when
converting, then our second allocation becomes
(N + 2/3N * 2 + 32), or (7/3N + 32). That still requires two
allocations of 3/7 of our address space (6/7 of the total)
to succeed.
If we imagine we can quadruple, it becomes (5/2N + 32); we
need to be able to allocate 4/5 of the address space to
succeed.
This might start to get plausible. But is it possible to get
a 4-to-1 increase in size? Probably if you're converting to
some obscure encoding. But since git defaults to utf8 for
its output, that's the likely destination encoding for an
attack. And while there are 4-character utf8 sequences, it's
unlikely that you'd be able find a single-byte source
sequence in any encoding.
So this is certainly buggy code which should be fixed, but
it is probably not a useful attack vector.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When performing tag following, in addition to using the server's
"include-tag" capability to send tag objects (and emulating it if the
server does not support that capability), "git fetch" relies upon the
presence of refs/tags/* entries in the initial ref advertisement to
locally create refs pointing to the aforementioned tag objects. When
using protocol v2, refs/tags/* entries in the initial ref advertisement
may be suppressed by a ref-prefix argument, leading to the tag object
being downloaded, but the ref not being created.
Commit dcc73cf7ff ("fetch: generate ref-prefixes when using a configured
refspec", 2018-05-18) ensured that "refs/tags/" is always sent as a ref
prefix when "git fetch" is invoked with no refspecs, but not when "git
fetch" is invoked with refspecs. Extend that functionality to make it
work in both situations.
This also necessitates a change another test which tested ref
advertisement filtering using tag refs - since tag refs are sent by
default now, the test has been switched to using branch refs instead.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the protocol v2 tests to also test fetches with multiple refspecs
specified. This also covers the previously uncovered cases of fetching
with prefix matching and fetching by SHA-1.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes I want to remove only Coccinelle's results, but keep all
other build artifacts left after my usual 'make all man' build. This
new 'cocciclean' make target will allow just that.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Coccinelle outputs its suggested transformations as patches, whose
header looks something like this:
--- commit.c
+++ /tmp/cocci-output-19250-7ae78a-commit.c
Note the lack of 'diff --opts <old> <new>' line, the differing number
of path components on the --- and +++ lines, and the nonsensical
filename on the +++ line. 'patch -p0' can still apply these patches,
as it takes the filename to be modified from the --- line. Alas, 'git
apply' can't, because it takes the filename from the +++ line, and
then complains about the nonexisting file.
Pass the '--patch .' options to Coccinelle via the SPATCH_FLAGS 'make'
variable, as it seems to make it generate proper context diff patches,
with the header starting with a 'diff ...' line and containing sane
filenames. The resulting 'contrib/coccinelle/*.cocci.patch' files
then can be applied both with 'git apply' and 'patch' (even without
'-p0').
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sha1dc is an external library, that we carry in-tree for convenience
or grab as a submodule, so there is no use in applying our semantic
patches to its source files.
Therefore, exclude sha1dc's source files from Coccinelle's static
analysis.
This change also makes the static analysis somewhat faster: presumably
because of the heavy use of repetitive macro declarations, applying
the semantic patches 'array.cocci' and 'swap.cocci' to 'sha1dc/sha1.c'
takes over half a minute each on my machine, which amounts to about a
third of the runtime of applying these two semantic patches to the
whole git source tree.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The dependencies of the 'coccicheck' make target are listed with the
help of the $(patsubst) make function, which in this case doesn't do
any pattern substitution, but only adds the '.patch' suffix.
Use the shorter and more idiomatic $(addsuffix) make function instead.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'coccicheck' target doesn't create a file with the same name, so
mark it as .PHONY.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Regression tests are automated tests which try to ensure a specific
behavior. The idea is: if the test case fails, the behavior indicated in
the test case's title regressed.
If a regression test that fails, even occasionally, for any reason other
than to indicate the particular regression(s) it tries to catch, it is
less useful than when it really only fails when there is a bug in the
(non-test) code that needs to be fixed.
In the instance of the test case "submodule update --init --recursive
from subdirectory" of the script t7406-submodule-update.sh, the exact
output of a recursive clone is compared with a pre-generated one. And
this is a racy test because the structure of the submodules only
guarantees a *partial* order. The 'none' and the 'rebasing' submodules
*can* be cloned in any order, which means that a mismatch with the
hard-coded order does not necessarily indicate a bug in the tested code.
See for example:
https://git-for-windows.visualstudio.com/git/_build/results?buildId=14035&view=logs
To prevent such false positives from unnecessarily costing time when
investigating test failures, let's take the exact order of the lines out
of the equation by sorting them before comparing them.
This test script seems not to have any more test cases that try to
verify any specific order in which recursive clones process the
submodules, therefore this is the only test case that is changed in this
manner.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Coccinelle's and in turn 'make coccicheck's exit code only indicates
that Coccinelle managed to finish its analysis without any errors
(e.g. no unknown --options, no missing files, no syntax errors in the
semantic patches, etc.), but it doesn't indicate whether it found any
undesired code patterns to transform or not. To find out the latter,
one has to look closer at 'make coccicheck's standard output and look
for lines like:
SPATCH result: contrib/coccinelle/<something>.cocci.patch
And this only indicates that there is something to transform, but to
see what the suggested transformations are one has to actually look
into those '*.cocci.patch' files.
This makes the automated static analysis build job on Travis CI not
particularly useful, because it neither draws our attention to
Coccinelle's findings, nor shows the actual findings. Consequently,
new topics introducing undesired code patterns graduated to master
on several occasions without anyone noticing.
The only way to draw attention in such an automated setting is to fail
the build job. Therefore, modify the 'ci/run-static-analysis.sh'
build script to check all the resulting '*.cocci.patch' files, and
fail the build job if any of them turns out to be not empty. Include
those files' contents, i.e. Coccinelle's suggested transformations, in
the build job's trace log, so we'll know why it failed.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently the static analysis build job runs Coccinelle using a single
'make' job. Using two parallel jobs cuts down the build job's run
time from around 10-12mins to 6-7mins, sometimes even under 6mins
(there is quite large variation between build job runtimes). More
than two parallel jobs don't seem to bring further runtime benefits.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of these are straight forward. GETTEXT_POISON does catch the last
string in cmd_pack_objects(), but since this is --progress output, it's
not supposed to be machine-readable.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many messages will be marked for translation in the following
commits. This commit updates some of them to be more consistent and
reduce diff noise in those commits. Changes are
- keep the first letter of die(), error() and warning() in lowercase
- no full stop in die(), error() or warning() if it's single sentence
messages
- indentation
- some messages are turned to BUG(), or prefixed with "BUG:" and will
not be marked for i18n
- some messages are improved to give more information
- some messages are broken down by sentence to be i18n friendly
(on the same token, combine multiple warning() into one big string)
- the trailing \n is converted to printf_ln if possible, or deleted
if not redundant
- errno_errno() is used instead of explicit strerror()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's start with some background about oe_delta_size() and
oe_set_delta_size(). If you already know, skip the next paragraph.
These two are added in 0aca34e826 (pack-objects: shrink delta_size
field in struct object_entry - 2018-04-14) to help reduce 'struct
object_entry' size. The delta size field in this struct is reduced to
only contain max 1MB. So if any new delta is produced and larger than
1MB, it's dropped because we can't really save such a large size
anywhere. Fallback is provided in case existing packfiles already have
large deltas, then we can retrieve it from the pack.
While this should help small machines repacking large repos without
large deltas (i.e. less memory pressure), dropping large deltas during
the delta selection process could end up with worse pack files. And if
existing packfiles already have >1MB delta and pack-objects is
instructed to not reuse deltas, all of them will be dropped on the
floor, and the resulting pack would be definitely bigger.
There is also a regression in terms of CPU/IO if we have large on-disk
deltas because fallback code needs to parse the pack every time the
delta size is needed and just access to the mmap'd pack data is enough
for extra page faults when memory is under pressure.
Both of these issues were reported on the mailing list. Here's some
numbers for comparison.
Version Pack (MB) MaxRSS(kB) Time (s)
------- --------- ---------- --------
2.17.0 5498 43513628 2494.85
2.18.0 10531 40449596 4168.94
This patch provides a better fallback that is
- cheaper in terms of cpu and io because we won't have to read
existing pack files as much
- better in terms of pack size because the pack heuristics is back to
2.17.0 time, we do not drop large deltas at all
If we encounter any delta (on-disk or created during try_delta phase)
that is larger than the 1MB limit, we stop using delta_size_ field for
this because it can't contain such size anyway. A new array of delta
size is dynamically allocated and can hold all the deltas that 2.17.0
can. This array only contains delta sizes that delta_size_ can't
contain.
With this, we do not have to drop deltas in try_delta() anymore. Of
course the downside is we use slightly more memory, even compared to
2.17.0. But since this is considered an uncommon case, a bit more
memory consumption should not be a problem.
Delta size limit is also raised from 1MB to 16MB to better cover
common case and avoid that extra memory consumption (99.999% deltas in
this reported repo are under 12MB; Jeff noted binary artifacts topped
out at about 3MB in some other private repos). Other fields are
shuffled around to keep this struct packed tight. We don't use more
memory in common case even with this limit update.
A note about thread synchronization. Since this code can be run in
parallel during delta searching phase, we need a mutex. The realloc
part in packlist_alloc() is not protected because it only happens
during the object counting phase, which is always single-threaded.
Access to e->delta_size_ (and by extension
pack->delta_size[e - pack->objects]) is unprotected as before, the
thread scheduler in pack-objects must make sure "e" is never updated
by two different threads.
The area under the new lock is as small as possible, avoiding locking
at all in common case, since lock contention with high thread count
could be expensive (most blobs are small enough that delta compute
time is short and we end up taking the lock very often). The previous
attempt to always hold a lock in oe_delta_size() and
oe_set_delta_size() increases execution time by 33% when repacking
linux.git with with 40 threads.
Reported-by: Elijah Newren <newren@gmail.com>
Helped-by: Elijah Newren <newren@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running the same reproduction script as the previous patch,
it turns out the stack is too small, which can be easily avoided.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach clone to send a list of ref-prefixes, when using protocol v2, to
allow the server to filter out irrelevant references from the
ref-advertisement. This reduces wasted time and bandwidth when cloning
repositories with a larger number of references.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The five new tests added to 't9300-fast-import.sh' in 30e215a65c
(fast-import: checkpoint: dump branches/tags/marks even if
object_count==0, 2017-09-28), all with the prefix "V:" in their test
description, run 'git fast-import' in the background and then 'kill'
it as part of a 'test_when_finished' cleanup command. When this test
script is executed with Bash, some or even all of these tests tend to
pollute the test script's stderr, and messages about terminated
processes end up on the terminal:
$ bash ./t9300-fast-import.sh
<... snip ...>
ok 179 - V: checkpoint helper does not get stuck with extra output
/<...>/test-lib-functions.sh: line 388: 28383 Terminated git fast-import $options 0<&8 1>&9
ok 180 - V: checkpoint updates refs after reset
./t9300-fast-import.sh: line 3210: 28401 Terminated git fast-import $options 0<&8 1>&9
ok 181 - V: checkpoint updates refs and marks after commit
ok 182 - V: checkpoint updates refs and marks after commit (no new objects)
./test-lib.sh: line 634: line 3250: 28485 Terminated git fast-import $options 0<&8 1>&9
ok 183 - V: checkpoint updates tags after tag
./t9300-fast-import.sh: line 3264: 28510 Terminated git fast-import $options 0<&8 1>&9
After a background child process terminates, its parent Bash process
always outputs a message like those above to stderr, even when in
non-interactive mode.
But how do some of these messages end up on the test script's stderr,
why don't we get them from all five tests, and why do they come from
different file/line locations? Well, after sending the TERM signal to
the background child process, it takes a little while until that
process receives the signal and terminates, and then it takes another
while until the parent process notices it. During this time the
parent Bash process is continuing execution, and by the time it
notices that its child terminated it might have already left
'test_eval_inner_' and its stderr is not redirected to /dev/null
anymore. That's why such a message can appear on the test script's
stderr, while other times, when the child terminates fast and/or the
parent shell is slow enough, the message ends up in /dev/null, just
like any other output of the test does. Bash always adds the file
name and line number of the code location it was about to execute when
it notices the termination of its child process as a prefix to that
message, hence the varying and sometimes totally unrelated location
prefixes in those messages (e.g. line 388 in 'test-lib-functions.sh'
is 'test_verify_prereq', and I saw such a message pointing to
'say_color' as well).
Prevent these messages from appearing on the test script's stderr by
'wait'-ing on the pid of the background 'git fast-import' process
after sending it the TERM signal. This ensures that the executing
shell's stderr is still redirected when the shell notices the
termination of its child process in the background, and that these
messages get a consistent file/line location prefix.
Note that this is not an issue when the test script is run with Bash
and '-v', because then these messages are supposed to go to the test
script's stderr anyway, and indeed all of them do; though the
sometimes seemingly random file/line prefixes could be confusing
still. Similarly, it's not an issue with Bash and '--verbose-log'
either, because then all messages go to the log file as they should.
Finally, it's not an issue with some other shells (I tried dash, ksh,
ksh93 and mksh) even without any of the verbose options, because they
don't print messages like these in non-interactive mode in the first
place.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add test cases to cover the new X509/gpgsm support. Most of them
resemble existing ones. They just switch the format to x509 and set the
signingkey when creating signatures. Validation of signatures does not
need any configuration of git, it does need gpgsm to be configured to
trust the key(-chain).
Several of the testcases build on top of existing gpg testcases.
The commit ships a self-signed key for committer@example.com and
configures gpgsm to trust it.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes a memory issue when recursing a lot, which can be reproduced as
seq 1 100000 >one
seq 1 4 100000 >two
git diff --no-index --histogram one two
Before this patch, histogram_diff would call itself recursively before
calling free_index, which would mean a lot of memory is allocated during
the recursion and only freed afterwards. By moving the memory allocation
(and its free call) into find_lcs, the memory is free'd before we recurse,
such that memory is reused in the next step of the recursion instead of
using new memory.
This addresses only the memory pressure, not the run time complexity,
that is also awful for the corner case outlined above.
Helpful in understanding the code (in addition to the sparse history of
this file), was https://stackoverflow.com/a/32367597 which reproduces
most of the code comments of the JGit implementation.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will be useful in the next patch as we'll introduce multiple
callers.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By passing the 'xpp' and 'env' argument directly to the function
'fall_back_to_classic_diff', we eliminate an occurrence of the 'index'
in histogram_diff, which will prove useful in a bit.
While at it, move it up in the file. This will make the diff of
one of the next patches more legible.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option of --color-moved has proven to be useful as observed on the
mailing list. However when refactoring sometimes the indentation changes,
for example when partitioning a functions into smaller helper functions
the code usually mostly moved around except for a decrease in indentation.
To just review the moved code ignoring the change in indentation, a mode
to ignore spaces in the move detection as implemented in a previous patch
would be enough. However the whole move coloring as motivated in commit
2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought
up the notion of the reviewer being able to trust the move of a "block".
As there are languages such as python, which depend on proper relative
indentation for the control flow of the program, ignoring any white space
change in a block would not uphold the promises of 2e2d5ac that allows
reviewers to pay less attention to the inside of a block, as inside
the reviewer wants to assume the same program flow.
This new mode of white space ignorance will take this into account and will
only allow the same white space changes per line in each block. This patch
even allows only for the same change at the beginning of the lines.
As this is a white space mode, it is made exclusive to other white space
modes in the move detection.
This patch brings some challenges, related to the detection of blocks.
We need a wide net to catch the possible moved lines, but then need to
narrow down to check if the blocks are still intact. Consider this
example (ignoring block sizes):
- A
- B
- C
+ A
+ B
+ C
At the beginning of a block when checking if there is a counterpart
for A, we have to ignore all space changes. However at the following
lines we have to check if the indent change stayed the same.
Checking if the indentation change did stay the same, is done by computing
the indentation change by the difference in line length, and then assume
the change is only in the beginning of the longer line, the common tail
is the same. That is why the test contains lines like:
- <TAB> A
...
+ A <TAB>
...
As the first line starting a block is caught using a compare function that
ignores white spaces unlike the rest of the block, where the white space
delta is taken into account for the comparison, we also have to think about
the following situation:
- A
- B
- A
- B
+ A
+ B
+ A
+ B
When checking if the first A (both in the + and - lines) is a start of
a block, we have to check all 'A' and record all the white space deltas
such that we can find the example above to be just one block that is
indented.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can already disable replace refs using a command line
option or environment variable, but those are awkward to
apply universally. Let's add a config option to do the same
thing.
That raises the question of why one might want to do so
universally. The answer is that replace refs violate the
immutability of objects. For instance, if you wanted to
cache the diff between commit XYZ and its parent, then in
theory that never changes; the hash XYZ represents the total
state. But replace refs violate that; pushing up a new ref
may create a completely new diff.
The obvious "if it hurts, don't do it" answer is not to
create replace refs if you're doing this kind of caching.
But for a site hosting arbitrary repositories, they may want
to allow users to share replace refs with each other, but
not actually respect them on the site (because the caching
is more important than the replace feature).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was added as a NEEDSWORK in c3c36d7de2 (replace-object:
check_replace_refs is safe in multi repo environment, 2018-04-11),
waiting for a calmer period. Since doing so now doesn't conflict
with anything in 'pu', it seems as good a time as any.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit afc711b8e1 (rename read_replace_refs to
check_replace_refs, 2014-02-18) added a comment mentioning
that check_replace_refs is set in two ways:
- from user intent via --no-replace-objects, etc
- after seeing there are no replace refs to respect
Since c3c36d7de2 (replace-object: check_replace_refs is safe
in multi repo environment, 2018-04-11) the second is no
longer true. Let's drop that part of the comment.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify that setting core.ignoreCase to deviate from reality would
not turn a case-incapable filesystem into a case-capable one.
* ms/core-icase-doc:
Documentation: declare "core.ignoreCase" as internal variable
* en/rebase-i-microfixes:
git-rebase--merge: modernize "git-$cmd" to "git $cmd"
Fix use of strategy options with interactive rebases
t3418: add testcase showing problems with rebase -i and strategy options
"git filter-branch" when used with the "--state-branch" option
still attempted to rewrite the commits whose filtered result is
known from the previous attempt (which is recorded on the state
branch); the command has been corrected not to waste cycles doing
so.
* mb/filter-branch-optim:
filter-branch: skip commits present on --state-branch
Tighten the API to make it harder to misuse in-tree .gitmodules
file, even though it shares the same syntax with configuration
files, to read random configuration items from it.
* ao/config-from-gitmodules:
submodule-config: reuse config_from_gitmodules in repo_read_gitmodules
submodule-config: pass repository as argument to config_from_gitmodules
submodule-config: make 'config_from_gitmodules' private
submodule-config: add helper to get 'update-clone' config from .gitmodules
submodule-config: add helper function to get 'fetch' config from .gitmodules
config: move config_from_gitmodules to submodule-config.c
The "-l" option in "git branch -l" is an unfortunate short-hand for
"--create-reflog", but many users, both old and new, somehow expect
it to be something else, perhaps "--list". This step warns when "-l"
is used as a short-hand for "--create-reflog" and warns about the
future repurposing of the it when it is used.
* jk/branch-l-0-deprecation:
branch: deprecate "-l" option
t: switch "branch -l" to "branch --create-reflog"
t3200: unset core.logallrefupdates when testing reflog creation
"git grep" learned the "--column" option that gives not just the
line number but the column number of the hit.
* tb/grep-column:
contrib/git-jump/git-jump: jump to exact location
grep.c: add configuration variables to show matched option
builtin/grep.c: add '--column' option to 'git-grep(1)'
grep.c: display column number of first match
grep.[ch]: extend grep_opt to allow showing matched column
grep.c: expose {,inverted} match column in match_line()
Documentation/config.txt: camel-case lineNumber for consistency
The effort to move globals to per-repository in-core structure
continues.
* jt/remove-pack-bitmap-global:
pack-bitmap: add free function
pack-bitmap: remove bitmap_git global variable
Recently added "--base" option to "git format-patch" command did
not correctly generate prereq patch ids.
* xy/format-patch-prereq-patch-id-fix:
format-patch: clear UNINTERESTING flag before prepare_bases
Bugfix for "rebase -i" corner case regression.
* pw/rebase-i-keep-reword-after-conflict:
sequencer: do not squash 'reword' commits when we hit conflicts
Code preparation to make "git p4" closer to be usable with Python 3.
* ld/p423:
git-p4: python3: fix octal constants
git-p4: python3: use print() function
git-p4: python3: basestring workaround
git-p4: python3: remove backticks
git-p4: python3: replace dict.has_key(k) with "k in dict"
git-p4: python3: replace <> with !=
"git submodule" did not correctly adjust core.worktree setting that
indicates whether/where a submodule repository has its associated
working tree across various state transitions, which has been
corrected.
* sb/submodule-core-worktree:
submodule deinit: unset core.worktree
submodule: ensure core.worktree is set after update
submodule: unset core.worktree if no working tree is present
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* sb/object-store-grafts:
commit: allow lookup_commit_graft to handle arbitrary repositories
commit: allow prepare_commit_graft to handle arbitrary repositories
shallow: migrate shallow information into the object parser
path.c: migrate global git_path_* to take a repository argument
cache: convert get_graft_file to handle arbitrary repositories
commit: convert read_graft_file to handle arbitrary repositories
commit: convert register_commit_graft to handle arbitrary repositories
commit: convert commit_graft_pos() to handle arbitrary repositories
shallow: add repository argument to is_repository_shallow
shallow: add repository argument to check_shallow_file_for_update
shallow: add repository argument to register_shallow
shallow: add repository argument to set_alternate_shallow_file
commit: add repository argument to lookup_commit_graft
commit: add repository argument to prepare_commit_graft
commit: add repository argument to read_graft_file
commit: add repository argument to register_commit_graft
commit: add repository argument to commit_graft_pos
object: move grafts to object parser
object-store: move object access functions to object-store.h
Add missing colon in two places to fix formatting of options.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit allows git to create and check x509 type signatures using
gpgsm.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Supporting multiple signing formats we will have the need to configure a
custom program each. Add a new config value to cater for that.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gnupg does print the keyid followed by a space and the signer comes
next. The same pattern is also used in gpgsm, but there the key length
would be 40 instead of 16. Instead of hardcoding the expected length,
find the first space and calculate it.
Input that does not match the expected format will be ignored now,
before we jumped to found+17 which might have been behind the end of an
unexpected string.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a struct that holds the format details for the supported formats.
At the moment that is still just "openpgp". This commit prepares for the
introduction of more formats, that might use other programs and match
other signatures.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a struct repository argument to the functions in commit-graph.h that
read the commit graph. (This commit does not affect functions that write
commit graphs.)
Because the commit graph functions can now read the commit graph of any
repository, the global variable core_commit_graph has been removed.
Instead, the config option core.commitGraph is now read on the first
time in a repository that a commit is attempted to be parsed using its
commit graph.
This commit includes a test that exercises the functionality on an
arbitrary repository that is not the_repository.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of storing commit graphs in static variables, store them in
struct object_store. There are no changes to the signatures of existing
functions - they all still only support the_repository, and support for
other instances of struct repository will be added in a subsequent
commit.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two functions in the code (1) check if the repository is configured for
commit graphs, (2) call prepare_commit_graph(), and (3) check if the
graph exists. Move (1) and (3) into prepare_commit_graph(), reducing
duplication of code.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use oid_object_info_extended() to get object info instead of
read_object_file().
It will help to handle some requests faster (e.g., we do not need to
parse whole object if we need to know %(objectsize)).
It could also help us to add new atoms such as %(objectsize:disk)
and %(deltabase).
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Inline get_obj(): it would be easier to edit the code
without this split.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Initialize variable `eaten` before its using. We may initialize it in
parse_object_buffer(), but there are cases when we do not reach this
invocation.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Atoms like "align" or "end" do not have string representation.
Earlier we had to go and parse whole object with a hope that we
could fill their string representations. It's easier to fill them
with an empty string before we start to work with whole object.
It is important to mention that we fill only these atoms that must
contain nothing. So, if we could not fill the atom because, for example,
the object is missing, we leave it with NULL.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the source of object data to prevent parsing of unneeded data.
The goal is to improve performance by avoiding calling expensive
functions when we don't need the information they provide
or when we could get it by using a cheaper function.
It is stored in valid_atoms because it depends on the atoms we are
interested in.
Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add "gpg.format" where the user can specify which type of signature to
use for commits. At the moment only "openpgp" is supported and the value is
not even used. This commit prepares for a new types of signatures.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This moves the part of code that checks if we're still in a block
into its own function. We'll need a different approach on advancing
the blocks in a later patch, so having it as a separate function will
prove useful.
While at it rename the variable `p` to `prev` to indicate that it refers
to the previous line. This is as pmb[i] was assigned in the last iteration
of the outmost for loop.
Further rename `pnext` to `cur` to indicate that this should match up with
the current line of the outmost for loop.
Also replace the advancement of pmb[i] to reuse `cur` instead of
using `p->next` (which is how the name for pnext could be explained.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the original implementation of the move detection logic the choice for
ignoring white space changes is the same for the move detection as it is
for the regular diff. Some cases came up where different treatment would
have been nice.
Allow the user to specify that white space should be ignored differently
during detection of moved lines than during generation of added and removed
lines. This is done by providing analogs to the --ignore-space-at-eol,
-b, and -w options by introducing the option --color-moved-ws=<modes>
with the modes named "ignore-space-at-eol", "ignore-space-change" and
"ignore-all-space", which is used only during the move detection phase.
As we change the default, we'll adjust the tests.
For now we do not infer any options to treat white spaces in the move
detection from the generic white space options given to diff.
This can be tuned later to reasonable default.
As we plan on adding more white space related options in a later patch,
that interferes with the current white space options, use a flag field
and clamp it down to XDF_WHITESPACE_FLAGS, as that (a) allows to easily
check at parse time if we give invalid combinations and (b) can reuse
parts of this patch.
By having the white space treatment in its own option, we'll also
make it easier for a later patch to have an config option for
spaces in the move detection.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we initialize the hashmap, we give it a pointer to the
diff_options, which it then passes along to each call of the
hashmap_cmp_fn function. There's no need to pass it a second time as
the "keydata" parameter, and our comparison functions never look at
keydata.
This was a mistake left over from an earlier round of 2e2d5ac184
(diff.c: color moved lines differently, 2017-06-30), before hashmap
learned to pass the data pointer for us.
Explanation-by: Jeff King <peff@peff.net>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t4015 we have a pattern of
git diff [<options, related to color>] |
grep -v "index" |
test_decode_color >actual &&
to produce output that we want to test against. This pattern was introduced
in 86b452e276 (diff.c: add dimming to moved line detection, 2017-06-30)
as then the focus on getting the colors right. However the pattern used
is not best practice as we do care about the exit code of Git. So let's
not have Git as the upstream of a pipe. Piping the output of grep to
some function is fine as we assume grep to be un-flawed in our test suite.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no need to forward-declare these functions, as they are used
after their implementation only.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These flags were there since the beginning (3443546f6e (Use a *real*
built-in diff generator, 2006-03-24), but were never used. Remove them.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection. The
heuristics handle a range of stylistic variations in existing tests
(evolved over the years), however, they are still best-guesses. As such,
it is possible for future changes to accidentally break assumptions upon
which the heuristics are based. Protect against this possibility by
adding tests which check the linter itself for correctness.
In addition to protecting against regressions, these tests help document
(for humans) expected behavior, which is important since the linter's
implementation language ('sed') does not necessarily lend itself to easy
comprehension.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection. The
heuristics handle a range of stylistic variations in existing tests
(evolved over the years), however, they are still best-guesses. As such,
it is possible for future changes to accidentally break assumptions upon
which the heuristics are based. Protect against this possibility by
adding tests which check the linter itself for correctness.
In addition to protecting against regressions, these tests help document
(for humans) expected behavior, which is important since the linter's
implementation language ('sed') does not necessarily lend itself to easy
comprehension.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection. The
heuristics handle a range of stylistic variations in existing tests
(evolved over the years), however, they are still best-guesses. As such,
it is possible for future changes to accidentally break assumptions upon
which the heuristics are based. Protect against this possibility by
adding tests which check the linter itself for correctness.
In addition to protecting against regressions, these tests help document
(for humans) expected behavior, which is important since the linter's
implementation language ('sed') does not necessarily lend itself to easy
comprehension.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection. The
heuristics handle a range of stylistic variations in existing tests
(evolved over the years), however, they are still best-guesses. As such,
it is possible for future changes to accidentally break assumptions upon
which the heuristics are based. Protect against this possibility by
adding tests which check the linter itself for correctness.
In addition to protecting against regressions, these tests help document
(for humans) expected behavior, which is important since the linter's
implementation language ('sed') does not necessarily lend itself to easy
comprehension.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection. The
heuristics handle a range of stylistic variations in existing tests
(evolved over the years), however, they are still best-guesses. As such,
it is possible for future changes to accidentally break assumptions upon
which the heuristics are based. Protect against this possibility by
adding tests which check the linter itself for correctness.
In addition to protecting against regressions, these tests help document
(for humans) expected behavior, which is important since the linter's
implementation language ('sed') does not necessarily lend itself to easy
comprehension.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection. The
heuristics handle a range of stylistic variations in existing tests
(evolved over the years), however, they are still best-guesses. As such,
it is possible for future changes to accidentally break assumptions upon
which the heuristics are based. Protect against this possibility by
adding tests which check the linter itself for correctness.
In addition to protecting against regressions, these tests help document
(for humans) expected behavior, which is important since the linter's
implementation language ('sed') does not necessarily lend itself to easy
comprehension.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection. The
heuristics handle a range of stylistic variations in existing tests
(evolved over the years), however, they are still best-guesses. As such,
it is possible for future changes to accidentally break assumptions upon
which the heuristics are based. Protect against this possibility by
adding tests which check the linter itself for correctness.
In addition to protecting against regressions, these tests help document
(for humans) expected behavior, which is important since the linter's
implementation language ('sed') does not necessarily lend itself to easy
comprehension.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection. The
heuristics handle a range of stylistic variations in existing tests
(evolved over the years), however, they are still best-guesses. As such,
it is possible for future changes to accidentally break assumptions upon
which the heuristics are based. Protect against this possibility by
adding tests which check the linter itself for correctness.
In addition to protecting against regressions, these tests help document
(for humans) expected behavior, which is important since the linter's
implementation language ('sed') does not necessarily lend itself to easy
comprehension.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --chain-lint option uses heuristics and knowledge of shell syntax to
detect broken &&-chains in subshells by pure textual inspection.
Although the heuristics work well, they are still best-guesses and
future changes could accidentally break assumptions upon which they are
based. To protect against this possibility, tests checking correctness
of the linter itself will be added. As preparation, add a new makefile
"check-chainlint" target and associated machinery.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --chain-lint option detects broken &&-chains by forcing the test to
exit early (as the very first step) with a sentinel value. If that
sentinel is the test's overall exit code, then the &&-chain is intact;
if not, then the chain is broken. Unfortunately, this detection does not
extend to &&-chains within subshells even when the subshell itself is
properly linked into the outer &&-chain.
Address this shortcoming by feeding the body of the test to a
lightweight "linter" which can peer inside subshells and identify broken
&&-chains by pure textual inspection. Although the linter does not
actually parse shell scripts, it has enough knowledge of shell syntax to
reliably deal with formatting style variations (as evolved over the
years) and to avoid being fooled by non-shell content (such as inside
here-docs and multi-line strings). It recognizes modern subshell
formatting:
statement1 &&
(
statement2 &&
statement3
) &&
statement4
as well as old-style:
statement1 &&
(statement2 &&
statement3) &&
statement4
Heuristics are employed to properly identify the extent of a subshell
formatted in the old-style since a number of legitimate constructs may
superficially appear to close the subshell even though they don't. For
example, it understands that neither "x=$(command)" nor "case $x in *)"
end a subshell, despite the ")" at the end of line.
Due to limitations of the tool used ('sed') and its inherent
line-by-line processing, only subshells one level deep are handled, as
well as one-liner subshells one level below that. Subshells deeper than
that or multi-line subshells at level two are passed through as-is, thus
&&-chains in their bodies are not checked.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The variable smtp_encryption must keep it's value between two batches.
Otherwise the authentication is skipped after the first batch.
Signed-off-by: Jules Maselbas <jules.maselbas@grenoble-inp.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One-shot environment variable assignments, such as 'FOO' in
"FOO=bar cmd", exist only during the invocation of 'cmd'. However, if
'cmd' happens to be a shell function, then 'FOO' is assigned in the
executing shell itself, and that assignment remains until the process
exits (unless explicitly unset). Since this side-effect of
"FOO=bar shell_func" is unlikely to be intentional, detect and report
such usage.
To distinguish shell functions from other commands, perform a pre-scan
of shell scripts named as input, gleaning a list of function names by
recognizing lines of the form (loosely matching whitespace):
shell_func () {
and later report suspect lines of the form (loosely matching quoted
values):
FOO=bar [BAR=foo ...] shell_func
Also take care to stitch together incomplete lines (those ending with
"\") since suspect invocations may be split over multiple lines:
FOO=bar BAR=foo \
shell_func
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Error messages emitted by this linting script are long and noisy,
consisting of several sections:
<test-script>:<line#>: error: <explanation>: <failed-shell-text>
The line of failed shell text, usually coming from within a test body,
is often indented by one or two TABs, with the result that the actual
(important) text is separated from <explanation> by a good deal of empty
space. This can make for a difficult read, especially on typical
80-column terminals.
Make the messages more compact and perhaps a bit easier to digest by
folding out the leading whitespace from <failed-shell-text>.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Error messages emitted by this linting script are long and noisy,
consisting of several sections:
<test-script>:<line#>: error: <explanation>: <failed-shell-text>
Many problem explanations ask the reader to "please" use a suggested
alternative, however, such politeness is unnecessary and just adds to
the noise and length of the line, so drop "please" to make the message a
bit more concise.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unlike "FOO=bar cmd" one-shot environment variable assignments
which exist only for the invocation of 'cmd', those assigned by
"FOO=bar shell_func" exist within the running shell and continue to
do so until the process exits (or are explicitly unset). It is
unlikely that this behavior was intended by the test author.
In these particular tests, the "FOO=bar shell_func" invocations are
already in subshells, so the assignments don't last too long, don't
appear to harm subsequent commands in the same subshells, and don't
affect other tests in the same scripts, however, the usage is
nevertheless misleading and poor practice, so fix the tests to assign
and export the environment variables in the usual fashion.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new negotiation algorithm used during fetch that skips
commits in an effort to find common ancestors faster. The skips grow
similarly to the Fibonacci sequence as the commit walk proceeds further
away from the tips. The skips may cause unnecessary commits to be
included in the packfile, but the negotiation step typically ends more
quickly.
Usage of this algorithm is guarded behind the configuration flag
fetch.negotiationAlgorithm.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test uses a convoluted method to verify that "p4 help" errors
out when asked for help about an unknown command. In doing so, it
intentionally breaks the &&-chain. Simplify by employing the typical
"! command" idiom and a normal &&-chain instead.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test has been dysfunctional since it was added by 6489660b4b
(send-email: support validate hook, 2017-05-12), however, the problem
went unnoticed due to a broken &&-chain late in the test.
The test wants to verify that a non-zero exit code from the
'sendemail-validate' hook causes git-send-email to abort with a
particular error message. A command which is expected to fail should be
run with 'test_must_fail', however, the test neglects to do so.
Fix this problem, as well as the broken &&-chain behind which the
problem hid.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test manually checks the exit code of git-grep for a particular
value. In doing so, it intentionally breaks the &&-chain. Modernize the
test by taking advantage of test_expect_code() and a normal &&-chain.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test has been dysfunctional since it was added by 619acfc78c
(submodule add: extend force flag to add existing repos, 2016-10-06),
however, two problems early in the test went unnoticed due to a broken
&&-chain later in the test.
First, it tries configuring the submodule with repository "bogus-url",
however, "git submodule add" insists that the repository be either an
absolute URL or a relative pathname requiring prefix "./" or "../" (this
is true even with --force), but "bogus-url" does not meet those
criteria, thus the command fails.
Second, it then tries configuring a submodule with a path which is
.gitignore'd, which is disallowed. This restriction can be overridden
with --force, but the test neglects to use that option.
Fix both problems, as well as the broken &&-chain behind which they hid.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 60f487ac0e (Remove common-cmds.h, 2018-05-10), we forgot to adjust
this README when removing the common-cmds.h file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch several hard-coded constants into expressions based either on
GIT_MAX_HEXSZ or the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert one use of 20 and several uses of GIT_SHA1_HEXSZ into references
to the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use GIT_MAX_HEXSZ instead of GIT_SHA1_HEXSZ for an allocation so that it
is sufficiently large. Switch a comparison to use the_hash_algo to
determine the length of a hex object ID.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert several uses of GIT_SHA1_HEXSZ into references to the_hash_algo.
Switch other uses into a use of parse_oid_hex and uses of its computed
pointer.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch from using get_oid_hex to parse_oid_hex to simplify pointer
operations and avoid the need for a hash-related constant.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch from using GIT_SHA1_HEXSZ to the_hash_algo to make the parsing of
the index information hash independent.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to ensure we write the correct amount, use the_hash_algo to
find the correct number of bytes for the current hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to be sure we have enough space to use with any hash algorithm,
use GIT_MAX_HEXSZ to allocate space.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Specify these constants in terms of the size of the hash algorithm
currently in use.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using the GIT_SHA1_* constants, switch to using the_hash_algo
to convert object IDs to and from hex format.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the hard-coded 20-based values and replace them with uses of
the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of our code has been converted to use struct object_id for object
IDs. However, there are some places that still have not, and there are
a variety of places that compare equivalently sized hashes that are not
object IDs. All of these hashes are artifacts of the internal hash
algorithm in use, and when we switch to NewHash for object storage, all
of these uses will also switch.
Update the hashcpy, hashclr, and hashcmp functions to use the_hash_algo,
since they are used in a variety of places to copy and manipulate
buffers that need to move data into or out of struct object_id. This
has the effect of making the corresponding oid* functions use
the_hash_algo as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our color buffers are all COLOR_MAXLEN, which fits the
largest possible color. So we can never overflow the buffer
by copying an existing color. However, using strcpy() makes
it harder to audit the code-base for calls that _are_
problems. We should use something like xsnprintf(), which
shows the reader that we expect this never to fail (and
provides a run-time assertion if it does, just in case).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add "struct json_writer" and a series of jw_ routines to compose JSON
data into a string buffer. The resulting string may then be printed by
commands wanting to support a JSON-like output format.
The json_writer is limited to correctly formatting structured data for
output. It does not attempt to build an object model of the JSON data.
We say "JSON-like" because we do not enforce the Unicode (usually UTF-8)
requirement on string fields. Internally, Git does not necessarily have
Unicode/UTF-8 data for most fields, so it is currently unclear the best
way to enforce that requirement. For example, on Linux pathnames can
contain arbitrary 8-bit character data, so a command like "status" would
not know how to encode the reported pathnames. We may want to revisit
this (or double encode such strings) in the future.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Wink Saville <wink@saville.com>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
deref_tag() prints a warning if the object that a tag refers to does not
exist. However, when a partial clone has an annotated tag from its
promisor remote, but not the object that it refers to, printing a
warning on such a tag is incorrect.
This occurs, for example, when the checkout that happens after a partial
clone causes some objects to be fetched - and as part of the fetch, all
local refs are read. The test included in this patch demonstrates this
situation.
Therefore, do not print a warning in this case.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In handle_commit(), it is fatal for an annotated tag to point to a
non-existent object. --exclude-promisor-objects should relax this rule
and allow non-existent objects that are promisor objects, but this is
not the case. Update handle_commit() to tolerate this situation.
This was observed when cloning from a repository with an annotated tag
pointing to a blob. The test included in this patch demonstrates this
case.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sequencer currently passes GIT_DIR, but not GIT_WORK_TREE, to exec
commands. In that configuration, we assume that whatever directory
we're in is the top level of the work tree, and git rev-parse
--show-toplevel responds accordingly. However, when we're in a
subdirectory, that isn't correct: we respond with the subdirectory as
the top level, resulting in unexpected behavior.
Ensure that we pass GIT_WORK_TREE as well as GIT_DIR so that git
operations within subdirectories work correctly.
Note that we are guaranteed to have a work tree in this case: the
relevant sequencer functions are called only from revert, cherry-pick,
and rebase--helper; all of these commands require a working tree.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the configured comment character when generating comments about
branches in a todo list. Failure to honor this configuration causes a
failure to parse the resulting todo list.
Setting core.commentChar to "auto" will not be honored here, and the
previously configured or default value will be used instead. But, since
the todo list will consist of only generated content, there should not
be any non-comment lines beginning with that character.
Signed-off-by: Aaron Schrab <aaron@schrab.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We added an fsck check in ed8b10f631 (fsck: check
.gitmodules content, 2018-05-02) as a defense against the
vulnerability from 0383bbb901 (submodule-config: verify
submodule names as paths, 2018-04-30). With the idea that
up-to-date hosting sites could protect downstream unpatched
clients that fetch from them.
As part of that defense, we reject any ".gitmodules" entry
that is not syntactically valid. The theory is that if we
cannot even parse the file, we cannot accurately check it
for vulnerabilities. And anybody with a broken .gitmodules
file would eventually want to know anyway.
But there are a few reasons this is a bad tradeoff in
practice:
- for this particular vulnerability, the client has to be
able to parse the file. So you cannot sneak an attack
through using a broken file, assuming the config parsers
for the process running fsck and the eventual victim are
functionally equivalent.
- a broken .gitmodules file is not necessarily a problem.
Our fsck check detects .gitmodules in _any_ tree, not
just at the root. And the presence of a .gitmodules file
does not necessarily mean it will be used; you'd have to
also have gitlinks in the tree. The cgit repository, for
example, has a file named .gitmodules from a
pre-submodule attempt at sharing code, but does not
actually have any gitlinks.
- when the fsck check is used to reject a push, it's often
hard to work around. The pusher may not have full control
over the destination repository (e.g., if it's on a
hosting server, they may need to contact the hosting
site's support). And the broken .gitmodules may be too
far back in history for rewriting to be feasible (again,
this is an issue for cgit).
So we're being unnecessarily restrictive without actually
improving the security in a meaningful way. It would be more
convenient to downgrade this check to "info", which means
we'd still comment on it, but not reject a push. Site admins
can already do this via config, but we should ship sensible
defaults.
There are a few counterpoints to consider in favor of
keeping the check as an error:
- the first point above assumes that the config parsers for
the victim and the fsck process are equivalent. This is
pretty true now, but as time goes on will become less so.
Hosting sites are likely to upgrade their version of Git,
whereas vulnerable clients will be stagnant (if they did
upgrade, they'd cease to be vulnerable!). So in theory we
may see drift over time between what two config parsers
will accept.
In practice, this is probably OK. The config format is
pretty established at this point and shouldn't change a
lot. And the farther we get from the announcement of the
vulnerability, the less interesting this extra layer of
protection becomes. I.e., it was _most_ valuable on day
0, when everybody's client was still vulnerable and
hosting sites could protect people. But as time goes on
and people upgrade, the population of vulnerable clients
becomes smaller and smaller.
- In theory this could protect us from other
vulnerabilities in the future. E.g., .gitmodules are the
only way for a malicious repository to feed data to the
config parser, so this check could similarly protect
clients from a future (to-be-found) bug there.
But that's trading a hypothetical case for real-world
pain today. If we do find such a bug, the hosting site
would need to be updated to fix it, too. At which point
we could figure out whether it's possible to detect
_just_ the malicious case without hurting existing
broken-but-not-evil cases.
- Until recently, we hadn't made any restrictions on
.gitmodules content. So now in tightening that we're
hitting cases where certain things used to work, but
don't anymore. There's some moderate pain now. But as
time goes on, we'll see more (and more varied) cases that
will make tightening harder in the future. So there's
some argument for putting rules in place _now_, before
users grow more cases that violate them.
Again, this is trading pain now for hypothetical benefit
in the future. And if we try hard in the future to keep
our tightening to a minimum (i.e., rejecting true
maliciousness without hurting broken-but-not-evil repos),
then that reduces even the hypothetical benefit.
Considering both sets of arguments, it makes sense to loosen
this check for now.
Note that we have to tweak the test in t7415 since fsck will
no longer consider this a fatal error. But we still check
that it reports the warning, and that we don't get the
spurious error from the config code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since ed8b10f631 (fsck: check .gitmodules content,
2018-05-02), we'll report a gitmodulesParse error for two
conditions:
- a .gitmodules entry is not syntactically valid
- a .gitmodules entry is larger than core.bigFileThreshold
with the intent that we can detect malicious files and
protect downstream clients. E.g., from the issue in
0383bbb901 (submodule-config: verify submodule names as
paths, 2018-04-30).
But these conditions are actually quite different with
respect to that bug:
- a syntactically invalid file cannot trigger the problem,
as the victim would barf before hitting the problematic
code
- a too-big .gitmodules _can_ trigger the problem. Even
though it is obviously silly to have a 500MB .gitmodules
file, the submodule code will happily parse it if you
have enough memory.
So it may be reasonable to configure their severity
separately. Let's add a new class for the "too large" case
to allow that.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent patch series renamed the get_commit_tree_from_graph method but
forgot to update the coccinelle script that exempted it from rules
regarding accesses to 'maybe_tree'. This fixes that oversight to bring
the coccinelle scripts back to a good state.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bash may take it happily but running test with dash reveals a breakage.
This was not discovered for a long time as no tests after this test
depended on GIT_AUTHOR_NAME to be reverted correctly back to the
original value after this step is done.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, if a tool calls SetCurrentDirectory with a lower case drive
letter, the subsequent call to GetCurrentDirectory will return the same
lower case drive letter. Powershell, for example, does not normalize the
path. If that happens, test-drop-caches will error out as it does not
correctly to handle lower case drive letters.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch broadens the set of commits matched by ":/<pattern>" to
include commits reachable from HEAD but not any named ref. This avoids
surprising behavior when working with a detached HEAD and trying to
refer to a commit that was recently created and only exists within the
detached state.
If multiple worktrees exist, only the current worktree's HEAD is
considered reachable. This is consistent with the existing behavior for
other per-worktree refs: e.g., bisect refs are considered reachable, but
only within the relevant worktree.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: William Chargin <wchargin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The last test of 't5561-http-backend.sh', 'server request log matches
test results' may fail occasionally, because the order of entries in
Apache's access log doesn't match the order of requests sent in the
previous tests, although all the right requests are there. I saw it
fail on Travis CI five times in the span of about half a year, when
the order of two subsequent requests was flipped, and could trigger
the failure with a modified Git. However, I was unable to trigger it
with stock Git on my machine. Three tests in
't5541-http-push-smart.sh' and 't5551-http-fetch-smart.sh' check
requests in the log the same way, so they might be prone to a similar
occasional failure as well.
When a test sends a HTTP request, it can continue execution after
'git-http-backend' fulfilled that request, but Apache writes the
corresponding access log entry only after 'git-http-backend' exited.
Some time inevitably passes between fulfilling the request and writing
the log entry, and, under unfavourable circumstances, enough time
might pass for the subsequent request to be sent and fulfilled by a
different Apache thread or process, and then Apache writes access log
entries racily.
This effect can be exacerbated by adding a bit of variable delay after
the request is fulfilled but before 'git-http-backend' exits, e.g.
like this:
diff --git a/http-backend.c b/http-backend.c
index f3dc218b2..bbf4c125b 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -709,5 +709,7 @@ int cmd_main(int argc, const char **argv)
max_request_buffer);
cmd->imp(&hdr, cmd_arg);
+ if (getpid() % 2)
+ sleep(1);
return 0;
}
This delay considerably increases the chances of log entries being
written out of order, and in turn makes t5561's last test fail almost
every time. Alas, it doesn't seem to be enough to trigger a similar
failure in t5541 and t5551.
So, since we can't just rely on the order of access log entries always
corresponding the order of requests, make checking the access log more
deterministic by sorting (simply lexicographically) both the stripped
access log entries and the expected entries before the comparison with
'test_cmp'. This way the order of log entries won't matter and
occasional out-of-order entries won't trigger a test failure, but the
comparison will still notice any unexpected or missing log entries.
OTOH, this sorting will make it harder to identify from which test an
unexpected log entry came from or which test's request went missing.
Therefore, in case of an error include the comparison of the unsorted
log enries in the test output as well.
And since all this should be performed in four tests in three test
scripts, put this into a new helper function 'check_access_log' in
't/lib-httpd.sh'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Four tests in three httpd-related test scripts check the contents of
Apache's 'access.log', and they all do so by running 'sed' with the
exact same script consisting of four s/// commands to strip
uninteresting log fields and to vertically align the requested URLs.
Extract this into a common helper function 'strip_access_log' in
'lib-httpd.sh', and use it in all of those tests.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the second test of 't5541-http-push-smart.sh', 'no empty path
components' we truncate Apache's access log by running:
echo >.../access.log
There are two issues with this approach:
- This doesn't leave an empty file behind, like a proper truncation
would, but a file with a lone newline in it. Consequently, a
later test checking the log's contents must consider this improper
truncation and include an empty line in the expected content.
- This truncation is done in the middle of the test, because,
quoting the in-code comment, "we do this [truncation] before the
actual comparison to ensure the log is cleared" even when
subsequent 'test_cmp' fails. Alas, this is not quite robust
enough, as it is conceivable that 'git clone' fails after already
having sent a request, in which case the access log would not be
truncated and would leave stray log entries behind.
Since there is no need for that newline at all, drop the 'echo' from
the truncation and adjust the expected content accordingly.
Furthermore, make sure that the truncation is performed no matter
whether and how 'git clone' fails unexpectedly by specifying it as a
'test_when_finished' command.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we support octopus merges in the `--rebase-merges` mode,
we should give users who actually read the manuals a chance to know
about this fact.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, we introduced the `merge` command for use in todo lists,
to allow to recreate and modify branch topology.
For ease of implementation, and to make review easier, the initial
implementation only supported merge commits with exactly two parents.
This patch adds support for octopus merges, making use of the
just-introduced `-F <file>` option for the `git merge` command: to keep
things simple, we spawn a new Git command instead of trying to call a
library function, also opening an easier door to enhance `rebase
--rebase-merges` to optionally use a merge strategy different from
`recursive` for regular merges: this feature would use the same code
path as octopus merges and simply spawn a `git merge`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is consistent with `git commit` which, like `git merge`, supports
passing the commit message via `-m <msg>` and, unlike `git merge` before
this patch, via `-F <file>`.
It is useful to allow this for scripted use, or for the upcoming patch
to allow (re-)creating octopus merges in `git rebase --rebase-merges`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If has_uncommitted_changes() can't resolve HEAD (e.g.,
because it's unborn or corrupt), then we end up calling
run_diff_index() with an empty revs.pending array. This
causes a segfault, as run_diff_index() blindly looks at the
first pending item.
Fixing this raises a question of fault: should
run_diff_index() handle this case, or is the caller wrong to
pass an empty pending list?
Looking at the other callers of run_diff_index(), they
handle this in one of three ways:
- they resolve the object themselves, and avoid doing the
diff if it's not valid
- they resolve the object themselves, and fall back to the
empty tree
- they use setup_revisions(), which will die() if the
object isn't valid
Since this is the only broken caller, that argues that the
fix should go there. Falling back to the empty tree makes
sense here, as we'd claim uncommitted changes if and only if
the index is non-empty. This may be a little funny in the
case of corruption (the corrupt HEAD probably _isn't_
empty), but:
- we don't actually know the reason here that HEAD didn't
resolve (the much more likely case is that we have an
unborn HEAD, in which case the empty tree comparison is
the right thing)
- this matches how other code, like "git diff", behaves
While we're thinking about it, let's add an assertion to
run_diff_index(). It should always be passed a single
object, and as this bug shows, it's easy to get it wrong
(and an assertion is easier to hunt down than a segfault, or
a quietly ignored extra tree).
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Turn parse_gpg_output into a static function, the only outside user was
migrated in an earlier commit.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The combination of verify_signed_buffer followed by parse_gpg_output is
available as check_signature. Use that instead of implementing it again.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For a directory/submodule conflict, we want contents from both the
directory and the submodule to be present for the user to use to resolve
the conflict, but we do not want paths under the directory being written
into the submodule and we do not want the merge being confused by paths
under the submodule being in the way. Add testcases for these situations.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the case of a file/submodule conflict, although both cannot exist at
the same path, we expect both to be present somewhere for the user to be
able to resolve the conflict with. Add a testcase for this.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/merge.c contains this important requirement for merge strategies:
...the index must be in sync with the head commit. The strategies are
responsible to ensure this.
However, Documentation/git-merge.txt says:
...[merge will] abort if there are any changes registered in the index
relative to the `HEAD` commit. (One exception is when the changed
index entries are in the state that would result from the merge
already.)
Interestingly, prior to commit c0be8aa06b ("Documentation/git-merge.txt:
Partial rewrite of How Merge Works", 2008-07-19),
Documentation/git-merge.txt said much more:
...the index file must match the tree of `HEAD` commit...
[NOTE]
This is a bit of a lie. In certain special cases [explained
in detail]...
Otherwise, merge will refuse to do any harm to your repository
(that is...your working tree...and index are left intact).
So, this suggests that the exceptions existed because there were special
cases where it would case no harm, and potentially be slightly more
convenient for the user. While the current text in git-merge.txt does
list a condition under which it would be safe to proceed despite the index
not matching HEAD, it does not match what is actually implemented, in
three different ways:
* The exception is written to describe what unpack-trees allows. Not
all merge strategies allow such an exception, though, making this
description misleading. 'ours' and 'octopus' merges have strictly
enforced index==HEAD for a while, and the commit previous to this
one made 'recursive' do so as well.
* If someone did a three-way content merge on a specific file using
versions from the relevant commits and staged it prior to running
merge, then that path would technically satisfy the exception listed
in git-merge.txt. unpack-trees.c would still error out on the path,
though, because it defers the three-way content merge logic to other
parts of the code (resolve, octopus, or recursive) and has no way of
checking whether the index entry from before the merge will match
the end result of the merge.
* The exception as implemented in unpack-trees actually only checked
that the index matched the MERGE_HEAD version of the file and that
HEAD matched the merge base. Assuming no renames, that would indeed
provide cases where the index matches the end result we'd get from a
merge. But renames means unpack-trees is checking that it instead
matches something other than what the final result will be, risking
either erroring out when we shouldn't need to, or not erroring out
when we should and overwriting the user's staged changes.
In addition to the wording behind this exception being misleading, it is
also somewhat surprising to see how many times the code for the special
cases were wrong or the check to make sure the index matched head was
forgotten altogether:
* Prior to commit ee6566e8d7 ("[PATCH] Rewrite read-tree", 2005-09-05),
there were many cases where an unclean index entry was allowed (look for
merged_entry_allow_dirty()); it appears that in those cases, the merge
would have simply overwritten staged changes with the result of the
merge. Thus, the merge result would have been correct, but the user's
uncommitted changes could be thrown away without warning.
* Prior to commit 160252f816 ("git-merge-ours: make sure our index
matches HEAD", 2005-11-03), the 'ours' merge strategy did not check
whether the index matched HEAD. If it didn't, the resulting merge
would include all the staged changes, and thus wasn't really an 'ours'
strategy.
* Prior to commit 3ec62ad9ff ("merge-octopus: abort if index does not
match HEAD", 2016-04-09), 'octopus' merges did not check whether the
index matched HEAD, also resulting in any staged changes from before
the commit silently being folded into the resulting merge. commit
a6ee883b8e ("t6044: new merge testcases for when index doesn't match
HEAD", 2016-04-09) was also added at the same time to try to test to
make sure all strategies did the necessary checking for the requirement
that the index match HEAD. Sadly, it didn't catch all the cases, as
evidenced by the remainder of this list...
* Prior to commit 65170c07d4 ("merge-recursive: avoid incorporating
uncommitted changes in a merge", 2017-12-21), merge-recursive simply
relied on unpack_trees() to do the necessary check, but in one special
case it avoided calling unpack_trees() entirely and accidentally ended
up silently including any staged changes from before the merge in the
resulting merge commit.
* The commit immediately before this one in this series noted that the
exceptions were written in a way that assumed no renames, making it
unsafe for merge-recursive to use. merge-recursive was modified to
use its own check to enforce that index==HEAD.
This history makes it very tempting to go into builtin/merge.c and replace
the comment that strategies must enforce that index matches HEAD with code
that just enforces it. At this point, that would only affect the
'resolve' strategy; all other strategies have each been modified to
manually enforce it. (However, note that index==HEAD is not strictly
enforced for fast-forward merges, as those are not considered a merge
strategy and they trigger in builtin/merge.c before the section in the
code where the relevant comment is found.)
But, even if we don't take the step of just fixing these problems by
enforcing index==HEAD for all strategies, we at least need to update this
misleading documentation in git-merge.txt. For now, just modify the claim
in Documentation/git-merge.txt to fix the error. The precise details
around combination of merges strategies and special cases probably is not
relevant to most users, so simply state that exceptions may exist but are
narrow and vary depending upon which merge strategy is in use.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/merge.c says that when we are about to perform a merge:
...the index must be in sync with the head commit. The strategies are
responsible to ensure this.
merge-recursive has always relied on unpack_trees() to enforce this
requirement, except in the case of an "Already up to date!" merge.
unpack-trees.c does not actually enforce this requirement, though. It
allows for a pair of exceptions, in cases which it refers to as #14(ALT)
and #2ALT. Documentation/technical/trivial-merge.txt can be consulted for
the precise meanings of the various case numbers and their meanings for
unpack-trees.c, but we have a high-level description of the intent behind
these two exceptions in a combined and summarized form in
Documentation/git-merge.txt:
...[merge will] abort if there are any changes registered in the index
relative to the `HEAD` commit. (One exception is when the changed index
entries are in the state that would result from the merge already.)
While this high-level description does describe conditions under which it
would be safe to allow the index to diverge from HEAD, it does not match
what is actually implemented. In particular, unpack-trees.c has no
knowledge of renames, and these two exceptions were written assuming that
no renames take place. Once renames get into the mix, it is no longer
safe to allow the index to not match for #2ALT. We could modify
unpack-trees to only allow #14(ALT) as an exception, but that would be
more strict than required for the resolve strategy (since the resolve
strategy doesn't handle renames at all). Therefore, unpack_trees.c seems
like the wrong place to fix this.
Further, if someone fixes the combination of break and rename detection
and modifies merge-recursive to take advantage of the combination, then it
will also no longer be safe to allow the index to not match for #14(ALT)
when the recursive strategy is in use. Therefore, leaving one of the
exceptions in place with the recursive merge strategy feels like we are
just leaving a latent bug in the code for folks in the future to stumble
across.
It may be possible to fix both unpack-trees and merge-recursive in a way
that implements the exception as stated in Documentation/git-merge.txt,
but it would be somewhat complex, possibly also buggy at first, and
ultimately, not all that valuable. Instead, just enforce the requirement
stated in builtin/merge.c; error out if the index does not match the HEAD
commit, just like the 'ours' and 'octopus' strategies do.
Some testcase fixups were in order:
t7611: had many tests designed to show that `git merge --abort` could
not always restore the index and working tree to the state they
were in before the merge started. The tests that were associated
with having changes in the index before the merge started are no
longer applicable, so they have been removed.
t7504: had a few tests that had stray staged changes that were not
actually part of the test under consideration
t6044: We no longer expect stray staged changes to sometimes result
in the merge continuing. Also, fix a case where a merge
didn't abort but should have.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to Documentation/git-merge.txt,
...[merge will] abort if there are any changes registered in the index
relative to the `HEAD` commit. (One exception is when the changed
index entries are in the state that would result from the merge
already.)
Add some tests showing that this exception, while it does accurately state
what would be a safe condition under which we could allow the merge to
proceed, is not what is actually implemented.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git merge-recursive` does a three-way merge between user-specified trees
base, head, and remote. Since the user is allowed to specify head, we can
not necesarily assume that head == HEAD.
Modify index_has_changes() to take an extra argument specifying the tree
to compare against. If NULL, it will compare to HEAD. We then use this
from merge-recursive to make sure we compare to the user-specified head.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 65170c07d4 ("merge-recursive: avoid incorporating uncommitted
changes in a merge", 2017-12-21), it was noted that there was a special
case when merge-recursive didn't rely on unpack_trees() to enforce the
index == HEAD requirement, and thus that it needed to do that enforcement
itself. Unfortunately, it returned the wrong exit status, signalling that
the merge completed but had conflicts, rather than that it was aborted.
Fix the return code, and while we're at it, change the error message to
match what unpack_trees() would have printed.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `git merge-recursive` command allows the user to directly specify
three commits to merge -- base, head, and remote. (More than three can be
specified in the case of multiple merge bases.) Note that since the user
is allowed to specify head, it need not match HEAD.
Virtually every test and script in the current git.git codebase calls `git
merge-recursive` with head=HEAD, and likely external callers do as well,
which is why this has gone unnoticed. There is one notable
counter-example: git-stash.sh. However, git-stash called `git
merge-recursive` with an index that matches the expected merge result,
which happens to be a currently allowed exception to the "index must match
head" rule, so this never triggered an error previously.
Since we would like to tighten up the "index must match head" rule, we
need to make sure we are comparing to the correct head. Add a testcase
that demonstrates the failure when we check the wrong HEAD.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After modify/delete merge conflict happens in a file skipped by sparse
checkout, "git reset --merge", which implements the "--abort" actions,
and "git reset --hard" fail with message "Entry * not uptodate. Cannot
update sparse checkout."
As explained in [1], the up-to-date checker mistakenly treats conflicted
entry which does not exist in HEAD as still skipped by sparse checkout.
Use the fix suggested in [1]. Also, add test case which verifies the
issue is fixed.
[1] https://public-inbox.org/git/20180616051444.GA29754@duynguyen.home/
Signed-off-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cherry-picking a single commit, we go through a special
code path that avoids creating a sequencer todo list at all.
This path expects our revision parsing to turn up exactly
one commit, and dies with a BUG if it doesn't.
But it's actually quite easy to fool. For example:
$ git cherry-pick --author=no.such.person HEAD
error: BUG: expected exactly one commit from walk
fatal: cherry-pick failed
This isn't a bug; it's just bogus input.
The condition to trigger this message actually has two
parts:
1. We saw no commits. That's the case in the example
above. Let's drop the "BUG" here to make it clear that
the input is the problem. And let's also use the phrase
"empty commit set passed", which matches what we say
when we do a real revision walk and it turns up empty.
2. We saw more than one commit. That one _should_ be
impossible to trigger, since we fed at most one tip and
provided the no_walk option (and we'll have already
expanded options like "--branches" that can turn into
multiple tips). If this ever triggers, it's an
indication that the conditional added by 7acaaac275
(revert: allow single-pick in the middle of cherry-pick
sequence, 2011-12-10) needs to more carefully define
the single-pick case.
So this can remain a bug, but we'll upgrade it to use
the BUG() macro, which would make it easier to detect
and analyze if it does trigger.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user gives us a set that prepare_revision_walk()
takes to be empty, like:
git cherry-pick base..base
then we report an error. It's nonsense, and there's nothing
to pick.
But if they use revision options that later cull the list,
like:
git cherry-pick --author=nobody base~2..base
then we quietly create an empty todo list and return
success.
Arguably either behavior is acceptable, but we should
definitely be consistent about it. Reporting an error
seems to match the original intent, which dates all the way
back to 7e2bfd3f99 (revert: allow cherry-picking more than
one commit, 2010-06-02). That in turn was trying to match
the single-commit case that existed before then (and which
continues to issue an error).
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we don't care about how many bytes were written, simplify the return
value logic.
log_ref_write_fd() was written long before strbuf was fleshed out. Remove
the old manual buffer management code and replace it with strbuf(). Also
update copy_reflog_msg() which is called only by log_ref_write_fd() to use
strbuf as it keeps things consistent.
Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In ISO C, char constants must be in the range -128..127. Change the BOM
constants to char literals to avoid overflow.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ISO C forbids the conversion of void pointers to function pointers.
Introduce a context struct that encapsulates the function pointer.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The macro GIT_PATH_FUNC expands to a function definition that ends with
a closing brace. Remove two extra semicolons.
While at it, fix the example in path.h.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "\e" escape is not defined in ISO C.
While on this line, add a missing space after the comma.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach gc --auto to release pack files before auto packing the repository
to prevent failures when removing them.
Also teach the test 'fetching with auto-gc does not lock up' to complain
when it is no longer triggering an auto packing of the repository.
Fixes https://github.com/git-for-windows/git/issues/500
Signed-off-by: Kim Gybels <kgybels@infogroep.be>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach 'git grep --only-matching', a new option to only print the
matching part(s) of a line.
For instance, a line containing the following (taken from README.md:27):
(`man gitcvs-migration` or `git help cvs-migration` if git is
Is printed as follows:
$ git grep --line-number --column --only-matching -e git -- \
README.md | grep ":27"
README.md:27:7:git
README.md:27:16:git
README.md:27:38:git
The patch works mostly as one would expect, with the exception of a few
considerations that are worth mentioning here.
Like GNU grep, this patch ignores --only-matching when --invert (-v) is
given. There is a sensible answer here, but parity with the behavior of
other tools is preferred.
Because a line might contain more than one match, there are special
considerations pertaining to when to print line headers, newlines, and
how to increment the match column offset. The line header and newlines
are handled as a special case within the main loop to avoid polluting
the surrounding code with conditionals that have large blocks.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit that introduced the partial clone feature - 548719fbdc
("clone: partial clone", 2017-12-08) - excluded connectivity checks
for partial clones, but this also meant that it is possible for a clone
to succeed, yet not have all objects either present or promised.
Specifically, if cloning with --filter=blob:none from a repository that
has a tag pointing to a blob, and the blob is not sent in the packfile,
the clone will pass, even if the blob is not referenced by any tree in
the packfile.
Turn on connectivity checks for partial clone.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A filter line in a request to upload-pack filters out objects regardless
of whether they are directly referenced by a "want" line or not. This
means that cloning with "--filter=blob:none" (or another filter that
excludes blobs) from a repository with at least one ref pointing to a
blob (for example, the Git repository itself) results in output like the
following:
error: missing object referenced by 'refs/tags/junio-gpg-pub'
and if that particular blob is not referenced by a fetched tree, the
resulting clone fails fsck because there is no object from the remote to
vouch that the missing object is a promisor object.
Update both the protocol and the upload-pack implementation to include
all explicitly specified "want" objects in the packfile regardless of
the filter specification.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git send-email documentation specifies RFC 2821 (the SMTP RFC) as
providing line length limits, but the specification that restricts line
length to 998 octets is RFC 2822 (the email message format RFC). Since
RFC 2822 has been obsoleted by RFC 5322, update the text to refer to RFC
5322 instead of RFC 2821.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git send-email, when invoked without a --transfer-encoding option, sends
8bit data without a MIME version or a transfer encoding. This has
several downsides.
First, unless the transfer encoding is specified, it defaults to 7bit,
meaning that non-ASCII data isn't allowed. Second, if lines longer than
998 bytes are used, we will send an message that is invalid according to
RFC 5322. The --validate option, which is the default, catches this
issue, but it isn't clear to many people how to resolve this.
To solve these issues, default the transfer encoding to "auto", so that
we explicitly specify 8bit encoding when lines don't exceed 998 bytes
and quoted-printable otherwise. This means that we now always emit
Content-Transfer-Encoding and MIME-Version headers, so remove the
conditionals from this portion of the code.
It is unlikely that the unconditional inclusion of these two headers
will affect the deliverability of messages in anything but a positive
way, since MIME is already widespread and well understood by most email
programs.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With --validate (which is the default), we warn about lines exceeding
998 characters due to the limits specified in RFC 5322. However, if
we're using a suitable transfer encoding (quoted-printable or base64),
we're guaranteed not to have lines exceeding 76 characters, so there's
no need to fail in this case. The auto transfer encoding handles this
specific case, so accept it as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For most patches, using a transfer encoding of 8bit provides good
compatibility with most servers and makes it as easy as possible to view
patches. However, there are some patches for which 8bit is not a valid
encoding: RFC 5322 specifies that a message must not have lines
exceeding 998 octets.
Add a transfer encoding value, auto, which indicates that a patch should
use 8bit where allowed and quoted-printable otherwise. Choose
quoted-printable instead of base64, since base64-encoded plain text is
treated as suspicious by some spam filters.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent version of PHP supports interface, trait, abstract class and
final class. This patch fixes the PHP hunk header regexp to support
all of these keywords.
Signed-off-by: Kana Natsuno <dev@whileimautomaton.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A later patch changes the built-in PHP pattern. These test cases
demonstrate aspects of the pattern that we do not want to change.
Signed-off-by: Kana Natsuno <dev@whileimautomaton.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There was a discussion of problematic directory/file conflicts with
virtual merge bases on the mailing list years ago at
https://public-inbox.org/git/AANLkTimwUQafGDrjxWrfU9uY1uKoFLJhxYs=vssOPqdf@mail.gmail.com/
Part of these corresponding tests made it into this testsuite. However,
the more problematic one didn't. And there are others that showcase the
problems even more. Add a very lengthy explanation, some of it from that
email, describing the tradeoffs in picking a recursive merge-base when
you're dealing with an add/add directory/file conflict.
The solution picked years ago is relatively good, but there is the
potential to do even better, assuming we're willing to pay a certain
performance cost.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As reported here[0], Microsoft Visual Studio 2017.2 and "gcc -pedantic"
don't understand the forward declaration of an unsized static array.
They insist on an array size:
d:\git\src\builtin\config.c(70,46): error C2133: 'builtin_config_options': unknown size
The thread [1] explains that this is due to the single-pass nature of
old compilers.
To work around this error, introduce the forward-declared function
usage_builtin_config() instead that uses the array
builtin_config_options only after it has been defined.
Also use this function in all other places where usage_with_options() is
called with the same arguments.
[0]: https://github.com/git-for-windows/git/issues/1735
[1]: https://groups.google.com/forum/#!topic/comp.lang.c.moderated/bmiF2xMz51U
Fixes https://github.com/git-for-windows/git/issues/1735
Reported-By: Karen Huang (via GitHub)
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Part of the todo help message in git-rebase--preserve-merges.sh is
unnecessarily indented, making the message look weird. Remove the
extra lines and trailing indent.
This was a minor regression introduced by d48f97aa ("rebase:
reindent function git_rebase__interactive", 2018-03-23) in the 2.18
timeframe. The same issue exists in "rebase -i", but it is being
addressed separately as part of the rewrite of the subcommand into C.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't need to use backslash continuation, as the "&&"
already provides continuation (and happily soaks up empty
lines between commands).
We can also expand the multi-line printf into a
here-document, which lets us use line breaks more naturally
(and avoids another continuation that required us to break
the natural indentation).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We know diff_delta() returns NULL, saying "no good delta exists for
it", when fed an empty data. Check the length of the data in the
caller to avoid such a call.
This incidentally reduces the number of attempted deltification we
see in the final statistics.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The grep code invokes show_line() to display the contents of a matched
or context line in its output. Part of this execution is to print a line
header that includes information such as the kind, the line- and
column-number and etc. of that match.
To prepare for the addition of an option to print only the matching
component(s) of a non-context line, we must prepare for the possibility
that a single line may contain multiple matching parts, and thus will
need multiple headers printed for a single line.
Extracting show_line_header allows us to do just that. In the subsequent
commit, it will be used within the colorization loop to print out only
the matching parts of a line, optionally with LFs delimiting
sub-matches.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During negotiation, fetch-pack eventually reports as "have" lines all
commits reachable from all refs. Allow the user to restrict the commits
sent in this way by providing a whitelist of tips; only the tips
themselves and their ancestors will be sent.
Both globs and single objects are supported.
This feature is only supported for protocols that support connect or
stateless-connect (such as HTTP with protocol v2).
This will speed up negotiation when the repository has multiple
relatively independent branches (for example, when a repository
interacts with multiple repositories, such as with linux-next [1] and
torvalds/linux [2]), and the user knows which local branch is likely to
have commits in common with the upstream branch they are fetching.
[1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next/
[2] https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching, connectivity is checked after the shallow file is
updated. There are 2 issues with this: (1) the connectivity check is
only performed up to ancestors of existing refs (which is not thorough
enough if we were deepening an existing ref in the first place), and (2)
there is no rollback of the shallow file if the connectivity check
fails.
To solve (1), update the connectivity check to check the ancestry chain
completely in the case of a deepening fetch by refraining from passing
"--not --all" when invoking rev-list in connected.c.
To solve (2), have fetch_pack() perform its own connectivity check
before updating the shallow file. To support existing use cases in which
"git fetch-pack" is used to download objects without much regard as to
the connectivity of the resulting objects with respect to the existing
repository, the connectivity check is only done if necessary (that is,
the fetch is not a clone, and the fetch involves shallow/deepen
functionality). "git fetch" still performs its own connectivity check,
preserving correctness but sometimes performing redundant work. This
redundancy is mitigated by the fact that fetch_pack() reports if it has
performed a connectivity check itself, and if the transport supports
connect or stateless-connect, it will bubble up that report so that "git
fetch" knows not to perform the connectivity check in such a case.
This was noticed when a user tried to deepen an existing repository by
fetching with --no-shallow from a server that did not send all necessary
objects - the connectivity check as run by "git fetch" succeeded, but a
subsequent "git fsck" failed.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When for-each-ref is used with --ignore-case, we expect
match_name_as_path() to do a case-insensitive match. But
there's an extra layer of filtering that happens before we
even get there. Since commit cfe004a5a9 (ref-filter: limit
traversal to prefix, 2017-05-22), we feed the prefix to the
ref backend so that it can optimize the ref iteration.
There's no mechanism for us to tell the backend we're matching
case-insensitively. Nor is there likely to be one anytime soon,
since the packed backend relies on binary-searching the sorted list
of refs. Let's just punt on this case. The extra filtering is an
optimization that we simply can't do. We'll still give the correct
answer via the filtering in match_name_as_path().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The match_name_as_path() function learned to set
WM_IGNORECASE in the "flags" field when the user passed
--ignore-case. But it forgot to actually pass the flags to
wildmatch()!
As a result, the --ignore-case feature has been broken since
it was added in 3bb16a8bf2 (tag, branch, for-each-ref: add
--ignore-case for sorting and filtering, 2016-12-04). We
didn't notice because we added tests only for git-branch and
git-tag. Whereas git-for-each-ref has slightly different
matching rules, and thus uses a different function (the
related function match_pattern() does it correctly).
Incidentally, this also caused clang's scan-build to
complain about the code; the assignment to "flags" was dead
code.
Note that we can't flip the test in t6300 to expect_success
yet. There's another bug, which will be dealt with in the
next patch.
Commit-message-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --ignore-case option was added by 3bb16a8bf2 (tag,
branch, for-each-ref: add --ignore-case for sorting and
filtering, 2016-12-04), but it was never tested. And indeed,
it does not work due to multiple bugs (which will be fixed
in subsequent patches).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each rename is a lego: the source side could be connected to a delete or
another rename, and the destination side could be connected to a rename or a
conflicting add. Previous tests combined these to get e.g.
rename/rename(1to2)/add/add, rename/rename(2to1)/delete/delete, and
rename/add/delete. But we can also build bigger chains of conflicts. Add a
testcase demonstrating this.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If either side of a rename/rename(2to1) conflict is itself also involved
in a rename/delete conflict, then the conflict is a little more complex;
we can even have what I'd call a rename/rename(2to1)/delete/delete
conflict. (In some ways, this is similar to a rename/rename(1to2)/add/add
conflict, as added in commit 3672c97148 ("merge-recursive: Fix working
copy handling for rename/rename/add/add", 2011-08-11)). Add a testcase
for such a conflict.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a file is renamed on one side of history, and the other side of history
both deletes the original file and adds a new unrelated file in the way of
the rename, then we have what I call a rename/add/delete conflict. Add a
testcase covering this scenario.
Reported-by: Robert Dailey <rcdailey.lists@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t6044 has lots of tests for verifying that merge will abort as expected
when there are changes staged before the merge starts. However, it only
checked for non-zero exit code, which could mean that the merge ran to
completion with conflicts. Check that the merge was actually correctly
aborted, i.e. that .git/MERGE_HEAD is not present.
This changes one of the tests from expect_success to expect_failure.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Modify index_has_changes() to take a struct istate* instead of just
operating on the_index. This is only a partial conversion, though,
because we call do_diff_cache() which implicitly assumes work is to be
done on the_index. Ongoing work is being done elsewhere to do the
remainder of the conversion, and thus is not duplicated here. Instead,
a simple check is put in place until that work is complete.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since index_has_change() is an index-related function, move it to
read-cache.c, only modifying it to avoid uses of the active_cache and
active_nr macros.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test employs a for-loop inside a subshell and correctly aborts the
loop and fails the test overall (via "exit 1") if any iteration of the
for-loop fails. Otherwise, it exits the subshell with an explicit but
entirely unnecessary "exit 0", presumably to indicate that all
iterations of the loop succeeded. The &&-chain is broken between the
for-loop and the "exit 0". Rather than fixing the &&-chain, just drop
the pointless "exit 0".
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests reference non-existent object "c" when they really mean to
be referencing "C", however, these errors went unnoticed due to a broken
&&-chain later in the tests. Fix these errors, as well as the broken
&&-chains behind which they hid.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test uses a subshell within a subshell but is formatted in such a
way as to suggests that the inner subshell is a sibling rather than a
child, which makes it difficult to digest the test's structure and
intent.
Worse, the inner subshell performs cleanup of actions from earlier in
the test, however, a failure between the initial actions and the cleanup
will prevent the cleanup from taking place.
Fix these problems by modernizing and simplifying the test and by using
test_when_finished() for the cleanup action.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Take advantage of write_script() to abstract-away details of shell
script creation, thus allowing the reader to focus on script content.
Readability benefits, particularly in this case, since the script body
was buried in a noisy one-liner subshell responsible for emitting
boilerplate and body.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test expects "git push" to fail, thus it manually inverts that
local expected failure into a successful exit code for the test overall.
In doing so, it intentionally breaks the &&-chain. Modernize by
replacing manual exit code management with test_must_fail() and a normal
&&-chain.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test has been dysfunctional since it was added by 259f3ee296
(lib-submodule-update.sh: define tests for recursing into submodules,
2017-03-14), however, the problem went unnoticed due to a broken
&&-chain.
The test wants to verify that replacing a submodule containing a .git
directory will absorb the .git directory into the .git/modules/ of the
superproject, and then replace the working tree content appropriate to
the superproject. It is, therefore, incorrect to check if the
submodule content still exists since the submodule will have been
replaced by the content of the superproject.
Fix this by removing the submodule content check, which also happens
to be the line that broke the &&-chain.
While at it, fix broken &&-chains in a couple neighboring tests.
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests intentionally break the &&-chain after using 'unset' since
they don't know if 'unset' will succeed or fail and don't want a local
'unset' failure to fail the test overall. We can do better by using
sane_unset(), which can be linked into the &&-chain as usual.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests employ a noisy subshell (with missing &&-chain) to feed
input into Git commands or files:
(echo a; echo b; echo c) | git some-command ...
Simplify by taking advantage of test_write_lines():
test_write_lines a b c | git some-command ...
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests manually coerce the exit code of invoked commands to
"success" when they don't care if the command succeeds or fails since
failure of those commands should not cause the test to fail overall.
In doing so, they intentionally break the &&-chain. Modernize by
replacing manual exit code management with test_might_fail() and a
normal &&-chain.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an option (controlled by an environment variable) perform extra
validations on mem_pool allocated cache entries. When set:
1) Invalidate cache_entry memory when discarding cache_entry.
2) When discarding index_state struct, verify that all cache_entries
were allocated from expected mem_pool.
3) When discarding mem_pools, invalidate mem_pool memory.
This should provide extra checks that mem_pools and their allocated
cache_entries are being used as expected.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading large indexes from disk, a portion of the time is
dominated in malloc() calls. This can be mitigated by allocating a
large block of memory and manage it ourselves via memory pools.
This change moves the cache entry allocation to be on top of memory
pools.
Design:
The index_state struct will gain a notion of an associated memory_pool
from which cache_entries will be allocated from. When reading in the
index from disk, we have information on the number of entries and
their size, which can guide us in deciding how large our initial
memory allocation should be. When an index is discarded, the
associated memory_pool will be discarded as well - so the lifetime of
a cache_entry is tied to the lifetime of the index_state that it was
allocated for.
In the case of a Split Index, the following rules are followed. 1st,
some terminology is defined:
Terminology:
- 'the_index': represents the logical view of the index
- 'split_index': represents the "base" cache entries. Read from the
split index file.
'the_index' can reference a single split_index, as well as
cache_entries from the split_index. `the_index` will be discarded
before the `split_index` is. This means that when we are allocating
cache_entries in the presence of a split index, we need to allocate
the entries from the `split_index`'s memory pool. This allows us to
follow the pattern that `the_index` can reference cache_entries from
the `split_index`, and that the cache_entries will not be freed while
they are still being referenced.
Managing transient cache_entry structs:
Cache entries are usually allocated for an index, but this is not always
the case. Cache entries are sometimes allocated because this is the
type that the existing checkout_entry function works with. Because of
this, the existing code needs to handle cache entries associated with an
index / memory pool, and those that only exist transiently. Several
strategies were contemplated around how to handle this:
Chosen approach:
An extra field was added to the cache_entry type to track whether the
cache_entry was allocated from a memory pool or not. This is currently
an int field, as there are no more available bits in the existing
ce_flags bit field. If / when more bits are needed, this new field can
be turned into a proper bit field.
Alternatives:
1) Do not include any information about how the cache_entry was
allocated. Calling code would be responsible for tracking whether the
cache_entry needed to be freed or not.
Pro: No extra memory overhead to track this state
Con: Extra complexity in callers to handle this correctly.
The extra complexity and burden to not regress this behavior in the
future was more than we wanted.
2) cache_entry would gain knowledge about which mem_pool allocated it
Pro: Could (potentially) do extra logic to know when a mem_pool no
longer had references to any cache_entry
Con: cache_entry would grow heavier by a pointer, instead of int
We didn't see a tangible benefit to this approach
3) Do not add any extra information to a cache_entry, but when freeing a
cache entry, check if the memory exists in a region managed by existing
mem_pools.
Pro: No extra memory overhead to track state
Con: Extra computation is performed when freeing cache entries
We decided tracking and iterating over known memory pool regions was
less desirable than adding an extra field to track this stae.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add functions for:
- combining two memory pools
- determining if a memory address is within the range managed by a
memory pool
These functions will be used by future commits.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add initialization and discard functions to mem_pool type. As the
memory allocated by mem_pool can now be freed, we also track the large
allocations.
If the there are existing mp_blocks in the mem_poo's linked list of
mp_blocksl, then the mp_block for a large allocation is inserted
behind the head block. This is because only the head mp_block is considered
when searching for availble space. This results in the following
desirable properties:
1) The mp_block allocated for the large request will not be included
not included in the search for available in future requests, the large
mp_block is sized for the specific request and does not contain any
spare space.
2) The head mp_block will not bumped from considation for future
memory requests just because a request for a large chunk of memory
came in.
These changes are in preparation for a future commit that will utilize
creating and discarding memory pool.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of searching all memory blocks for available space to fulfill
a memory request, only search the head block. If the head block does
not have space, assume that previous block would most likely not be
able to fulfill request either. This could potentially lead to more
memory fragmentation, but also avoids searching memory blocks that
probably will not be able to fulfill request.
This pattern will benefit consumers that are able to generate a good
estimate for how much memory will be needed, or if they are performing
fixed sized allocations, so that once a block is exhausted it will
never be able to fulfill a future request.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It has been observed that the time spent loading an index with a large
number of entries is partly dominated by malloc() calls. This change
is in preparation for using memory pools to reduce the number of
malloc() calls made to allocate cahce entries when loading an index.
Add an API to allocate and discard cache entries, abstracting the
details of managing the memory backing the cache entries. This commit
does actually change how memory is managed - this will be done in a
later commit in the series.
This change makes the distinction between cache entries that are
associated with an index and cache entries that are not associated with
an index. A main use of cache entries is with an index, and we can
optimize the memory management around this. We still have other cases
where a cache entry is not persisted with an index, and so we need to
handle the "transient" use case as well.
To keep the congnitive overhead of managing the cache entries, there
will only be a single discard function. This means there must be enough
information kept with the cache entry so that we know how to discard
them.
A summary of the main functions in the API is:
make_cache_entry: create cache entry for use in an index. Uses specified
parameters to populate cache_entry fields.
make_empty_cache_entry: Create an empty cache entry for use in an index.
Returns cache entry with empty fields.
make_transient_cache_entry: create cache entry that is not used in an
index. Uses specified parameters to populate
cache_entry fields.
make_empty_transient_cache_entry: create cache entry that is not used in
an index. Returns cache entry with
empty fields.
discard_cache_entry: A single function that knows how to discard a cache
entry regardless of how it was allocated.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor refresh_cache_entry() to work on a specific index, instead of
implicitly using the_index. This is in preparation for making the
make_cache_entry function apply to a specific index.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since commit ed8b10f631 ("fsck: check .gitmodules content", 2018-05-02),
fsck will issue an error message for '.gitmodules' content that cannot
be parsed correctly. This is the case, even when the corresponding blob
object has been included on the skiplist. For example, using the cgit
repository, we see the following:
$ git fsck
Checking object directories: 100% (256/256), done.
error: bad config line 5 in blob .gitmodules
error in blob 51dd1eff1edc663674df9ab85d2786a40f7ae3a5: gitmodulesParse: could not parse gitmodules blob
Checking objects: 100% (6626/6626), done.
$
$ git config fsck.skiplist '.git/skip'
$ echo 51dd1eff1edc663674df9ab85d2786a40f7ae3a5 >.git/skip
$
$ git fsck
Checking object directories: 100% (256/256), done.
error: bad config line 5 in blob .gitmodules
Checking objects: 100% (6626/6626), done.
$
Note that the error message issued by the config parser is still
present, despite adding the object-id of the blob to the skiplist.
One solution would be to provide a means of suppressing the messages
issued by the config parser. However, given that (logically) we are
asking fsck to ignore this object, a simpler approach is to just not
call the config parser if the object is to be skipped. Add a check to
the 'fsck_blob()' processing function, to determine if the object is
on the skiplist and, if so, exit the function early.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If there's a parsing error we'll already report it via the
usual fsck report() function (or not, if the user has asked
to skip this object or warning type). The error message from
the config parser just adds confusion. Let's suppress it.
Note that we didn't test this case at all, so I've added
coverage in t7415. We may end up toning down or removing
this fsck check in the future. So take this test as checking
what happens now with a focus on stderr, and not any
ironclad guarantee that we must detect and report parse
failures in the future.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The underlying config parser knows how to handle a
config_options struct, but git_config_from_mem() always
passes NULL. Let's allow our callers to specify the options
struct.
We could add a "_with_options" variant, but since there are
only a handful of callers, let's just update them to pass
NULL.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can currently die() or error(), but there's not yet any
way for callers to ask us just to quietly return an error.
Let's give them one.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The config code has a die_on_error flag, which lets us emit
an error() instead of dying when we see a bogus config file.
But there's no way for a caller of the config code to set
this: it's auto-set based on whether we're reading a file or
a blob.
Instead, let's add it to the config_options struct. When
it's not set (or we have no options) we'll continue to fall
back to the existing file/blob behavior.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The realloc counter is declared outside the struct for the given slabname,
which makes it harder for a follow up patch to move the declaration of the
struct around as then the counter variable would need special treatment.
As the reallocation counter is currently unused we can just remove it.
If we ever need to count the reallocations again, we can reintroduce
the counter as part of 'struct slabname' in commit-slab-decl.h.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of deref_tag
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of parse_tag_buffer
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of lookup_tag
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of get_cached_commit_buffer to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of set_commit_buffer to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of parse_commit_buffer
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of lookup_commit to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of lookup_commit_reference
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of
lookup_commit_reference_gently to be more specific about which
repository to handle. This is a small mechanical change; it doesn't
change the implementation to handle repositories other than
the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of lookup_tree
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of lookup_blob
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of parse_object_buffer
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of lookup_object to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of parse_object
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a continuation of 94b410bba8 (.mailmap: Map email
addresses to names, 2013-07-12), merging names that are
spelled differently but have the same author email to the
same person.
Most spellings differed in accents or the order of names.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In ed32b788c0 (version --build-options: report commit, too, if
possible, 2017-12-15), we introduced code to let `git version
--build-options` report the current commit from which the binaries were
built, if any.
To prevent erroneous commits from being reported (e.g. when unpacking
Git's source code from a .tar.gz file into a subdirectory of a different
Git project, as e.g. git_osx_installer does), we painstakingly set
GIT_CEILING_DIRECTORIES when trying to determine the current commit.
Except that we got the quoting wrong, and that variable therefore does
not have the desired effect.
The issue is that the $(shell) is resolved before the output is stuffed
into the command-line with -DGIT_BUILT_FROM_COMMIT, and therefore is
*not* inside quotes. And thus backslashing the quotes is wrong, as the
quote gets literally inserted into the CEILING_DIRECTORIES variable.
Let's fix that quoting, and while at it, also suppress the unhelpful
message
fatal: not a git repository (or any of the parent directories): .git
that gets printed to stderr if no current commit could be determined,
and might scare the occasional developer who simply tries to build Git
from scratch.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test 8 in t5407 appears to be an accidental exact duplicate of of test 5;
the testcode is identical and has identical repo state, but the test
description is different and suggests that rebase -m followed by rebase
--skip was what was actually supposed to be tested. Modify the test to
include the -m option.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to try seeing if a fetch is necessary in a submodule
during a fetch with --recurse-submodules got confused when the path
to the submodule was changed in the range of commits in the
superproject, sometimes showing "(null)". This has been corrected.
* sb/fix-fetching-moved-submodules:
t5526: test recursive submodules when fetching moved submodules
submodule: fix NULL correctness in renamed broken submodules
Build and test procedure for netrc credential helper (in contrib/)
has been updated.
* tz/cred-netrc-cleanup:
git-credential-netrc: make "all" default target of Makefile
git-credential-netrc: fix exit status when tests fail
git-credential-netrc: use in-tree Git.pm for tests
git-credential-netrc: minor whitespace cleanup in test script
Continuing with the idea to programmatically enumerate various
pieces of data required for command line completion, the codebase
has been taught to enumerate options prefixed with "--no-" to
negate them.
* nd/completion-negation:
completion: collapse extra --no-.. options
completion: suppress some -no- options
parse-options: option to let --git-completion-helper show negative form
When user edits the patch in "git add -p" and the user's editor is
set to strip trailing whitespaces indiscriminately, an empty line
that is unchanged in the patch would become completely empty
(instead of a line with a sole SP on it). The code introduced in
Git 2.17 timeframe failed to parse such a patch, but now it learned
to notice the situation and cope with it.
* pw/add-p-recount:
add -p: fix counting empty context lines in edited patches
"git fetch-pack --all" used to unnecessarily fail upon seeing an
annotated tag that points at an object other than a commit.
* jk/fetch-all-peeled-fix:
fetch-pack: test explicitly that --all can fetch tag references pointing to non-commits
fetch-pack: don't try to fetch peel values with --all
"git send-pack --signed" (hence "git push --signed" over the http
transport) did not read user ident from the config mechanism to
determine whom to sign the push certificate as, which has been
corrected.
* ms/send-pack-honor-config:
builtin/send-pack: populate the default configs
The recent addition of "partial clone" experimental feature kicked
in when it shouldn't, namely, when there is no partial-clone filter
defined even if extensions.partialclone is set.
* jh/partial-clone:
list-objects: check if filter is NULL before using
Some flaky tests have been fixed.
* sg/gpg-tests-fix:
tests: make forging GPG signed commits and tags more robust
t7510-signed-commit: use 'test_must_fail'
Make refspec parsing codepath more robust.
* ab/refspec-init-fix:
refspec: initalize `refspec_item` in `valid_fetch_refspec()`
refspec: add back a refspec_item_init() function
refspec: s/refspec_item_init/&_or_die/g
The current description of "core.ignoreCase" reads like an option which
is intended to be changed by the user while it's actually expected to
be set by Git on initialization only. Subsequently, Git relies on the
proper configuration of this variable, as noted by Bryan Turner [1]:
Git on a case-insensitive filesystem (APFS, HFS+, FAT32, exFAT,
vFAT, NTFS, etc.) is not designed to be run with anything other
than core.ignoreCase=true.
[1] https://marc.info/?l=git&m=152998665813997&w=2
mid:CAGyf7-GeE8jRGPkME9rHKPtHEQ6P1+ebpMMWAtMh01uO3bfy8w@mail.gmail.com
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph feature shipped in Git 2.18 has some inconsistencies in
the constants used by the implementation and specified by the format
document.
The commit data chunk uses the key "CDAT" in the file format, but was
previously documented to say "CGET".
The commit data chunk stores commit parents using two 32-bit fields that
typically store the integer position of the parent in the list of commit
ids within the commit-graph file. When a parent does not exist, we had
documented the value 0xffffffff, but implemented the value 0x70000000.
This swap is easy to correct in the documentation, but unfortunately
reduces the number of commits that we can store in the commit-graph.
Update that estimate, too.
Reported-by: Grant Welch <gwelch925@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement ref-in-want on the client side so that when a server supports
the "ref-in-want" feature, a client will send "want-ref" lines for each
reference the client wants to fetch. This feature allows clients to
tolerate inconsistencies that exist when a remote repository's refs
change during the course of negotiation.
This allows a client to request to request a particular ref without
specifying the OID of the ref. This means that instead of hitting an
error when a ref no longer points at the OID it did at the beginning of
negotiation, negotiation can continue and the value of that ref will be
sent at the termination of negotiation, just before a packfile is sent.
More information on the ref-in-want feature can be found in
Documentation/technical/protocol-v2.txt.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expand the transport fetch method signature, by adding an output
parameter, to allow transports to return information about the refs they
have fetched. Then communicate shallow status information through this
mechanism instead of by modifying the input list of refs.
This does require clients to sometimes generate the ref map twice: once
from the list of refs provided by the remote (as is currently done) and
potentially once from the new list of refs that the fetch mechanism
provides.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor find_non_local_tags and get_ref_map to only take the
information they need instead of the entire transport struct. Besides
improving code clarity, this also improves their flexibility, allowing
for a different set of refs to be used instead of relying on the ones
stored in the transport struct.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the fetch_refs function into a function that does the fetching
of refs and another function that stores them. This is in preparation
for allowing additional processing of the fetched refs before updating
the local ref store.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Populate peer ref OIDs in get_ref_map instead of do_fetch. Besides
tightening scopes of variables in the code, this also prepares for
get_ref_map being able to be called multiple times within do_fetch.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests to check the behavior of fetching from a repository which
changes between rounds of negotiation (for example, when different
servers in a load-balancing agreement participate in the same stateless
RPC negotiation). This forms a baseline of comparison to the ref-in-want
functionality (which will be introduced to the client in subsequent
commits), and ensures that subsequent commits do not change existing
behavior.
As part of this effort, a mechanism to substitute strings in a single
HTTP response is added.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, while performing packfile negotiation, clients are only
allowed to specify their desired objects using object ids. This causes
a vulnerability to failure when an object turns non-existent during
negotiation, which may happen if, for example, the desired repository is
provided by multiple Git servers in a load-balancing arrangement and
there exists replication delay.
In order to eliminate this vulnerability, implement the ref-in-want
feature for the 'fetch' command in protocol version 2. This feature
enables the 'fetch' command to support requests in the form of ref names
through a new "want-ref <ref>" parameter. At the conclusion of
negotiation, the server will send a list of all of the wanted references
(as provided by "want-ref" lines) in addition to the generated packfile.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-rebase.sh wrote strategy options to .git/rebase/merge/strategy_opts
in the following format:
'--ours' '--renormalize'
Note the double spaces.
git-rebase--interactive uses sequencer.c to parse that file, and
sequencer.c used split_cmdline() to get the individual strategy options.
After splitting, sequencer.c prefixed each "option" with a double dash,
so, concatenating all its options would result in:
-- --ours -- --renormalize
So, when it ended up calling try_merge_strategy(), that in turn would run
git merge-$strategy -- --ours -- --renormalize $merge_base -- $head $remote
instead of the expected/desired
git merge-$strategy --ours --renormalize $merge_base -- $head $remote
Remove the extra spaces so that when it goes through split_cmdline() we end
up with the desired command line.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are not passing the same args to merge strategies when we are doing an
--interactive rebase as we do with a --merge rebase. The merge strategy
should not need to be aware of which type of rebase is in effect. Add a
testcase which checks for the appropriate args.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it easier to find references to core.excludesfile and the default
$XDG_CONFIG_HOME/git/ignore path.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default core.excludesfile path is $XDG_CONFIG_HOME/git/ignore.
$HOME/.config/git/ignore is used if XDG_CONFIG_HOME is empty or unset,
as described later in the document.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rebase backends currently behave differently with empty commit messages,
largely as a side-effect of the different underlying commands on which
they are based. am-based rebases apply commits with an empty commit
message without stopping or requiring the user to specify an extra flag.
(It is interesting to note that am-based rebases are the default rebase
type, and no one has ever requested a --no-allow-empty-message flag to
change this behavior.) merge-based and interactive-based rebases (which
are ultimately based on git-commit), will currently halt on any such
commits and require the user to manually specify what to do with the
commit and continue.
One possible rationale for the difference in behavior is that the purpose
of an "am" based rebase is solely to transplant an existing history, while
an "interactive" rebase is one whose purpose is to polish a series before
making it publishable. Thus, stopping and asking for confirmation for a
possible problem is more appropriate in the latter case. However, there
are two problems with this rationale:
1) merge-based rebases are also non-interactive and there are multiple
types of rebases that use the interactive machinery but are not
explicitly interactive (e.g. when either --rebase-merges or
--keep-empty are specified without --interactive). These rebases are
also used solely to transplant an existing history, and thus also
should default to --allow-empty-message.
2) this rationale only says that the user is more accepting of stopping
in the case of an explicitly interactive rebase, not that stopping
for this particular reason actually makes sense. Exploring whether
it makes sense, requires backing up and analyzing the underlying
commands...
If git-commit did not error out on empty commits by default, accidental
creation of commits with empty messages would be a very common occurrence
(this check has caught me many times). Further, nearly all such empty
commit messages would be considered an accidental error (as evidenced by a
huge amount of documentation across version control systems and in various
blog posts explaining how important commit messages are). A simple check
for what would otherwise be a common error thus made a lot of sense, and
git-commit gained an --allow-empty-message flag for special case
overrides. This has made commits with empty messages very rare.
There are two sources for commits with empty messages for rebase (and
cherry-pick): (a) commits created in git where the user previously
specified --allow-empty-message to git-commit, and (b) commits imported
into git from other version control systems. In case (a), the user has
already explicitly specified that there is something special about this
commit that makes them not want to specify a commit message; forcing them
to re-specify with every cherry-pick or rebase seems more likely to be
infuriating than helpful. In case (b), the commit is highly unlikely to
have been authored by the person who has imported the history and is doing
the rebase or cherry-pick, and thus the user is unlikely to be the
appropriate person to write a commit message for it. Stopping and
expecting the user to modify the commit before proceeding thus seems
counter-productive.
Further, note that while empty commit messages was a common error case for
git-commit to deal with, it is a rare case for rebase (or cherry-pick).
The fact that it is rare raises the question of why it would be worth
checking and stopping on this particular condition and not others. For
example, why doesn't an interactive rebase automatically stop if the
commit message's first line is 2000 columns long, or is missing a blank
line after the first line, or has every line indented with five spaces, or
any number of other myriad problems?
Finally, note that if a user doing an interactive rebase does have the
necessary knowledge to add a message for any such commit and wants to do
so, it is rather simple for them to change the appropriate line from
'pick' to 'reword'. The fact that the subject is empty in the todo list
that the user edits should even serve as a way to notify them.
As far as I can tell, the fact that merge-based and interactive-based
rebases stop on commits with empty commit messages is solely a by-product
of having been based on git-commit. It went without notice for a long
time precisely because such cases are rare. The rareness of this
situation made it difficult to reason about, so when folks did eventually
notice this behavior, they assumed it was there for a good reason and just
added an --allow-empty-message flag. In my opinion, stopping on such
messages not desirable in any of these cases, even the (explicitly)
interactive case.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a simple directory rename testcase, in conjunction with each of the
types of rebases:
git-rebase--interactive
git-rebase--am
git-rebase--merge
and also use the same testcase for
git am --3way
This demonstrates a difference in behavior between the different rebase
backends in regards to directory rename detection.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a variety of aspects that are common to all rebases regardless
of which backend is in use; however, the behavior for these different
aspects varies in ways that could surprise users. (In fact, it's not
clear -- to me at least -- that these differences were even desirable or
intentional.) Document these differences.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rebase was taught the --force-rebase option in commit b2f82e05de ("Teach
rebase to rebase even if upstream is up to date", 2009-02-13). This flag
worked for the am and merge backends, but wasn't a valid option for the
interactive backend.
rebase was taught the --no-ff option for interactive rebases in commit
b499549401 ("Teach rebase the --no-ff option.", 2010-03-24), to do the
exact same thing as --force-rebase does for non-interactive rebases. This
commit explicitly documented the fact that --force-rebase was incompatible
with --interactive, though it made --no-ff a synonym for --force-rebase
for non-interactive rebases. The choice of a new option was based on the
fact that "force rebase" didn't sound like an appropriate term for the
interactive machinery.
In commit 6bb4e485cf ("rebase: align variable names", 2011-02-06), the
separate parsing of command line options in the different rebase scripts
was removed, and whether on accident or because the author noticed that
these options did the same thing, the options became synonyms and both
were accepted by all three rebase types.
In commit 2d26d533a0 ("Documentation/git-rebase.txt: -f forces a rebase
that would otherwise be a no-op", 2014-08-12), which reworded the
description of the --force-rebase option, the (no-longer correct) sentence
stating that --force-rebase was incompatible with --interactive was
finally removed.
Finally, as explained at
https://public-inbox.org/git/98279912-0f52-969d-44a6-22242039387f@xiplink.com
In the original discussion around this option [1], at one point I
proposed teaching rebase--interactive to respect --force-rebase
instead of adding a new option [2]. Ultimately --no-ff was chosen as
the better user interface design [3], because an interactive rebase
can't be "forced" to run.
We have accepted both --no-ff and --force-rebase as full synonyms for all
three rebase types for over seven years. Documenting them differently
and in ways that suggest they might not be quite synonyms simply leads to
confusion. Adjust the documentation to match reality.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git rebase has three different types: am, merge, and interactive, all of
which are implemented in terms of separate scripts. am builds on git-am,
merge builds on git-merge-recursive, and interactive builds on
git-cherry-pick. We make use of features in those lower-level commands in
the different rebase types, but those features don't exist in all of the
lower level commands so we have a range of incompatibilities. Previously,
we just accepted nearly any argument and silently ignored whichever ones
weren't implemented for the type of rebase specified. Change this so the
incompatibilities are documented, included in the testsuite, and tested
for at runtime with an appropriate error message shown.
Some exceptions I left out:
* --merge and --interactive are technically incompatible since they are
supposed to run different underlying scripts, but with a few small
changes, --interactive can do everything that --merge can. In fact,
I'll shortly be sending another patch to remove git-rebase--merge and
reimplement it on top of git-rebase--interactive.
* One could argue that --interactive and --quiet are incompatible since
--interactive doesn't implement a --quiet mode (perhaps since
cherry-pick itself does not implement one). However, the interactive
mode is more quiet than the other modes in general with progress
messages, so one could argue that it's already quiet.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git rebase is split into three types: am, merge, and interactive. Various
options imply different types, and which mode we are using determine which
sub-script (git-rebase--$type) is executed to finish the work. Not all
options work with all types, so add tests for combinations where we expect
to receive an error rather than having options be silently ignored.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph feature is now integrated with 'fsck' and 'gc',
so remove those items from the "Future Work" section of the
commit-graph design document.
Also remove the section on lazy-loading trees, as that was completed
in an earlier patch series.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph file is a very helpful feature for speeding up git
operations. In order to make it more useful, make it possible to
write the commit-graph file during standard garbage collection
operations.
Add a 'gc.commitGraph' config setting that triggers writing a
commit-graph file after any non-trivial 'git gc' command. Defaults to
false while the commit-graph feature matures. We specifically do not
want to have this on by default until the commit-graph feature is fully
integrated with history-modifying features like shallow clones.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing commit-graph files, it can be convenient to ask for all
reachable commits (starting at the ref set) in the resulting file. This
is particularly helpful when writing to stdin is complicated, such as a
future integration with 'git gc'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If core.commitGraph is true, verify the contents of the commit-graph
during 'git fsck' using the 'git commit-graph verify' subcommand. Run
this check on all alternates, as well.
We use a new process for two reasons:
1. The subcommand decouples the details of loading and verifying a
commit-graph file from the other fsck details.
2. The commit-graph verification requires the commits to be loaded
in a specific order to guarantee we parse from the commit-graph
file for some objects and from the object database for others.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph file ends with a SHA1 hash of the previous contents. If
a commit-graph file has errors but the checksum hash is correct, then we
know that the problem is a bug in Git and not simply file corruption
after-the-fact.
Compute the checksum right away so it is the first error that appears,
and make the message translatable since this error can be "corrected" by
a user by simply deleting the file and recomputing. The rest of the
errors are useful only to developers.
Be sure to continue checking the rest of the file data if the checksum
is wrong. This is important for our tests, as we break the checksum as
we modify bytes of the commit-graph file.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph file has an extra chunk to store the parent int-ids for
parents beyond the first parent for octopus merges. Our test repo has a
single octopus merge that we can manipulate to demonstrate the 'verify'
subcommand detects incorrect values in that chunk.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While iterating through the commit parents, perform the generation
number calculation and compare against the value stored in the
commit-graph.
The tests demonstrate that having a different set of parents affects
the generation number calculation, and this value propagates to
descendants. Hence, we drop the single-line condition on the output.
Since Git will ship with the commit-graph feature without generation
numbers, we need to accept commit-graphs with all generation numbers
equal to zero. In this case, ignore the generation number calculation.
However, verify that we should never have a mix of zero and non-zero
generation numbers. Create a test that sets one commit to generation
zero and all following commits report a failure as they have non-zero
generation in a file that contains generation number zero.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph file stores parents in a two-column portion of the
commit data chunk. If there is only one parent, then the second column
stores 0xFFFFFFFF to indicate no second parent.
The 'verify' subcommand checks the parent list for the commit loaded
from the commit-graph and the one parsed from the object database. Test
these checks for corrupt parents, too many parents, and wrong parents.
Add a boundary check to insert_parent_or_die() for when the parent
position value is out of range.
The octopus merge will be tested in a later commit.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'verify' subcommand must compare the commit content parsed from the
commit-graph against the content in the object database. Use
lookup_commit() and parse_commit_in_graph_one() to parse the commits
from the graph and compare against a commit that is loaded separately
and parsed directly from the object database.
Add checks for the root tree OID.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the 'verify' subcommand, load commits directly from the object
database to ensure they exist. Parse by skipping the commit-graph.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the commit-graph file, the OID fanout chunk provides an index into
the OID lookup. The 'verify' subcommand should find incorrect values
in the fanout.
Similarly, the 'verify' subcommand should find out-of-order values in
the OID lookup.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph file requires the following three chunks:
* OID Fanout
* OID Lookup
* Commit Data
If any of these are missing, then the 'verify' subcommand should
report a failure. This includes the chunk IDs malformed or the
chunk count is truncated.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the first of several commits that add a test to check that
'git commit-graph verify' catches corruption in the commit-graph
file. The first test checks that the command catches an error in
the file signature. This is a check that exists in the existing
commit-graph reading code.
Add a helper method 'corrupt_graph_and_verify' to the test script
t5318-commit-graph.sh. This helper corrupts the commit-graph file
at a certain location, runs 'git commit-graph verify', and reports
the output to the 'err' file. This data is filtered to remove the
lines added by 'test_must_fail' when the test is run verbosely.
Then, the output is checked to contain a specific error message.
Most messages from 'git commit-graph verify' will not be marked
for translation. There will be one exception: the message that
reports an invalid checksum will be marked for translation, as that
is the only message that is intended for a typical user.
Helped-by: Szeder Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the commit-graph file becomes corrupt, we need a way to verify
that its contents match the object database. In the manner of
'git fsck' we will implement a 'git commit-graph verify' subcommand
to report all issues with the file.
Add the 'verify' subcommand to the 'commit-graph' builtin and its
documentation. The subcommand is currently a no-op except for
loading the commit-graph into memory, which may trigger run-time
errors that would be caught by normal use. Add a simple test that
ensures the command returns a zero error code.
If no commit-graph file exists, this is an acceptable state. Do
not report any errors.
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When lazy-loading a tree for a commit, it will be important to select
the tree from a specific struct commit_graph. Create a new method that
specifies the commit-graph file and use that in
get_commit_tree_in_graph().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of verifying commit-graph file contents against the
object database, create parse_commit_internal() to allow side-stepping
the commit-graph file and parse directly from the object database.
Due to the use of generation numbers, this method should not be called
unless the intention is explicit in avoiding commits from the
commit-graph file.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before verifying a commit-graph file against the object database, we
need to parse all commits from the given commit-graph file. Create
parse_commit_in_graph_one() to target a given struct commit_graph.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GRAPH_MIN_SIZE macro should be the smallest size of a parsable
commit-graph file. However, the minimum number of chunks was wrong.
It is possible to write a commit-graph file with zero commits, and
that violates this macro's value.
Rewrite the macro, and use extra macros to better explain the magic
constants.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph tests should be checking that normal Git operations
succeed and have matching output with and without the commit-graph
feature enabled. However, the test was toggling 'core.graph' instead
of the correct 'core.commitGraph' variable.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commits in state:filter.map have already been processed, so don't
filter them again. This makes incremental git filter-branch much faster.
Also add tests for --state-branch option.
Signed-off-by: Michael Barabanov <michael.barabanov@gmail.com>
Acked-by: Ian Campbell <ijc@hellion.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reuse config_from_gitmodules in repo_read_gitmodules to remove some
duplication and also have a single point where the .gitmodules file is
read.
The change does not introduce any new behavior, the same gitmodules_cb
config callback is still used, which only deals with configuration
specific to submodules.
The check about the repo's worktree is removed from repo_read_gitmodules
because it's already performed in config_from_gitmodules.
The config_from_gitmodules function is moved up in the file —unchanged—
before its users to avoid a forward declaration.
Signed-off-by: Antonio Ospite <ao2@ao2.it>
Acked-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Generalize config_from_gitmodules() to accept a repository as an argument.
This is in preparation to reuse the function in repo_read_gitmodules in
order to have a single point where the '.gitmodules' file is accessed.
Signed-off-by: Antonio Ospite <ao2@ao2.it>
Acked-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that 'config_from_gitmodules' is not used in the open, it can be
marked as private.
Hopefully this will prevent its usage for retrieving arbitrary
configuration form the '.gitmodules' file.
Signed-off-by: Antonio Ospite <ao2@ao2.it>
Acked-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a helper function to make it clearer that retrieving 'update-clone'
configuration from the .gitmodules file is a special case supported
solely for backward compatibility purposes.
This change removes one direct use of 'config_from_gitmodules' for
options not strictly related to submodules: "submodule.fetchjobs" does
not describe a property of a submodule, but a behavior of other commands
when dealing with submodules, so it does not really belong to the
.gitmodules file.
This is in the effort to communicate better that .gitmodules is not to
be used as a mechanism to store arbitrary configuration in the
repository that any command can retrieve.
Signed-off-by: Antonio Ospite <ao2@ao2.it>
Acked-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a helper function to make it clearer that retrieving 'fetch'
configuration from the .gitmodules file is a special case supported
solely for backward compatibility purposes.
This change removes one direct use of 'config_from_gitmodules' in code
not strictly related to submodules, in the effort to communicate better
that .gitmodules is not to be used as a mechanism to store arbitrary
configuration in the repository that any command can retrieve.
Signed-off-by: Antonio Ospite <ao2@ao2.it>
Acked-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The .gitmodules file is not meant as a place to store arbitrary
configuration to distribute with the repository.
Move config_from_gitmodules() out of config.c and into
submodule-config.c to make it even clearer that it is not a mechanism to
retrieve arbitrary configuration from the .gitmodules file.
Signed-off-by: Antonio Ospite <ao2@ao2.it>
Acked-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With GNU sed, the r command doesn't care if a space separates it and
the filename it reads from.
With SunOS sed, the space is required.
Signed-off-by: Alejandro R. Sedeño <asedeno@mit.edu>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
signoff is not specific to the am-backend. Also, re-order a few options
to make like things (e.g. strategy and strategy-option) be near each
other.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git rebase has many options that only work with one of its three backends.
It also has a few other pairs of incompatible options. Document these.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Misc leak plugging.
* sb/plug-misc-leaks:
sequencer.c: plug mem leak in git_sequencer_config
sequencer.c: plug leaks in do_pick_commit
submodule--helper: plug mem leak in print_default_remote
refs/packed-backend.c: close fd of empty file
Instead of mucking with filesystem directly, use plumbing commands
update-ref etc. to manipulate the refs in the tests.
* cc/tests-without-assuming-ref-files-backend:
t9104: kosherly remove remote refs
"git fetch --shallow-since=<cutoff>" that specifies the cut-off
point that is newer than the existing history used to end up
grabbing the entire history. Such a request now errors out.
* nd/reject-empty-shallow-request:
upload-pack: reject shallow requests that would return nothing
"git remote update" can take both a single remote nickname and a
nickname for remote groups, and the completion script (in contrib/)
has been taught about it.
* ls/complete-remote-update-names:
completion: complete remote names too
Separate "rebase -p" codepath out of "rebase -i" implementation to
slim down the latter and make it easier to manage.
* ag/rebase-p:
rebase: remove -p code from git-rebase--interactive.sh
rebase: use the new git-rebase--preserve-merges.sh
rebase: strip unused code in git-rebase--preserve-merges.sh
rebase: introduce a dedicated backend for --preserve-merges
Continuing with the idea to programatically enumerate various
pieces of data required for command line completion, teach the
codebase to report the list of configuration variables
subcommands care about to help complete them.
* nd/complete-config-vars:
completion: complete general config vars in two steps
log-tree: allow to customize 'grafted' color
completion: support case-insensitive config vars
completion: keep other config var completion in camelCase
completion: drop the hard coded list of config vars
am: move advice.amWorkDir parsing back to advice.c
advice: keep config name in camelCase in advice_config[]
fsck: produce camelCase config key names
help: add --config to list all available config
fsck: factor out msg_id_info[] lazy initialization code
grep: keep all colors in an array
Add and use generic name->id mapping code for color slot parsing
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* sb/object-store-alloc:
alloc: allow arbitrary repositories for alloc functions
object: allow create_object to handle arbitrary repositories
object: allow grow_object_hash to handle arbitrary repositories
alloc: add repository argument to alloc_commit_index
alloc: add repository argument to alloc_report
alloc: add repository argument to alloc_object_node
alloc: add repository argument to alloc_tag_node
alloc: add repository argument to alloc_commit_node
alloc: add repository argument to alloc_tree_node
alloc: add repository argument to alloc_blob_node
object: add repository argument to grow_object_hash
object: add repository argument to create_object
repository: introduce parsed objects field
Clean up tests in t6xxx series about 'merge' command.
* en/merge-recursive-tests:
t6036: prefer test_when_finished to manual cleanup in following test
t6036, t6042: prefer test_cmp to sequences of test
t6036, t6042: prefer test_path_is_file, test_path_is_missing
t6036, t6042: use test_line_count instead of wc -l
t6036, t6042: use test_create_repo to keep tests independent
"git diff" compares the index and the working tree. For paths
added with intent-to-add bit, the command shows the full contents
of them as added, but the paths themselves were not marked as new
files. They are now shown as new by default.
"git apply" learned the "--intent-to-add" option so that an
otherwise working-tree-only application of a patch will add new
paths to the index marked with the "intent-to-add" bit.
* nd/diff-apply-ita:
apply: add --intent-to-add
t2203: add a test about "diff HEAD" case
diff: turn --ita-invisible-in-index on by default
diff: ignore --ita-[in]visible-in-index when diffing worktree-to-tree
Update to ds/generation-numbers topic.
* ds/commit-graph-lockfile-fix:
commit-graph: fix UX issue when .lock file exists
commit-graph.txt: update design document
merge: check config before loading commits
commit: use generation number in remove_redundant()
commit: add short-circuit to paint_down_to_common()
commit: use generation numbers for in_merge_bases()
ref-filter: use generation number for --contains
commit-graph: always load commit-graph information
commit: use generations in paint_down_to_common()
commit-graph: compute generation numbers
commit: add generation number to struct commit
ref-filter: fix outdated comment on in_commit_list
The in-core "commit" object had an all-purpose "void *util" field,
which was tricky to use especially in library-ish part of the
code. All of the existing uses of the field has been migrated to a
more dedicated "commit-slab" mechanism and the field is eliminated.
* nd/commit-util-to-slab:
commit.h: delete 'util' field in struct commit
merge: use commit-slab in merge remote desc instead of commit->util
log: use commit-slab in prepare_bases() instead of commit->util
show-branch: note about its object flags usage
show-branch: use commit-slab for commit-name instead of commit->util
name-rev: use commit-slab for rev-name instead of commit->util
bisect.c: use commit-slab for commit weight instead of commit->util
revision.c: use commit-slab for show_source
sequencer.c: use commit-slab to associate todo items to commits
sequencer.c: use commit-slab to mark seen commits
shallow.c: use commit-slab for commit depth instead of commit->util
describe: use commit-slab for commit names instead of commit->util
blame: use commit-slab for blame suspects instead of commit->util
commit-slab: support shared commit-slab
commit-slab.h: code split
The bulk of "git submodule foreach" has been rewritten in C.
* pc/submodule-helper-foreach:
submodule: port submodule subcommand 'foreach' from shell to C
submodule foreach: document variable '$displaypath'
submodule foreach: document '$sm_path' instead of '$path'
submodule foreach: correct '$path' in nested submodules from a subdirectory
When an error occurs in updating the working tree of a submodule in
submodule_move_head, tell the user which submodule the error occurred in.
The call to read-tree contains a super-prefix, such that the read-tree
will correctly report any path related issues, but some error messages
do not contain a path, for example:
~/gerrit$ git checkout --recurse-submodules origin/master
~/gerrit$ fatal: failed to unpack tree object 07672f31880ba80300b38492df9d0acfcd6ee00a
Give the hint which submodule has a problem.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "-l" option is short for "--create-reflog". This has
caused much confusion over the years. Most people expect it
to work as "--list", because that would match the other
"mode" options like -d/--delete and -m/--move, as well as
the similar -l/--list option of git-tag.
Adding to the confusion, using "-l" _appears_ to work as
"--list" in some cases:
$ git branch -l
* master
because the branch command defaults to listing (so even
trying to specify --list in the command above is redundant).
But that may bite the user later when they add a pattern,
like:
$ git branch -l foo
which does not return an empty list, but in fact creates a
new branch (with a reflog, naturally) called "foo".
It's also probably quite uncommon for people to actually use
"-l" to create a reflog. Since 0bee591869 (Enable reflogs by
default in any repository with a working directory.,
2006-12-14), this is the default in non-bare repositories.
So it's rather unfortunate that the feature squats on the
short-and-sweet "-l" (which was only added in 3a4b3f269c
(Create/delete branch ref logs., 2006-05-19), meaning there
were only 7 months where it was actually useful).
Let's deprecate "-l" in hopes of eventually re-purposing it
to "--list".
Note that we issue the warning only when we're not in list
mode. This means that people for whom it works as a happy
accident, namely:
$ git branch -l
master
won't see the warning at all. And when we eventually switch
to it meaning "--list", that will just continue to work.
We do the issue the warning for these important cases:
- when we are actually creating a branch, in case the user
really did mean it as "--create-reflog"
- when we are in some _other_ mode, like deletion. There
the "-l" is a noop for now, but it will eventually
conflict with any other mode request, and the user
should be told that this is changing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for deprecating "-l", let's make sure we're
using the recommended option ourselves.
This patch just mechanically converts "branch -l" to "branch
--create-reflog". Note that with the exception of the
actual "--create-reflog" test, we could actually remove "-l"
entirely from most of these callers. That's because these
days core.logallrefupdates defaults to true in a non-bare
repository.
I've left them in place, though, since they serve to
document the expectation of the test, even if they are
technically noops.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test checks that the "-l" option creates a reflog. But
in fact we'd create one even without it, since the default
in a non-bare repository is to do so. Let's unset the config
so we can be sure our "-l" option is kicking in.
Note that we can't do this with test_config, since that
would leave the variable unset after our test finishes,
confusing downstream tests (the helper is not not smart
enough to restore the previous value, and just always runs
test_unconfig).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
HTTP servers return 400 if you send headers before the GET request.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Take advantage of 'git-grep(1)''s new option, '--column' in order to
teach Peff's 'git-jump' script how to jump to the correct column for any
given match.
'git-grep(1)''s output is in the correct format for Vim's jump list, so
no additional cleanup is necessary.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To support git-grep(1)'s new option, '--column', document and teach
grep.c how to interpret relevant configuration options, similar to those
associated with '--line-number'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach 'git-grep(1)' a new option, '--column', to show the column
number of the first match on a non-context line. This makes it possible
to teach 'contrib/git-jump/git-jump' how to seek to the first matching
position of a grep match in your editor, and allows similar additional
scripting capabilities.
For example:
$ git grep -n --column foo | head -n3
.clang-format:51:14:# myFunction(foo, bar, baz);
.clang-format:64:7:# int foo();
.clang-format:75:8:# void foo()
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To prepare for 'git grep' learning '--column', teach grep.c's
show_line() how to show the column of the first match on non-context
lines.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To support showing the matched column when calling 'git-grep(1)', teach
'grep_opt' the normal set of options to configure the default behavior
and colorization of this feature.
Now that we have opt->columnnum, use it to disable short-circuiting over
ORs and ANDs so that col and icol are always filled with the earliest
matches on each line. In addition, don't return the first match from
match_line(), for the same reason.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling match_line(), callers presently cannot determine the
relative offset of the match because match_line() discards the
'regmatch_t' that contains this information.
Instead, teach match_line() to take in two 'ssize_t's. Fill the first
with the offset of the match produced by the given expression. If
extended, fill the later with the offset of the match produced as if
--invert were given.
For instance, matching "--not -e x" on this line produces a columnar
offset of 0, (i.e., the whole line does not contain an x), but "--invert
--not -e -x" will fill the later ssize_t of the column containing an
"x", because this expression is semantically equivalent to "-e x".
To determine the column for the inverted and non-inverted case, do the
following:
- If matching an atom, the non-inverted column is as given from
match_one_pattern(), and the inverted column is unset.
- If matching a --not, the inverted column and non-inverted column
swap.
- If matching an --and, or --or, the non-inverted column is the
minimum of the two children.
Presently, the existing short-circuiting logic for AND and OR applies as
before. This will change in the following commit when we add options to
configure the --column flag. Taken together, this and the forthcoming
change will always yield the earlier column on a given line.
This patch will become useful when we later pick between the two new
results in order to display the column number of the first match on a
line with --column.
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a link to gitsubmodules(7) under the `submodule.active` entry in
git-config(1).
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an 'unpack-sideband' subcommand to the test-pkt-line helper to
enable unpacking packet line data sent multiplexed using a sideband.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
lineNumber has casing that is inconsistent with surrounding options,
like color.grep.matchContext, and color.grep.matchSelected. Re-case this
documentation in order to be consistent with the text around it, and to
ensure that new entries are consistent, too.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function to free struct bitmap_index instances, and use it where
needed (except when rebuild_existing_bitmaps() is used, since it creates
references to the bitmaps within the struct bitmap_index passed to it).
Note that the hashes field in struct bitmap_index is not freed because
it points to another field within the same struct. The documentation for
that field has been updated to clarify that.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the bitmap_git global variable. Instead, generate on demand an
instance of struct bitmap_index for code that needs to access it.
This allows us significant control over the lifetime of instances of
struct bitmap_index. In particular, packs can now be closed without
worrying if an unnecessarily long-lived "pack" field in struct
bitmap_index still points to it.
The bitmap API is also clearer in that we need to first obtain a struct
bitmap_index, then we use it.
This patch raises two potential issues: (1) memory for the struct
bitmap_index is allocated without being freed, and (2)
prepare_bitmap_git() and prepare_bitmap_walk() can reuse a previously
loaded bitmap. For (1), this will be dealt with in a subsequent patch in
this patch set that also deals with freeing the contents of the struct
bitmap_index (which were not freed previously, because they have global
scope). For (2), current bitmap users only load the bitmap once at most
(note that pack-objects can use bitmaps or write bitmaps, but not both
at the same time), so support for reuse has no effect - and future users
can pass around the struct bitmap_index * obtained if they need to do 2
or more things with the same bitmap.
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Complete the removal of unused 'ewah bitmap' code by removing the now
unused 'rlwit_discharge_empty()' function. Also, the 'ewah_clear()'
function can now be made a file-scope static symbol.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When users specify the commit range with 'Z..C' pattern for format-patch, all
the parents of Z (including Z) would be marked as UNINTERESTING which would
prevent revision walk in prepare_bases from getting the prerequisite commits,
thus `git format-patch --base <base_commit_sha> Z..C` won't be able to generate
the list of prerequisite patch ids. Clear UNINTERESTING flag with
clear_object_flags solves this issue.
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since commit 18633e1a22 ("rebase -i: use the rebase--helper builtin",
2017-02-09), when a commit marked as 'reword' in an interactive rebase
has conflicts and fails to apply, when the rebase is resumed that commit
will be squashed into its parent with its commit message taken.
The issue can be understood better by looking at commit 56dc3ab04b
("sequencer (rebase -i): implement the 'edit' command", 2017-01-02), which
introduced error_with_patch() for the edit command. For the edit command,
it needs to stop the rebase whether or not the patch applies cleanly. If
the patch does apply cleanly, then when it resumes it knows it needs to
amend all changes into the previous commit. If it does not apply cleanly,
then the changes should not be amended. Thus, it passes !res (success of
applying the 'edit' commit) to error_with_patch() for the to_amend flag.
The problematic line of code actually came from commit 04efc8b57c
("sequencer (rebase -i): implement the 'reword' command", 2017-01-02).
Note that to get to this point in the code:
* !!res (i.e. patch application failed)
* item->command < TODO_SQUASH
* item->command != TODO_EDIT
* !is_fixup(item->command) [i.e. not squash or fixup]
So that means this can only be a failed patch application that is either a
pick, revert, or reword. We only need to amend HEAD when rewording the
root commit or a commit that has been fast-forwarded, for any of the other
cases we want a new commit, so we should not set the to_amend flag.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Original-patch-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace calls to print ... with the function form, print(...), to
allow use with python3 as well as python2.x.
Converted using 2to3 (and some hand-editing).
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In Python3, basestring no longer exists, so use this workaround.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Backticks around a variable are a deprecated alias for repr().
This has been removed in python3, so just use the string
representation instead, which is equivalent.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python3 does not have the dict.has_key() function, so replace all
such calls with "k in dict". This will still work with python2.6
and python2.7.
Converted using 2to3 (plus some hand-editing)
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The <> string inequality operator (which doesn't seem to be even
documented) no longer exists in python3. Replace with !=.
This still works with python2.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a submodule is deinit'd, the working tree is gone, so the setting of
core.worktree is bogus. Unset it.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running "make" in contrib/credential/netrc should run the "all" target
rather than the "test" target. Add an empty "all::" target like most of
our other Makefiles.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't call this function, and never have. The on-disk
bitmap format uses network-byte-order integers, meaning that
we cannot use the native-byte-order format written here.
Let's drop it in the name of simplicity.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't call this function, and in fact never have since it
was added (at least not in iterations of the ewah patches
that got merged). Instead we use ewah_read_mmap().
Let's drop the unused code.
Note to anybody who later wants to resurrect this: it does
not check for integer overflow in the ewah data size,
meaning it may be possible to convince the code to allocate
a too-small buffer and read() into it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The netrc test.pl script calls git-credential-netrc which imports the
Git module. Pass GITPERLLIB to git-credential-netrc via PERL5LIB to
ensure the in-tree Git module is used for testing.
Signed-off-by: Luis Marsano <luis.marsano@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some of our tests try to make sure Git behaves sensibly in a
read-only directory, by dropping 'w' permission bit before doing a
test and then restoring it after it is done. The latter is needed
for the test framework to clean after itself without leaving a
leftover directory that cannot be removed.
Ancient parts of tests however arrange the above with
chmod a-w . &&
... do the test ...
status=$?
chmod 775 .
(exit $status)
which obviously would not work if the test somehow dies before it
has the chance to do "chmod 775". Rewrite them by following a more
robust pattern recently written tests use, which is
test_when_finished "chmod 775 ." &&
chmod a-w . &&
... do the test ...
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the -L option is used to specify a line range in git log, and the end
of the range is past the end of the file, git will fail with a fatal
error. This commit prevents such behaviour - instead we perform the log
for existing lines within the specified range.
This commit also fixes a corner case where -L ,-n:file would be treated
as a log over the whole file. Now we treat this as -L 1,-n:file and
blame the first line of the file instead.
Signed-off-by: Isabella Stephens <istephens@atlassian.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the -L option is used to specify a line range in git blame, and the
end of the range is past the end of the file, git will fail with a fatal
error. This commit prevents such behavior - instead we display the blame
for existing lines within the specified range. Tests are amended
accordingly.
This commit also fixes two corner cases. Blaming -L n,-(n+1) now blames
the first n lines of a file rather than from n to the end of the file.
Blaming -L ,-n will be treated as -L 1,-n and blame the first line of
the file, rather than blaming the whole file.
Signed-off-by: Isabella Stephens <istephens@atlassian.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce the new files fetch-negotiator.{h,c}, which contains an API
behind which the details of negotiation are abstracted. Currently, only
one algorithm is available: the existing one.
This patch is written to be easily reviewed: static functions are
moved verbatim from fetch-pack.c to negotiator/default.c, and it can be
seen that the lines replaced by negotiator->X() calls are present in the
X() functions respectively.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When receiving 'ACK <object-id> continue' for a common commit, check if
the commit was already known to be common and mark it as such if not up
front. This should make future refactoring of how the information about
common commits is stored more straightforward.
No visible change intended.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reduce the number of global variables by making the priority queue and
the count of non-common commits in it local, passing them as a struct to
various functions where necessary.
This also helps in the case that fetch_pack() is invoked twice in the
same process (when tag following is required when using a transport that
does not support tag following), in that different priority queues will
now be used in each invocation, instead of reusing the possibly
non-empty one.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In negotiation using protocol v2, fetch-pack sometimes does not make
full use of the information obtained in the ref advertisement:
specifically, that if the server advertises a commit that the client
also has, the client never needs to inform the server that it has the
commit's parents, since it can just tell the server that it has the
advertised commit and it knows that the server can and will infer the
rest.
This is because, in do_fetch_pack_v2(), rev_list_insert_ref_oid() is
invoked before mark_complete_and_common_ref(). This means that if we
have a commit that is both our ref and their ref, it would be enqueued
by rev_list_insert_ref_oid() as SEEN, and since it is thus already SEEN,
mark_complete_and_common_ref() would not enqueue it.
If mark_complete_and_common_ref() were invoked first, as it is in
do_fetch_pack() for protocol v0, then mark_complete_and_common_ref()
would enqueue it with COMMON_REF | SEEN. The addition of COMMON_REF
ensures that its parents are not sent as "have" lines.
Change the order in do_fetch_pack_v2() to be consistent with
do_fetch_pack(), and to avoid sending unnecessary "have" lines.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "ACK %s ready" is received, find_common() clears rev_list in an
attempt to stop further "have" lines from being sent [1]. It is much
more readable to explicitly break from the loop instead.
So explicitly break from the loop, and make the clearing of the rev_list
happen unconditionally.
[1] The rationale is further described in the originating commit
f2cba9299b ("fetch-pack: Finish negotation if remote replies "ACK %s
ready"", 2011-03-14).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a submodules work tree is removed, we should unset its core.worktree
setting as the worktree is no longer present. This is not just in line
with the conceptual view of submodules, but it fixes an inconvenience
for looking at submodules that are not checked out:
git clone --recurse-submodules git://github.com/git/git && cd git &&
git checkout --recurse-submodules v2.13.0
git -C .git/modules/sha1collisiondetection log
fatal: cannot chdir to '../../../sha1collisiondetection': \
No such file or directory
With this patch applied, the final call to git log works instead of dying
in its setup, as the checkout will unset the core.worktree setting such
that following log will be run in a bare repository.
This patch covers all commands that are in the unpack machinery, i.e.
checkout, read-tree, reset. A follow up patch will address
"git submodule deinit", which will also make use of the new function
submodule_unset_core_worktree(), which is why we expose it in this patch.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The topic merged in 0c7ecb7c31 (Merge branch 'sb/submodule-move-nested',
2018-05-08) provided support for moving nested submodules.
Remove the NEEDSWORK comment and implement the nested submodules test as
the comment hinted at.
Signed-off-by: Stefan Beller <sbeller@google.com>
Acked-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching with recursing into submodules, the fetch logic inspects
the superproject which submodules actually need to be fetched. This is
tricky for submodules that were renamed in the fetched range of commits.
This was implemented in c68f837576 (implement fetching of moved
submodules, 2017-10-16), and this patch fixes a mistake in the logic
there.
When the warning is printed, the `name` might be NULL as
default_name_or_path can return NULL, so fix the warning to use the path
as obtained from the diff machinery, as that is not NULL.
While at it, make sure we only attempt to load the submodule if a git
directory of the submodule is found as default_name_or_path will return
NULL in case the git directory cannot be found. Note that passing NULL
to submodule_from_name is just a semantic error, as submodule_from_name
accepts NULL as a value, but then the return value is not the submodule
that was asked for, but some arbitrary other submodule. (Cf. 'config_from'
in submodule-config.c: "If any parameter except the cache is a NULL
pointer just return the first submodule. Can be used to check whether
there are any submodules parsed.")
Reported-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Stefan Beller <sbeller@google.com>
Acked-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If tag following is required when using a transport that does not
support tag following, fetch_pack() will be invoked twice in the same
process, necessitating a clearing of the object flags used by
fetch_pack() sometime during the second invocation. This is currently
done in find_common(), which means that the invocation of
mark_complete_and_common_ref() in do_fetch_pack() is useless.
(This cannot be reproduced with Git alone, because all transports that
come with Git support tag following.)
Therefore, move this clearing from find_common() to its
parent function do_fetch_pack(), right before it calls
mark_complete_and_common_ref().
This has been occurring since the commit that introduced the clearing of
marks, 420e9af498 ("Fix tag following", 2008-03-19).
The corresponding code for protocol v2 in do_fetch_pack_v2() does not
have this problem, as the clearing of flags is done before any marking
(whether by rev_list_insert_ref_oid() or
mark_complete_and_common_ref()).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function everything_local(), despite its name, also (1) marks
commits as COMPLETE and COMMON_REF and (2) invokes filter_refs() as
important side effects. Extract (1) into its own function
(mark_complete_and_common_ref()) and remove
(2).
The restoring of save_commit_buffer, which was introduced in a1c6d7c1a7
("fetch-pack: restore save_commit_buffer after use", 2017-12-08), is a
concern of the parse_object() call in mark_complete_and_common_ref(), so
it has been moved from the end of everything_local() to the end of
mark_complete_and_common_ref().
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fetch-pack --all became broken with respect to unusual tags in
5f0fc64513 (fetch-pack: eliminate spurious error messages, 2012-09-09),
and was fixed only recently in e9502c0a7f (fetch-pack: don't try to fetch
peel values with --all, 2018-06-11). However the test added in
e9502c0a7f does not explicitly cover all funky cases.
In order to be sure fetching funky tags will never break, let's
explicitly test all relevant cases with 4 tag objects pointing to 1) a
blob, 2) a tree, 3) a commit, and 4) another tag objects. The referenced
tag objects themselves are referenced from under regular refs/tags/*
namespace. Before e9502c0a7f `fetch-pack --all` was failing e.g. this way:
.../git/t/trash directory.t5500-fetch-pack/fetchall$ git ls-remote ..
44085874... HEAD
...
bc4e9e1f... refs/tags/tag-to-blob
038f48ad... refs/tags/tag-to-blob^{} # peeled
520db1f5... refs/tags/tag-to-tree
7395c100... refs/tags/tag-to-tree^{} # peeled
.../git/t/trash directory.t5500-fetch-pack/fetchall$ git fetch-pack --all ..
fatal: A git upload-pack: not our ref 038f48ad...
fatal: The remote end hung up unexpectedly
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/send-pack didn't call git_default_config, and because of this
git push --signed didn't respect the username and email in gitconfig in
the HTTP transport.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In partial_clone_get_default_filter_spec(), the
core_partial_clone_filter_default variable may be NULL; ensure that it
is not NULL before using it.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
handle_change_delete() has a block of code displaying one of four nearly
identical messages. Each contains about half a dozen variable
interpolations, which use nearly identical variables as well. Someone
trying to parse this may be slowed down trying to parse the differences
and why they are here; help them out by adding a comment explaining the
differences.
Further, point out that this code structure isn't collapsed into something
more concise and readable for the programmer, because we want to keep full
messages intact in order to make translators' jobs much easier.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions were added because processing of these conflicts needed
to be deferred until process_entry() in order to get D/F conflicts and
such right. The number of these has grown over time, and now include
some whose name is misleading:
* conflict_rename_normal() is for handling normal file renames; a
typical rename may need content merging, but we expect conflicts
from that to be more the exception than the rule.
* conflict_rename_via_dir() will not be a conflict; it was just an
add that turned into a move due to directory rename detection.
(If there was a file in the way of the move, that would have been
detected and reported earlier.)
* conflict_rename_rename_2to1 and conflict_rename_add (the latter
of which doesn't exist yet but has been submitted before and I
intend to resend) technically might not be conflicts if the
colliding paths happen to match exactly.
Rename this family of functions to handle_rename_*().
Also rename handle_renames() to detect_and_process_renames() both to make
it clearer what it does, and to differentiate it as a pre-processing step
from all the handle_rename_*() functions which are called from
process_entry().
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We had an enum of rename types which included RENAME_DIR; this name felt
misleading since it was not about an entire directory but was a status for
each individual file add that occurred within a renamed directory.
Since this type is for signifying that the files in question were being
renamed due to directory rename detection, rename this enum value to
RENAME_VIA_DIR.
Make a similar change to the conflict_rename_dir() function, and add a
comment to the top of that function explaining its purpose (it may not be
quite as obvious as for the other conflict_rename_*() functions).
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various refactorings throughout the code have left lots of alignment
issues that were driving me crazy; fix them.
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
http-backend reads whole input until EOF. However, the RFC 3875 specifies
that a script must read only as many bytes as specified by CONTENT_LENGTH
environment variable. Web server may exercise the specification by not closing
the script's standard input after writing content. In that case http-backend
would hang waiting for the input. The issue is known to happen with
IIS/Windows, for example.
Make http-backend read only CONTENT_LENGTH bytes, if it's defined, rather than
the whole input until EOF. If the variable is not defined, keep older behavior
of reading until EOF because it is used to support chunked transfer-encoding.
This commit only fixes buffered input, whcih reads whole body before
processign it. Non-buffered input is going to be fixed in subsequent commit.
Signed-off-by: Florian Manschwetus <manschwetus@cs-software-gmbh.de>
[mk: fixed trivial build failures and polished style issues]
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "fetch-pack --all" sees a tag-to-blob on the remote, it
tries to fetch both the tag itself ("refs/tags/foo") and the
peeled value that the remote advertises ("refs/tags/foo^{}").
Asking for the object pointed to by the latter can cause
upload-pack to complain with "not our ref", since it does
not mark the peeled objects with the OUR_REF (unless they
were at the tip of some other ref).
Arguably upload-pack _should_ be marking those peeled
objects. But it never has in the past, since clients would
generally just ask for the tag and expect to get the peeled
value along with it. And that's how "git fetch" works, as
well as older versions of "fetch-pack --all".
The problem was introduced by 5f0fc64513 (fetch-pack:
eliminate spurious error messages, 2012-09-09). Before then,
the matching logic was something like:
if (refname is ill-formed)
do nothing
else if (doing --all)
always consider it matched
else
look through list of sought refs for a match
That commit wanted to flip the order of the second two arms
of that conditional. But we ended up with:
if (refname is ill-formed)
do nothing
else
look through list of sought refs for a match
if (--all and no match so far)
always consider it matched
That means tha an ill-formed ref will trigger the --all
conditional block, even though we should just be ignoring
it. We can fix that by having a single "else" with all of
the well-formed logic, that checks the sought refs and
"--all" in the correct order.
Reported-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commands that make use of --git-completion-helper feature could
now produce a lot of --no-xxx options that a command can take. This in
many case could nearly double the amount of completable options, using
more screen estate and also harder to search for the wanted option.
This patch attempts to mitigate that by collapsing extra --no-
options, the ones that are added by --git-completion-helper and not in
original struct option arrays. The "--no-..." option will be displayed
in this case to hint about more options, e.g.
> ~/w/git $ git clone --
--bare --origin=
--branch= --progress
--checkout --quiet
--config= --recurse-submodules
--depth= --reference=
--dissociate --reference-if-able=
--filter= --separate-git-dir=
--hardlinks --shallow-exclude=
--ipv4 --shallow-since=
--ipv6 --shallow-submodules
--jobs= --shared
--local --single-branch
--mirror --tags
--no-... --template=
--no-checkout --upload-pack=
--no-hardlinks --verbose
--no-tags
and when you complete it with --no-<tab>, all negative options will be
presented:
> ~/w/git $ git clone --no-
--no-bare --no-quiet
--no-branch --no-recurse-submodules
--no-checkout --no-reference
--no-config --no-reference-if-able
--no-depth --no-separate-git-dir
--no-dissociate --no-shallow-exclude
--no-filter --no-shallow-since
--no-hardlinks --no-shallow-submodules
--no-ipv4 --no-shared
--no-ipv6 --no-single-branch
--no-jobs --no-tags
--no-local --no-template
--no-mirror --no-upload-pack
--no-origin --no-verbose
--no-progress
Corner case: to make sure that people will never accidentally complete
the fake option "--no-..." there must be one real --no- in the first
complete listing even if it's not from the original struct option.
PS. This could could be made simpler with ";&" to fall through from
"--no-*" block and share the code but ";&" is not available on bash-3
(i.e. Mac)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A regression introduced in 8462ff43 ("convert_to_git():
safe_crlf/checksafe becomes int conv_flags", 2018-01-13) back in Git
2.17 cycle caused autocrlf rewrites to produce a warning message
despite setting safecrlf=false.
Signed-off-by: Anthony Sottile <asottile@umich.edu>
Acked-By: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A couple of test scripts create forged GPG signed commits or tags to
check that such forgery can't fool various git commands' signature
verification. All but one of those test scripts are prone to
occasional failures because the forgery creates a bogus GPG signature,
and git commands error out with an unexpected error message, e.g.
"Commit deadbeef does not have a GPG signature" instead of "... has a
bad GPG signature".
't5573-pull-verify-signatures.sh', 't7510-signed-commit.sh' and
't7612-merge-verify-signatures.sh' create forged signed commits like
this:
git commit -S -m "bad on side" &&
git cat-file commit side-bad >raw &&
sed -e "s/bad/forged bad/" raw >forged &&
git hash-object -w -t commit forged >forged.commit
On rare occasions the given pattern occurs not only in the commit
message but in the GPG signature as well, and after it's replaced in
the signature the resulting signature becomes invalid, GPG will report
CRC error and that it couldn't find any signature, which will then
ultimately cause the test failure.
Since in all three cases the pattern to be replaced during the forgery
is the first word of the commit message's subject line, and since the
GPG signature in the commit object is indented by a space, let's just
anchor those patterns to the beginning of the line to prevent this
issue.
The test script 't7030-verify-tag.sh' creates a forged signed tag
object in a similar way by replacing the pattern "seventh", but the
GPG signature in tag objects is not indented by a space, so the above
solution is not applicable in this case. However, in the tag object
in question the pattern "seventh" occurs not only in the tag message
but in the 'tag' header as well. To create a forged tag object it's
sufficient to replace only one of the two occurences, so modify the
sed script to limit the pattern to the 'tag' header (i.e. a line
beginning with "tag ", which, because of the space character, can
never occur in the base64-encoded GPG signature).
Note that the forgery in 't7004-tag.sh' is not affected by this issue:
while 't7004' does create a forged signed tag kind of the same way,
it replaces "signed-tag" in the tag object, which, because of the '-'
character, can never occur in the base64-encoded GPG signarute.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The two tests 'detect fudged signature' and 'detect fudged signature
with NUL' in 't7510-signed-commit.sh' check that 'git verify-commit'
errors out when encountering a forged commit, but they do so by
running
! git verify-commit ...
Use 'test_must_fail' instead, because that would catch potential
unexpected errors like a segfault as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We allocate a `struct refspec_item` on the stack without initializing
it. In particular, its `dst` and `src` members will contain some random
data from the stack. When we later call `refspec_item_clear()`, it will
call `free()` on those pointers. So if the call to `parse_refspec()` did
not assign to them, we will be freeing some random "pointers". This is
undefined behavior.
To the best of my understanding, this cannot currently be triggered by
user-provided data. And for what it's worth, the test-suite does not
trigger this with SANITIZE=address. It can be provoked by calling
`valid_fetch_refspec(":*")`.
Zero the struct, as is done in other users of `struct refspec_item` by
using the refspec_item_init() initialization function.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Re-add the non-fatal version of refspec_item_init_or_die() renamed
away in an earlier change to get a more minimal diff. This should be
used by callers that have their own error handling.
This new function could be marked "static" since nothing outside of
refspec.c uses it, but expecting future use of it, let's make it
available to other users.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the refspec_item_init() function introduced in
6d4c057859 ("refspec: introduce struct refspec", 2018-05-16) to
refspec_item_init_or_die().
This follows the convention of other *_or_die() functions, and is done
in preparation for making it a wrapper for a non-fatal variant.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
recount_edited_hunk() introduced in commit 2b8ea7f3c7 ("add -p:
calculate offset delta for edited patches", 2018-03-05) required all
context lines to start with a space, empty lines are not counted. This
was intended to avoid any recounting problems if the user had
introduced empty lines at the end when editing the patch. However this
introduced a regression into 'git add -p' as it seems it is common for
editors to strip the trailing whitespace from empty context lines when
patches are edited thereby introducing empty lines that should be
counted. 'git apply' knows how to deal with such empty lines and POSIX
states that whether or not there is an space on an empty context line
is implementation defined [1].
Fix the regression by counting lines that consist solely of a newline
as well as lines starting with a space as context lines and add a test
to prevent future regressions.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/diff.html
Reported-by: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Reported-by: Oliver Joseph Ash <oliverjash@gmail.com>
Reported-by: Jeff Felchner <jfelchner1@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a checkout.defaultRemote setting which can be used to
designate a remote to prefer (via checkout.defaultRemote=origin) when
running e.g. "git checkout master" to mean origin/master, even though
there's other remotes that have the "master" branch.
I want this because it's very handy to use this workflow to checkout a
repository and create a topic branch, then get back to a "master" as
retrieved from upstream:
(
cd /tmp &&
rm -rf tbdiff &&
git clone git@github.com:trast/tbdiff.git &&
cd tbdiff &&
git branch -m topic &&
git checkout master
)
That will output:
Branch 'master' set up to track remote branch 'master' from 'origin'.
Switched to a new branch 'master'
But as soon as a new remote is added (e.g. just to inspect something
from someone else) the DWIMery goes away:
(
cd /tmp &&
rm -rf tbdiff &&
git clone git@github.com:trast/tbdiff.git &&
cd tbdiff &&
git branch -m topic &&
git remote add avar git@github.com:avar/tbdiff.git &&
git fetch avar &&
git checkout master
)
Will output (without the advice output added earlier in this series):
error: pathspec 'master' did not match any file(s) known to git.
The new checkout.defaultRemote config allows me to say that whenever
that ambiguity comes up I'd like to prefer "origin", and it'll still
work as though the only remote I had was "origin".
Also adjust the advice.checkoutAmbiguousRemoteBranchName message to
mention this new config setting to the user, the full output on my
git.git is now (the last paragraph is new):
$ ./git --exec-path=$PWD checkout master
error: pathspec 'master' did not match any file(s) known to git.
hint: 'master' matched more than one remote tracking branch.
hint: We found 26 remotes with a reference that matched. So we fell back
hint: on trying to resolve the argument as a path, but failed there too!
hint:
hint: If you meant to check out a remote tracking branch on, e.g. 'origin',
hint: you can do so by fully qualifying the name with the --track option:
hint:
hint: git checkout --track origin/<name>
hint:
hint: If you'd like to always have checkouts of an ambiguous <name> prefer
hint: one remote, e.g. the 'origin' remote, consider setting
hint: checkout.defaultRemote=origin in your config.
I considered splitting this into checkout.defaultRemote and
worktree.defaultRemote, but it's probably less confusing to break our
own rules that anything shared between config should live in core.*
than have two config settings, and I couldn't come up with a short
name under core.* that made sense (core.defaultRemoteForCheckout?).
See also 70c9ac2f19 ("DWIM "git checkout frotz" to "git checkout -b
frotz origin/frotz"", 2009-10-18) which introduced this DWIM feature
to begin with, and 4e85333197 ("worktree: make add <path> <branch>
dwim", 2017-11-26) which added it to git-worktree.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the "checkout" documentation describes:
If <branch> is not found but there does exist a tracking branch in
exactly one remote (call it <remote>) with a matching name, treat
as equivalent to [...] <remote>/<branch.
This is a really useful feature. The problem is that when you add
another remote (e.g. a fork), git won't find a unique branch name
anymore, and will instead print this unhelpful message:
$ git checkout master
error: pathspec 'master' did not match any file(s) known to git
Now it will, on my git.git checkout, print:
$ ./git --exec-path=$PWD checkout master
error: pathspec 'master' did not match any file(s) known to git.
hint: 'master' matched more than one remote tracking branch.
hint: We found 26 remotes with a reference that matched. So we fell back
hint: on trying to resolve the argument as a path, but failed there too!
hint:
hint: If you meant to check out a remote tracking branch on, e.g. 'origin',
hint: you can do so by fully qualifying the name with the --track option:
hint:
hint: git checkout --track origin/<name>
Note that the "error: pathspec[...]" message is still printed. This is
because whatever else checkout may have tried earlier, its final
fallback is to try to resolve the argument as a path. E.g. in this
case:
$ ./git --exec-path=$PWD checkout master pu
error: pathspec 'master' did not match any file(s) known to git.
error: pathspec 'pu' did not match any file(s) known to git.
There we don't print the "hint:" implicitly due to earlier logic
around the DWIM fallback. That fallback is only used if it looks like
we have one argument that might be a branch.
I can't think of an intrinsic reason for why we couldn't in some
future change skip printing the "error: pathspec[...]" error. However,
to do so we'd need to pass something down to checkout_paths() to make
it suppress printing an error on its own, and for us to be confident
that we're not silencing cases where those errors are meaningful.
I don't think that's worth it since determining whether that's the
case could easily change due to future changes in the checkout logic.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no point in doing this right now, but in later change the
"ret" variable will be inspected. This change makes that meaningful
change smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass the previously added "num_matches" struct value up to the callers
of unique_tracking_name(). This will allow callers to optionally print
better error messages in a later change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Internally track how many matches we find in the check_tracking_name()
callback. Nothing uses this now, but it will be made use of in a later
change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an *_INIT macro for the tracking_name_data similar to what exists
elsewhere in the codebase, e.g. OID_ARRAY_INIT in sha1-array.h. This
will make it more idiomatic in later changes to add more fields to the
struct & its initialization macro.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The line was too long already, and will be longer still when a later
change adds another argument.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Assert that whenever there's a DWIM checkout that the index should be
clean afterwards, in addition to the correct branch being checked-out.
The way the DWIM checkout code in checkout.[ch] works is by looping
over all remotes, and for each remote trying to find if a given
reference name only exists on that remote, or if it exists anywhere
else.
This is done by starting out with a `unique = 1` tracking variable in
a struct shared by the entire loop, which will get set to `0` if the
data reference is not unique.
Thus if we find a match we know the dst_oid member of
tracking_name_data must be correct, since it's associated with the
only reference on the only remote that could have matched our query.
But if there was ever a mismatch there for some reason we might end up
with the correct branch checked out, but at the wrong oid, which would
show whatever the difference between the two staged in the
index (checkout branch A, stage changes from the state of branch B).
So let's amend the tests (mostly added in) 399e4a1c56 ("t2024: Add
tests verifying current DWIM behavior of 'git checkout <branch>'",
2013-04-21) to always assert that "status" is clean after we run
"checkout", that's being done with "-uno" because there's going to be
some untracked files related to the test itself which we don't care
about.
In all these tests (DWIM or otherwise) we start with a clean index, so
these tests are asserting that that's still the case after the
"checkout", failed or otherwise.
Then if we ever run into this sort of regression, either in the
existing code or with a new feature, we'll know.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use skip_prefix() instead of starts_with() and strcmp() when parsing
'git update-ref's stdin to avoid a couple of magic numbers.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Going to leave, we additionally free the author and commit message
and make sure to call update_abort_safety_file().
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As there are plans to implement other ref storage systems,
let's use a way to remove remote refs that does not depend
on refs being files.
This makes it clear to readers that this test does not
depend on which ref backend is used.
Suggested-by: Michael Haggerty <mhagger@alum.mit.edu>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shallow clones with --shallow-since or --shalow-exclude work by
running rev-list to get all reachable commits, then draw a boundary
between reachable and unreachable and send "shallow" requests based on
that.
The code does miss one corner case: if rev-list returns nothing, we'll
have no border and we'll send no shallow requests back to the client
(i.e. no history cuts). This essentially means a full clone (or a full
branch if the client requests just one branch). One example is the
oldest commit is older than what is specified by --shallow-since.
To avoid this, if rev-list returns nothing, we abort the clone/fetch.
The user could adjust their request (e.g. --shallow-since further back
in the past) and retry.
Another possible option for this case is to fall back to a default
depth (like depth 1). But I don't like too much magic that way because
we may return something unexpected to the user. If they request
"history since 2008" and we return a single depth at 2000, that might
break stuff for them. It is better to tell them that something is
wrong and let them take the best course of action.
Note that we need to die() in get_shallow_commits_by_rev_list()
instead of just checking for empty result from its caller
deepen_by_rev_list() and handling the error there. The reason is,
empty result could be a valid case: if you have commits in year 2013
and you request --shallow-since=year.2000 then you should get a full
clone (i.e. empty result).
Reported-by: Andreas Krey <a.krey@gmx.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All the code specific to preserve-merges was moved to
git-rebase--preserve-merges.sh, and so it’s useless to keep it
here.
The intent of this commit is to clean this script as much as possible to
prepare a peaceful conversion as a builtin written in C.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Create a new type of rebase, "preserve-merges", used when rebase is
called with -p.
Before that, the type for preserve-merges was "interactive", and some
places of this script compared $type to "interactive". Instead, the code
now checks if $interactive_rebase is empty or not, as it is set to
"explicit" when calling an interactive rebase (and, possibly, one of its
submodes), and "implied" when calling one of its
submodes (eg. preserve-merges) *without* interactive rebase.
It also detects the presence of the directory "$merge_dir"/rewritten
left by the preserve-merges script when calling rebase --continue,
--skip, etc., and, if it exists, sets the rebase mode to
preserve-merges. In this case, interactive_rebase is set to "explicit",
as "implied" would break some tests.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
This removes the code coming from git-rebase--interactive.sh that is not
needed by preserve-merges, and changes the header comment accordingly.
In a following commit, the -p code from git-rebase--interactive.sh will
be stripped out. As preserve-merges’ successor is already in the works,
this will be the only script to be converted.
This also seems to fix a bug where a failure in
`pick_one_preserving_merges()` would fallback to the non-preserve-merges
`pick_one()`.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
This duplicates git-rebase--interactive.sh to
git-rebase--preserve-merges.sh. This is done to split -p from -i. No
modifications are made to this file here, but any code that is not used
by -p will be stripped in the next commit.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
There are 581 config variables as of now when you do "git config
<tab>" which can fill up a few screens and is not very helpful when
you have to look through columns of text to find what you want.
This patch instead shows you only first level when you do
git config <tab>
There are 78 items, which use up 8 rows in my screen. Compared to
screens of text, it's pretty good. Once you have chosen you first
level, e.g. color:
git config color.<tab>
will show you all color.*
This is not a new idea. branch.* and remote.* completion already does
this for second and third levels. For those variables, you'll need to
<tab> three times to get full variable name.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 76f5df305b (log: decorate grafted commits with "grafted" -
2011-08-18) lets us decorate grafted commits but I forgot about the
color.decorate.* config.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Config variables are case-insensitive but this case/esac construct is
case-sensitive by default. For bash v4, it'll be easy. For platforms
that are stuck with older versions, we need an external command, but
that is not that critical. And where this additional overhead matters
the most is Windows, but luckily Git for Windows ships with Bash v4.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The last patch makes "git config <tab>" shows camelCase names because
that's what's in the source: config.txt. There are still a couple
manual var completion in this code. Let's make them follow the naming
convention as well.
In theory we could automate this part too because we have the
information. But let's stick to one step at a time and leave this for
later.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new help option --config-for-completion is a machine friendlier
version of --config where all the placeholders and wildcards are
dropped, leaving only the good, completable prefixes for
git-completion.bash to consume.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only benefit from this move (apart from cleaner code) is that
advice.amWorkDir should now show up in `git help --config`. There
should be no regression since advice config is always read by the
git_default_config().
While at there, use advise() like other code. We now get "hint: "
prefix and the output is stderr instead of stdout (which is also the
reason for the test update because stderr is checked in a following
test and the extra advice can fail it).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For parsing, we don't really need this because the main config parser
will lowercase everything so we can do exact matching. But this array
now is also used for printing in `git help --config`. Keep camelCase
so we have a nice printout.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes it helps to list all available config vars so the user can
search for something they want. The config man page can also be used
but it's harder to search if you want to focus on the variable name,
for example.
This is not the best way to collect the available config since it's
not precise. Ideally we should have a centralized list of config in C
code (pretty much like 'struct option'), but that's a lot more work.
This will do for now.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This array will be used by some other function than parse_msg_id() in
the following commit. Factor out this prep code so it could be called
from that one.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is more inline with how we handle color slots in other code. It
also allows us to get the list of configurable color slots later.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard coding the name-to-id mapping in C code, keep it in an
array and use a common function to do the parsing. This reduces code
and also allows us to list all possible color slots later.
This starts using C99 designated initializers more for convenience
(the first designated initializers have been introduced in builtin/clean.c
for some time without complaints)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/command-list:
completion: allow to customize the completable command list
completion: add and use --list-cmds=alias
completion: add and use --list-cmds=nohelpers
Move declaration for alias.c to alias.h
completion: reduce completable command list
completion: let git provide the completable command list
command-list.txt: documentation and guide line
help: use command-list.txt for the source of guides
help: add "-a --verbose" to list all commands with synopsis
git: support --list-cmds=list-<category>
completion: implement and use --list-cmds=main,others
git --list-cmds: collect command list in a string_list
git.c: convert --list-* to --list-cmds=*
Remove common-cmds.h
help: use command-list.h for common command list
generate-cmds.sh: export all commands to command-list.h
generate-cmds.sh: factor out synopsis extract code
Most --no- options do have some use, even if rarely to negate some
option that's specified in an alias.
These options --no-ours and --no-theirs however have no clear
semantics. If I specify "--ours --no-theirs", the second will reset
writeout stage and is equivalent of "--no-ours --no-theirs" which is
not that easy to see. Drop them. You can either switch from --ours to
--theirs and back but you can never negate them.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 7fb6aefd2a (Merge branch 'nd/parseopt-completion' - 2018-03-14)
is merged, the completion for negative form is left out because the
series is alread long and it could be done in a follow up series. This
is it.
--git-completion-helper now provides --no-xxx so that git-completion.bash
can drop the extra custom --no-xxx in the script. It adds a lot more
--no-xxx than what's current provided by the git-completion.bash
script. We'll trim that down later.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to 'git reset -N', this option makes 'git apply' automatically
mark new files as intent-to-add so they are visible in the following
'git diff' command and could also be committed with 'git commit -a'.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previous attempts to fix ita-related diffs breaks this case. To make
sure that does not happen again, add a test to verify the behavior
wrt. ita entries when we diff a worktree and a tree.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Due to the implementation detail of intent-to-add entries, the current
"git diff" (i.e. no treeish or --cached argument) would show the
changes in the i-t-a file, but it does not mark the file as new, while
"diff --cached" would mark the file as new while showing its content
as empty.
$ git diff | $ diff --cached
--------------------------------|-------------------------------
diff --git a/new b/new | diff --git a/new b/new
index e69de29..5ad28e2 100644 | new file mode 100644
--- a/new | index 0000000..e69de29
+++ b/new |
@@ -0,0 +1 @@ |
+haha |
One evidence of the current output being wrong is that, the output
from "git diff" (with ita entries) cannot be applied because it
assumes empty files exist before applying.
Turning on --ita-invisible-in-index [1] [2] would fix this. The result
is "new file" line moving from "git diff --cached" to "git diff".
$ git diff | $ diff --cached
--------------------------------|-------------------------------
diff --git a/new b/new |
new file mode 100644 |
index 0000000..5ad28e2 |
--- /dev/null |
+++ b/new |
@@ -0,0 +1 @@ |
+haha |
This option is on by default in git-status [1] but we need more fixup
in rename detection code [3]. Luckily we don't need to do anything
else for the rename detection code in diff.c (wt-status.c uses a
customized one).
[1] 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist
in index" - 2016-10-24)
[2] b42b451919 (diff: add --ita-[in]visible-in-index - 2016-10-24)
[3] bc3dca07f4 (Merge branch 'nd/ita-wt-renames-in-status' - 2018-01-23)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option is supposed to fix the diff of "diff-files" (not reporting
ita entries as new files) and "diff-index --cached <tree>" (showing ita
entries as present in the index with empty content) but not
"diff-index <tree>".
When --ita-invisible-in-index is set on "git diff-index <tree>",
unpack_trees() will eventually call oneway_diff() on the ita entry
with the same code flow as "diff-index --cached <tree>". We want to
ignore the ita entry for "diff-index --cached <tree>" but not
"diff-index <tree>" since the latter will examine and produce a diff
based on worktree entry's (real) content, not ita index entry's
(empty) content.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 32637cdf4a (show-index.c: learn about index v2,
2007-04-09) changed the output format of show-index to
include the object CRC32 but didn't update the
documentation. Let's fix that and generally describe the
output in more detail.
There are a few other fixes here while we're rewording:
- refer to index-pack along with pack-objects, since either
can create .idx files
- use "linkgit:" for referring to other commands
- expand the bit about verify-pack, giving reasons why you
might want this command instead. I almost omitted this
entirely, but referring to verify-pack might help a
reader who is looking for more information.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-show-index command is built as its own separate
program. There's really no good reason for this, and it
means we waste extra space on disk (and CPU time running the
linker). Let's fold it in to the main binary as a builtin.
The history here is actually a bit amusing. The program
itself is mostly self-contained, and doesn't even use our
normal pack index code. In a5031214c4 (slim down "git
show-index", 2010-01-21), we even stopped using xmalloc() so
that it could avoid libgit.a entirely. But then 040a655116
(cleanup: use internal memory allocation wrapper functions
everywhere, 2011-10-06) switched that back to xmalloc, which
later become ALLOC_ARRAY().
Making it a builtin should give us the best of both worlds:
no wasted space and no need to avoid the usual patterns.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Manually cleaning up from former tests in subsequent ones breaks the
ability to select which tests we want to run. Use test_when_finished to
avoid this problem.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests used pretty strong measures to get a clean slate:
git rm -rf . &&
git clean -fdqx &&
rm -rf .git &&
git init &&
It's easier, safer (what if a previous test has a bug and accidentally
changes into a directory outside the test path?), and allows re-inspecting
test setup later if we instead just use test_create_repo to put different
tests into separate sub-repositories.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use the lockfile API to avoid multiple Git processes from writing to
the commit-graph file in the .git/objects/info directory. In some cases,
this directory may not exist, so we check for its existence.
The existing code does the following when acquiring the lock:
1. Try to acquire the lock.
2. If it fails, try to create the .git/object/info directory.
3. Try to acquire the lock, failing if necessary.
The problem is that if the lockfile exists, then the mkdir fails, giving
an error that doesn't help the user:
"fatal: cannot mkdir .git/objects/info: File exists"
While technically this honors the lockfile, it does not help the user.
Instead, do the following:
1. Check for existence of .git/objects/info; create if necessary.
2. Try to acquire the lock, failing if necessary.
The new output looks like:
fatal: Unable to create
'<dir>/.git/objects/info/commit-graph.lock': File exists.
Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We now calculate generation numbers in the commit-graph file and use
them in paint_down_to_common().
Expand the section on generation numbers to discuss how the three
special generation numbers GENERATION_NUMBER_INFINITY, _ZERO, and
_MAX interact with other generation numbers.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we use generation numbers from the commit-graph, we must
ensure that all commits that exist in the commit-graph are loaded
from that file instead of from the object database. Since the
commit-graph file is only checked if core.commitGraph is true, we
must check the default config before we load any commits.
In the merge builtin, the config was checked after loading the HEAD
commit. This was due to the use of the global 'branch' when checking
merge-specific config settings.
Move the config load to be between the initialization of 'branch' and
the commit lookup.
Without this change, a fast-forward merge would hit a BUG("bad
generation skip") statement in commit.c during paint_down_to_common().
This is because the HEAD commit would be loaded with "infinite"
generation but then reached by commits with "finite" generation
numbers.
Add a test to t5318-commit-graph.sh that exercises this code path to
prevent a regression.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The static remove_redundant() method is used to filter a list
of commits by removing those that are reachable from another
commit in the list. This is used to remove all possible merge-
bases except a maximal, mutually independent set.
To determine these commits are independent, we use a number of
paint_down_to_common() walks and use the PARENT1, PARENT2 flags
to determine reachability. Since we only care about reachability
and not the full set of merge-bases between 'one' and 'twos', we
can use the 'min_generation' parameter to short-circuit the walk.
When no commit-graph exists, there is no change in behavior.
For a copy of the Linux repository, we measured the following
performance improvements:
git merge-base v3.3 v4.5
Before: 234 ms
After: 208 ms
Rel %: -11%
git merge-base v4.3 v4.5
Before: 102 ms
After: 83 ms
Rel %: -19%
The experiments above were chosen to demonstrate that we are
improving the filtering of the merge-base set. In the first
example, more time is spent walking the history to find the
set of merge bases before the remove_redundant() call. The
starting commits are closer together in the second example,
therefore more time is spent in remove_redundant(). The relative
change in performance differs as expected.
Reported-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git branch --contains', the in_merge_bases_many()
method calls paint_down_to_common() to discover if a specific
commit is reachable from a set of branches. Commits with lower
generation number are not needed to correctly answer the
containment query of in_merge_bases_many().
Add a new parameter, min_generation, to paint_down_to_common() that
prevents walking commits with generation number strictly less than
min_generation. If 0 is given, then there is no functional change.
For in_merge_bases_many(), we can pass commit->generation as the
cutoff, and this saves time during 'git branch --contains' queries
that would otherwise walk "around" the commit we are inspecting.
For a copy of the Linux repository, where HEAD is checked out at
v4.13~100, we get the following performance improvement for
'git branch --contains' over the previous commit:
Before: 0.21s
After: 0.13s
Rel %: -38%
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The containment algorithm for 'git branch --contains' is different
from that for 'git tag --contains' in that it uses is_descendant_of()
instead of contains_tag_algo(). The expensive portion of the branch
algorithm is computing merge bases.
When a commit-graph file exists with generation numbers computed,
we can avoid this merge-base calculation when the target commit has
a larger generation number than the initial commits.
Performance tests were run on a copy of the Linux repository where
HEAD is contained in v4.13 but no earlier tag. Also, all tags were
copied to branches and 'git branch --contains' was tested:
Before: 60.0s
After: 0.4s
Rel %: -99.3%
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A commit A can reach a commit B only if the generation number of A
is strictly larger than the generation number of B. This condition
allows significantly short-circuiting commit-graph walks.
Use generation number for '--contains' type queries.
On a copy of the Linux repository where HEAD is contained in v4.13
but no earlier tag, the command 'git tag --contains HEAD' had the
following peformance improvement:
Before: 0.81s
After: 0.04s
Rel %: -95%
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most code paths load commits using lookup_commit() and then
parse_commit(). In some cases, including some branch lookups, the commit
is parsed using parse_object_buffer() which side-steps parse_commit() in
favor of parse_commit_buffer().
With generation numbers in the commit-graph, we need to ensure that any
commit that exists in the commit-graph file has its generation number
loaded.
Create new load_commit_graph_info() method to fill in the information
for a commit that exists only in the commit-graph file. Call it from
parse_commit_buffer() after loading the other commit information from
the given buffer. Only fill this information when specified by the
'check_graph' parameter.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Define compare_commits_by_gen_then_commit_date(), which uses generation
numbers as a primary comparison and commit date to break ties (or as a
comparison when both commits do not have computed generation numbers).
Since the commit-graph file is closed under reachability, we know that
all commits in the file have generation at most GENERATION_NUMBER_MAX
which is less than GENERATION_NUMBER_INFINITY.
This change does not affect the number of commits that are walked during
the execution of paint_down_to_common(), only the order that those
commits are inspected. In the case that commit dates violate topological
order (i.e. a parent is "newer" than a child), the previous code could
walk a commit twice: if a commit is reached with the PARENT1 bit, but
later is re-visited with the PARENT2 bit, then that PARENT2 bit must be
propagated to its parents. Using generation numbers avoids this extra
effort, even if it is somewhat rare.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While preparing commits to be written into a commit-graph file, compute
the generation numbers using a depth-first strategy.
The only commits that are walked in this depth-first search are those
without a precomputed generation number. Thus, computation time will be
relative to the number of new commits to the commit-graph file.
If a computed generation number would exceed GENERATION_NUMBER_MAX, then
use GENERATION_NUMBER_MAX instead.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The generation number of a commit is defined recursively as follows:
* If a commit A has no parents, then the generation number of A is one.
* If a commit A has parents, then the generation number of A is one
more than the maximum generation number among the parents of A.
Add a uint32_t generation field to struct commit so we can pass this
information to revision walks. We use three special values to signal
the generation number is invalid:
GENERATION_NUMBER_INFINITY 0xFFFFFFFF
GENERATION_NUMBER_MAX 0x3FFFFFFF
GENERATION_NUMBER_ZERO 0
The first (_INFINITY) means the generation number has not been loaded or
computed. The second (_MAX) means the generation number is too large to
store in the commit-graph file. The third (_ZERO) means the generation
number was loaded from a commit graph file that was written by a version
of git that did not support generation numbers.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you have come this far, you probably have seen that this 'util'
pointer is used for many different purposes. Some are not even
contained in a command code, but buried deep in common code with no
clue who will use it and how.
The move to using commit-slab gives us a much better picture of how
some piece of data is associated with a commit and what for. Since
nobody uses 'util' pointer anymore, we can retire so that nobody will
abuse it again. commit-slab will be the way forward for associating
data to a commit.
As a side benefit, this shrinks struct commit by 8 bytes (on 64-bit
architecture) which should help reduce memory usage for reachability
test a bit. This is also what commit-slab is invented for [1].
[1] 96c4f4a370 (commit: allow associating auxiliary info on-demand -
2013-04-09)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is another candidate for commit-slab. Keep Junio's observation in
code so we can search it later on when somebody wants to improve the
code.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of relying on commit->util to store the source string, let the
user provide a commit-slab to store the source strings in.
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
While at there, plug a leak for keeping track of depth in this code.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
define_shared_commit_slab() could be used in a header file to define a
commit-slab. One of these C files must include commit-slab-impl.h and
"call" implement_shared_commit_slab().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The struct declaration and implementation macros are moved to
commit-slab-hdr.h and commit-slab-impl.h respectively.
This right now is not needed for current users but if we make a public
commit-slab type, we may want to avoid including the slab
implementation in a header file which gets replicated in every c file
that includes it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the global variable 'commit_graft_prepared' into the object
pool and convert the function prepare_commit_graft to work
an arbitrary repositories.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We need to convert the shallow functions all at the same time
as we move the data structures they operate on into the repository.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Migrate all git_path_* functions that are defined in path.c to take a
repository argument. Unlike other patches in this series, do not use the
#define trick, as we rewrite the whole function, which is rather small.
This doesn't migrate all the functions, as other builtins have their own
local path functions defined using GIT_PATH_FUNC. So keep that macro
around to serve the other locations.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This conversion was done without the #define trick used in the earlier
series refactoring to have better repository access, because this function
is easy to review, as all lines are converted and it has only one caller.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of is_repository_shallow
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of check_shallow_file_for_update
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of register_shallow
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of set_alternate_shallow_file
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of lookup_commit_graft to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the caller of prepare_commit_graft
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the caller of read_graft_file to be
more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of register_commit_graft to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of commit_graft_pos to be
more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Grafts are only meaningful in the context of a single repository.
Therefore they cannot be global.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should make these functions easier to find and cache.h less
overwhelming to read.
In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file
As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise. It would be better to #include wherever
identifiers from the header are used. That can happen later
when we have better tooling for it.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have to convert all of the alloc functions at once, because alloc_report
uses a funky macro for reporting. It is better for the sake of mechanical
conversion to convert multiple functions at once rather than changing the
structure of the reporting function.
We record all memory allocation in alloc.c, and free them in
clear_alloc_state, which is called for all repositories except
the_repository.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This aims to make git-submodule foreach a builtin. 'foreach' is ported to
the submodule--helper, and submodule--helper is called from
git-submodule.sh.
Helped-by: Brandon Williams <bmwill@google.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As using a variable '$path' may be harmful to users due to
capitalization issues, see 64394e3ae9 (git-submodule.sh: Don't
use $path variable in eval_gettext string, 2012-04-17). Adjust
the documentation to advocate for using $sm_path, which contains
the same value. We still make the 'path' variable available and
document it as a deprecated synonym of 'sm_path'.
Discussed-with: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git submodule foreach --recursive' from a subdirectory of
your repository, nested submodules get a bogus value for $path:
For a submodule 'sub' that contains a nested submodule 'nested',
running 'git -C dir submodule foreach echo $path' from the root of the
superproject would report path='../nested' for the nested submodule.
The first part '../' is derived from the logic computing the relative
path from $pwd to the root of the superproject. The second part is the
submodule path inside the submodule. This value is of little use and is
hard to document.
Also, in git-submodule.txt, $path is documented to be the "name of the
submodule directory relative to the superproject", but "the
superproject" is ambiguous.
To resolve both these issues, we could:
(a) Change "the superproject" to "its immediate superproject", so
$path would be "nested" instead of "../nested".
(b) Change "the superproject" to "the superproject the original
command was run from", so $path would be "sub/nested" instead of
"../nested".
(c) Change "the superproject" to "the directory the original command
was run from", so $path would be "../sub/nested" instead of
"../nested".
The behavior for (c) was attempted to be introduced in 091a6eb0fe
(submodule: drop the top-level requirement, 2013-06-16) with the intent
for $path to be relative from $pwd to the submodule worktree, but that
did not work for nested submodules, as the intermittent submodules
were not included in the path.
If we were to fix the meaning of the $path using (a), we would break
any existing submodule user that runs foreach from non-root of the
superproject as the non-nested submodule '../sub' would change its
path to 'sub'.
If we were to fix the meaning of $path using (b), then we would break
any user that uses nested submodules (even from the root directory)
as the 'nested' would become 'sub/nested'.
If we were to fix the meaning of $path using (c), then we would break
the same users as in (b) as 'nested' would become 'sub/nested' from
the root directory of the superproject.
All groups can be found in the wild. The author has no data if one group
outweighs the other by large margin, and offending each one seems equally
bad at first. However in the authors imagination it is better to go with
(a) as running from a sub directory sounds like it is carried out by a
human rather than by some automation task. With a human on the keyboard
the feedback loop is short and the changed behavior can be adapted to
quickly unlike some automation that can break silently.
Discussed-with: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a small mechanical change; it doesn't change the
implementation to handle repositories other than the_repository yet.
Use a macro to catch callers passing a repository other than
the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a small mechanical change; it doesn't change the
implementation to handle repositories other than the_repository yet.
Use a macro to catch callers passing a repository other than
the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a small mechanical change; it doesn't change the
implementation to handle repositories other than the_repository yet.
Use a macro to catch callers passing a repository other than
the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a small mechanical change; it doesn't change the
implementation to handle repositories other than the_repository yet.
Use a macro to catch callers passing a repository other than
the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a small mechanical change; it doesn't change the
implementation to handle repositories other than the_repository yet.
Use a macro to catch callers passing a repository other than
the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a small mechanical change; it doesn't change the
implementation to handle repositories other than the_repository yet.
Use a macro to catch callers passing a repository other than
the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a small mechanical change; it doesn't change the
implementation to handle repositories other than the_repository yet.
Use a macro to catch callers passing a repository other than
the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the caller of grow_object_hash to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of create_object
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the existing global cache for parsed objects (obj_hash) into
repository-specific parsed object caches. Existing code that uses
obj_hash are modified to use the parsed object cache of
the_repository; future patches will use the parsed object caches of
other repositories.
Another future use case for a pool of objects is ease of memory management
in revision walking: If we can free the rev-list related memory early in
pack-objects (e.g. part of repack operation) then it could lower memory
pressure significantly when running on large repos. While this has been
discussed on the mailing list lately, this series doesn't implement this.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The in_commit_list() method does not check the parents of
the candidate for containment in the list. Fix the comment
that incorrectly states that it does.
Reported-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02 13:39:53 +09:00
784 changed files with 24125 additions and 9813 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.