Commit Graph

12618 Commits

Author SHA1 Message Date
a9e87e3204 Merge branch 'cc/shared-index-permfix' into maint
The split index code did not honor core.sharedrepository setting
correctly.

* cc/shared-index-permfix:
  t1700: make sure split-index respects core.sharedrepository
  t1301: move modebits() to test-lib-functions.sh
  read-cache: use shared perms when writing shared index
2017-07-10 13:59:05 -07:00
8f3a16c390 Merge branch 'ks/t7508-indent-fix' into maint
Cosmetic update to a test.

* ks/t7508-indent-fix:
  t7508: fix a broken indentation
2017-07-10 13:59:03 -07:00
dbcf77592a Merge branch 'sb/t4005-modernize' into maint
Test clean-up.

* sb/t4005-modernize:
  t4005: modernize style and drop hard coded sha1
2017-07-10 13:59:02 -07:00
33c3c2d368 Merge branch 'rs/apply-validate-input' into maint
Tighten error checks for invalid "git apply" input.

* rs/apply-validate-input:
  apply: check git diffs for mutually exclusive header lines
  apply: check git diffs for invalid file modes
  apply: check git diffs for missing old filenames
2017-07-10 13:59:01 -07:00
9f6728da31 Merge branch 'pw/rebase-i-regression-fix-tests' into maint
Fix a recent regression to "git rebase -i" and add tests that would
have caught it and others.

* pw/rebase-i-regression-fix-tests:
  t3420: fix under GETTEXT_POISON build
  rebase: add more regression tests for console output
  rebase: add regression tests for console output
  rebase -i: add test for reflog message
  sequencer: print autostash messages to stderr
2017-07-10 13:59:00 -07:00
f904494574 Merge branch 'jk/add-p-commentchar-fix' into maint
"git add -p" were updated in 2.12 timeframe to cope with custom
core.commentchar but the implementation was buggy and a
metacharacter like $ and * did not work.

* jk/add-p-commentchar-fix:
  add--interactive: quote commentChar regex
  add--interactive: handle EOF in prompt_yesno
2017-07-10 13:58:59 -07:00
040746c061 Merge branch 'js/alias-early-config' into maint
The code to pick up and execute command alias definition from the
configuration used to switch to the top of the working tree and
then come back when the expanded alias was executed, which was
unnecessarilyl complex.  Attempt to simplify the logic by using the
early-config mechanism that does not chdir around.

* js/alias-early-config:
  alias: use the early config machinery to expand aliases
  t7006: demonstrate a problem with aliases in subdirectories
  t1308: relax the test verifying that empty alias values are disallowed
  help: use early config when autocorrecting aliases
  config: report correct line number upon error
  discover_git_directory(): avoid setting invalid git_dir
2017-07-10 13:58:58 -07:00
4dc59cba81 Merge branch 'jk/reflog-walk-maint'
After "git branch --move" of the currently checked out branch, the
code to walk the reflog of HEAD via "log -g" and friends
incorrectly stopped at the reflog entry that records the renaming
of the branch.

* jk/reflog-walk-maint:
  reflog-walk: include all fields when freeing complete_reflogs
  reflog-walk: don't free reflogs added to cache
  reflog-walk: duplicate strings in complete_reflogs list
  reflog-walk: skip over double-null oid due to HEAD rename
2017-07-10 13:42:53 -07:00
0c6435a4d6 Merge branch 'ab/wildmatch'
Minor code cleanup.

* ab/wildmatch:
  wildmatch: remove unused wildopts parameter
2017-07-10 13:42:51 -07:00
9bf8e0c73d Merge branch 'pw/unquote-path-in-git-pm'
Code refactoring.

* pw/unquote-path-in-git-pm:
  t9700: add tests for Git::unquote_path()
  Git::unquote_path(): throw an exception on bad path
  Git::unquote_path(): handle '\a'
  add -i: move unquote_path() to Git.pm
2017-07-10 13:42:50 -07:00
c4f70d2c90 Merge branch 'ks/commit-assuming-only-warning-removal'
An old message shown in the commit log template was removed, as it
has outlived its usefulness.

* ks/commit-assuming-only-warning-removal:
  commit-template: distinguish status information unconditionally
  commit-template: remove outdated notice about explicit paths
2017-07-10 13:42:50 -07:00
bf1ce904b7 test-lib: turn on ASan abort_on_error by default
By default, ASan will exit with code 1 when it sees an
error. This means we'll notice a problem when we expected
git to succeed, but not in a test_must_fail block.

Let's ask it to actually raise SIGABRT instead. That will
give us a signal death that test_must_fail will notice. As a
bonus, it may also leave a coredump, which can be handy for
digging into a failure.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-10 10:02:28 -07:00
d0cc5796f3 test-lib: set ASAN_OPTIONS variable before we run git
We turn off ASan's leak detection by default in the test
suite because it's too noisy. But we don't do so until
part-way through test-lib. This is before we've run any
tests, but after we do our initial "./git" to see if the
binary has even been built.

When built with clang, this seems to work fine. However,
using "gcc -fsanitize=address", the leak checker seems to
complain more aggressively:

  $ ./git
  ...
  ==5352==ERROR: LeakSanitizer: detected memory leaks
  Direct leak of 2 byte(s) in 1 object(s) allocated from:
      #0 0x7f120e7afcf8 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc1cf8)
      #1 0x559fc2a3ce41 in do_xmalloc /home/peff/compile/git/wrapper.c:60
      #2 0x559fc2a3cf1a in do_xmallocz /home/peff/compile/git/wrapper.c:100
      #3 0x559fc2a3d0ad in xmallocz /home/peff/compile/git/wrapper.c:108
      #4 0x559fc2a3d0ad in xmemdupz /home/peff/compile/git/wrapper.c:124
      #5 0x559fc2a3d0ad in xstrndup /home/peff/compile/git/wrapper.c:130
      #6 0x559fc274535a in main /home/peff/compile/git/common-main.c:39
      #7 0x7f120dabd2b0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202b0)

This is a leak in the sense that we never free it, but it's
in a global that is meant to last the whole program. So it's
not really interesting or in need of fixing. And at any
rate, mentioning leaks outside of the test_expect blocks is
certainly unwelcome, as it pollutes stderr.

Let's bump the setting of ASAN_OPTIONS higher in test-lib.sh
to catch our initial "can we even run git?" test.  While
we're at it, we can add a comment to make it a bit less
inscrutable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-10 10:02:27 -07:00
de239446b6 reflog-walk: apply --since/--until to reflog dates
When doing a reflog walk, we use the commit's date to
do any date limiting. In earlier versions of Git, this could
lead to nonsense results, since a skipped commit would
truncate the traversal. So a sequence like:

  git commit ...
  git checkout week-old-branch
  git checkout -
  git log -g --since=1.day.ago

would stop at the week-old-branch, even though the "git
commit" entry further back is still interesting.

As of the prior commit, which uses a parent-less traversal
of the reflog, you get the whole reflog minus any commits
whose dates do not match the specified options. This is
arguably useful, as you could scan the reflogs for commits
that originated in a certain range.

But more likely a user doing a reflog walk wants to limit
based on the reflog entries themselves. You can simulate
--until with:

  git log -g @{1.day.ago}

but there's no way to ask Git to traverse only back to a
certain date. E.g.:

  # show me reflog entries from the past day
  git log -g --since=1.day.ago

This patch teaches the revision machinery to prefer the
reflog entry dates to the commit dates when doing a reflog
walk. Technically this is a change in behavior that affects
plumbing, but the previous behavior was so buggy that it's
unlikely anyone was relying on it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-09 10:00:48 -07:00
d08565bf2d reflog-walk: stop using fake parents
The reflog-walk system works by putting a ref's tip into the
pending queue, and then "traversing" the reflog by
pretending that the parent of each commit is the previous
reflog entry.

This causes a number of user-visible oddities, as documented
in t1414 (and the commit message which introduced it). We
can fix all of them in one go by replacing the fake-reflog
system with a much simpler one: just keeping a list of
reflogs to show, and walking through them entry by entry.

The implementation is fairly straight-forward, but there are
a few items to note:

  1. We obviously must skip calling add_parents_to_list()
     when we are traversing reflogs, since we do not want to
     walk the original parents at all.  As a result, we must call
     try_to_simplify_commit() ourselves.

     There are other parts of add_parents_to_list() we skip,
     as well, but none of them should matter for a reflog
     traversal:

       -  We do not allow UNINTERESTING commits, nor
          symmetric ranges (and we bail when these are used
          with "-g").

       - Using --source makes no sense, since we aren't
         traversing. The reflog selector shows the same
         information with more detail.

       - Using --first-parent is still sensible, since you
         may want to see the first-parent diff for each
         entry. But since we're not traversing, we don't
         need to cull the parent list here.

  2. Since we now just walk the reflog entries themselves,
     rather than starting with the ref tip, we now look at
     the "new" field of each entry rather than the "old"
     (i.e., we are showing entries, not faking parents).
     This removes all of the tricky logic around skipping
     past root commits.

     But note that we have no way to show an entry with the
     null sha1 in its "new" field (because such a commit
     obviously does not exist). Normally this would not
     happen, since we delete reflogs along with refs, but
     there is one special case. When we rename the currently
     checked out branch, we write two reflog entries into
     the HEAD log: one where the commit goes away, and
     another where it comes back.

     Prior to this commit, we show both entries with
     identical reflog messages. After this commit, we show
     only the "comes back" entry. See the update in t3200
     which demonstrates this.

     Arguably either is fine, as the whole double-entry
     thing is a bit hacky in the first place. And until a
     recent fix, we truncated the traversal in such a case
     anyway, which was _definitely_ wrong.

  3. We show individual reflogs in order, but choose which
     reflog to show at each stage based on which has the
     most recent timestamp.  This interleaves the output
     from multiple reflogs based on date order, which is
     probably what you'd want with limiting like "-n 30".

     Note that the implementation aims for simplicity. It
     does a linear walk over the reflog queue for each
     commit it pulls, which may perform badly if you
     interleave an enormous number of reflogs. That seems
     like an unlikely use case; if we did want to handle it,
     we could probably keep a priority queue of reflogs,
     ordered by the timestamp of their current tip entry.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-09 10:00:48 -07:00
7f97de5ee1 rev-list: check reflog_info before showing usage
When git-rev-list sees no pending commits, it shows a usage
message. This works even when reflog-walking is requested,
because the reflog-walk code currently puts the reflog tips
into the pending queue.

In preparation for refactoring the reflog-walk code, let's
explicitly check whether we have any reflogs to walk. For
now this is a noop, but the existing reflog tests will make
sure that it kicks in after the refactoring. Likewise, we'll
add a test that "rev-list -g" without specifying any reflogs
continues to fail (so that we know our check does not kick
in too aggressively).

Note that the implementation needs to go into its own
sub-function, as the walk code does not expose its innards
outside of reflog-walk.c.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-09 10:00:48 -07:00
34d820ee33 branch: use BRANCH_COLOR_LOCAL in ref-filter format
Since 949af0684 (branch: use ref-filter printing APIs,
2017-01-10), git-branch's output is generated by passing a
custom format to the ref-filter code. This format forgot to
pass BRANCH_COLOR_LOCAL, meaning that local branches
(besides the current one) were never colored at all.

We can add it in the %(if) block where we decide whether the
branch is "current" or merely "local".  Note that this means
the current/local coloring is either/or. You can't set:

  [color "branch"]
  local = blue
  current = bold

and expect the current branch to be "bold blue". This
matches the pre-949af0684 behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-09 09:24:42 -07:00
7cf686b9a8 t1414: document some reflog-walk oddities
Since its inception, the general strategy of the reflog-walk
code has been to start with the tip commit for the ref, and
as we traverse replace each commit's parent pointers with
fake parents pointing to the previous reflog entry.

This lets us traverse the reflog as if it were a real
history, but it has some user-visible oddities. Namely:

  1. The fake parents are used for commit selection and
     display. So for example, "--merges" or "--no-merges"
     are not useful, because the history appears as a linear
     string of commits. Likewise, pathspec limiting is based
     on the diff between adjacent entries, not the changes
     actually introduced by a commit.

     These are often the same (e.g., because the entry was
     just running "git commit" and the adjacent entry _is_
     the true parent), but it may not be in several common
     cases. For instance, using "git reset" to jump around
     history, or "git checkout" to move HEAD.

  2. We reverse-map each commit back to its reflog. So when
     it comes time to show commit X, we say "a-ha, we added
     X because it was at the tip of the 'foo' reflog, so
     let's show the foo reflog". But this leads to nonsense
     results when you ask to traverse multiple reflogs: if
     two reflogs have the same tip commit, we only map back
     to one of them.  Instead, we should show both.

  3. If the tip of the reflog and the ref tip disagree on
     the current value, we show the ref tip but give no
     indication of the value in the reflog.  This situation
     isn't supposed to happen (since any ref update should
     touch the reflog). But if it does, given that the
     requested operation is to show the reflog, it makes
     sense to prefer that.

This commit adds a new script with several expect_failure
tests to demonstrate the problems.  This could be part of
the existing t1411, but it's a bit easier to start from a
fresh state, where we know exactly what will be in the log.

Since the new multiple-reflog tests are checking the actual
output, we can drop the "make sure we don't segfault" tests
from t1411, which are a strict subset of what we're doing
here.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-07 10:03:48 -07:00
be5982a794 Merge branch 'jk/reflog-walk-maint' into jk/reflog-walk
* jk/reflog-walk-maint:
  reflog-walk: include all fields when freeing complete_reflogs
  reflog-walk: don't free reflogs added to cache
  reflog-walk: duplicate strings in complete_reflogs list
  reflog-walk: skip over double-null oid due to HEAD rename
2017-07-07 10:02:42 -07:00
8aae3cf755 reflog-walk: don't free reflogs added to cache
The add_reflog_for_walk() function keeps a cache mapping
refnames to their reflog contents. We use a cached reflog
entry if available, and otherwise allocate and store a new
one.

Since 5026b47175 (add_reflog_for_walk: avoid memory leak,
2017-05-04), when we hit an error parsing a date-based
reflog spec, we free the reflog memory but leave the cache
entry pointing to the now-freed memory.

We can fix this by just leaving the memory intact once it
has made it into the cache. This may leave an unused entry
in the cache, but that's OK. And it means we also catch a
similar situation: we may not have allocated at all in this
invocation, but simply be pointing to a cached entry from a
previous invocation (which is relying on that entry being
present).

The new test in t1411 exercises this case and fails when run
with --valgrind or ASan.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-07 09:00:31 -07:00
75afe7ac87 reflog-walk: duplicate strings in complete_reflogs list
As part of the add_reflog_to_walk() function, we keep a
string_list mapping refnames to their reflog contents. This
serves as a cache so that accessing the same reflog twice
requires only a single copy of the log in memory.

The string_list is initialized via xcalloc, meaning its
strdup_strings field is set to 0. But after inserting a
string into the list, we unconditionally call free() on the
string, leaving the list pointing to freed memory. If
another reflog is added (e.g., "git log -g HEAD HEAD"), then
the second one may have unpredictable results.

The extra free was added by 5026b47175 (add_reflog_for_walk:
avoid memory leak, 2017-05-04). Though if you look
carefully, you can see that the code was buggy even before
then. If we tried to read the reflogs by time but came up
with no entries, we exited with an error, freeing the string
in that code path. So the bug was harder to trigger, but
still there.

We can fix it by just asking the string list to make a copy
of the string. Technically we could fix the problem by not
calling free() on our string (and just handing over
ownership to the string list), but there are enough
conditionals that it's quite hard to figure out which code
paths need the free and which do not. Simpler is better
here.

The new test reliably shows the problem when run with
--valgrind or ASAN.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-07 08:58:17 -07:00
ccce1e51e8 Merge branch 'js/t5534-rev-parse-gives-multi-line-output-fix'
A few tests that tried to verify the contents of push certificates
did not use 'git rev-parse' to formulate the line to look for in
the certificate correctly.

* js/t5534-rev-parse-gives-multi-line-output-fix:
  t5534: fix misleading grep invocation
2017-07-06 18:14:46 -07:00
8f58a34cad Merge branch 'js/fsck-name-object'
Test fix.

* js/fsck-name-object:
  t1450: use egrep for regexp "alternation"
2017-07-06 18:14:43 -07:00
496f256989 cygwin: allow pushing to UNC paths
cygwin can use an UNC path like //server/share/repo

 $ cd //server/share/dir
 $ mkdir test
 $ cd test
 $ git init --bare

 However, when we try to push from a local Git repository to this repo,
 there is a problem: Git converts the leading "//" into a single "/".

 As cygwin handles an UNC path so well, Git can support them better:

 - Introduce cygwin_offset_1st_component() which keeps the leading "//",
   similar to what Git for Windows does.

 - Move CYGWIN out of the POSIX in the tests for path normalization in t0060

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05 14:01:03 -07:00
6815d11431 t/helper/test-hashmap: use custom data instead of duplicate cmp functions
With the new field that is passed to the compare function, we can pass
through flags there instead of having multiple compare functions.
Also drop the cast to hashmap_cmp_fn.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05 13:53:12 -07:00
8e90578ffb Merge branch 'cc/shared-index-permfix'
The split index code did not honor core.sharedrepository setting
correctly.

* cc/shared-index-permfix:
  t1700: make sure split-index respects core.sharedrepository
  t1301: move modebits() to test-lib-functions.sh
  read-cache: use shared perms when writing shared index
2017-07-05 13:32:57 -07:00
5ab148dda0 Merge branch 'rs/sha1-name-readdir-optim'
Optimize "what are the object names already taken in an alternate
object database?" query that is used to derive the length of prefix
an object name is uniquely abbreviated to.

* rs/sha1-name-readdir-optim:
  sha1_file: guard against invalid loose subdirectory numbers
  sha1_file: let for_each_file_in_obj_subdir() handle subdir names
  p4205: add perf test script for pretty log formats
  sha1_name: cache readdir(3) results in find_short_object_filename()
2017-07-05 13:32:56 -07:00
85ce4a6828 Merge branch 'bw/repo-object'
Introduce a "repository" object to eventually make it easier to
work in multiple repositories (the primary focus is to work with
the superproject and its submodules) in a single process.

* bw/repo-object:
  ls-files: use repository object
  repository: enable initialization of submodules
  submodule: convert is_submodule_initialized to work on a repository
  submodule: add repo_read_gitmodules
  submodule-config: store the_submodule_cache in the_repository
  repository: add index_state to struct repo
  config: read config from a repository object
  path: add repo_worktree_path and strbuf_repo_worktree_path
  path: add repo_git_path and strbuf_repo_git_path
  path: worktree_git_path() should not use file relocation
  path: convert do_git_path to take a 'struct repository'
  path: convert strbuf_git_common_path to take a 'struct repository'
  path: always pass in commondir to update_common_dir
  path: create path.h
  environment: store worktree in the_repository
  environment: place key repository state in the_repository
  repository: introduce the repository object
  environment: remove namespace_len variable
  setup: add comment indicating a hack
  setup: don't perform lazy initialization of repository state
2017-07-05 13:32:56 -07:00
2272d3e542 reflog-walk: skip over double-null oid due to HEAD rename
Since 39ee4c6c2f (branch: record creation of renamed branch
in HEAD's log, 2017-02-20), a rename on the currently
checked out branch will create two entries in the HEAD
reflog: one where the branch goes away (switching to the
null oid), and one where it comes back (switching away from
the null oid).

This confuses the reflog-walk code. When walking backwards,
it first sees the null oid in the "old" field of the second
entry. Thanks to the "root commit" logic added by 71abeb753f
(reflog: continue walking the reflog past root commits,
2016-06-03), we keep looking for the next entry by scanning
the "new" field from the previous entry. But that field is
also null! We need to go just a tiny bit further, and look
at its "old" field. But with the current code, we decide the
reflog has nothing else to show and just give up. To the
user this looks like the reflog was truncated by the rename
operation, when in fact those entries are still there.

This patch does the absolute minimal fix, which is to look
back that one extra level and keep traversing.

The resulting behavior may not be the _best_ thing to do in
the long run (for example, we show both reflog entries each
with the same commit id), but it's a simple way to fix the
problem without risking further regressions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05 10:34:00 -07:00
8722947e5c t5534: fix misleading grep invocation
It seems to be a little-known feature of `grep` (and it certainly came
as a surprise to this here developer who believed to know the Unix tools
pretty well) that multiple patterns can be passed in the same
command-line argument simply by separating them by newlines. Watch, and
learn:

	$ printf '1\n2\n3\n' | grep "$(printf '1\n3\n')"
	1
	3

That behavior also extends to patterns passed via `-e`, and it is not
modified by passing the option `-E` (but trying this with -P issues the
error "grep: the -P option only supports a single pattern").

It seems that there are more old Unix hands who are surprised by this
behavior, as grep invocations of the form

	grep "$(git rev-parse A B) C" file

were introduced in a85b377d04 (push: the beginning of "git push
--signed", 2014-09-12), and later faithfully copy-edited in b9459019bb
(push: heed user.signingkey for signed pushes, 2014-10-22).

Please note that the output of `git rev-parse A B` separates the object
IDs via *newlines*, not via spaces, and those newlines are preserved
because the interpolation is enclosed in double quotes.

As a consequence, these tests try to validate that the file contains
either A's object ID, or B's object ID followed by C, or both. Clearly,
however, what the test wanted to see is that there is a line that
contains all of them.

This is clearly unintended, and the grep invocations in question really
match too many lines.

Fix the test by avoiding the newlines in the patterns.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05 09:26:52 -07:00
9308b7f3ca read_packed_refs(): die if packed-refs contains bogus data
The old code ignored any lines that it didn't understand, including
unterminated lines. This is dangerous. Instead, `die()` if the
`packed-refs` file contains any unterminated lines or lines that we
don't know how to handle.

This fixes the tests added in the last commit.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:01:57 -07:00
02a1a42056 t3210: add some tests of bogus packed-refs file contents
If `packed-refs` contains indecipherable lines, we should emit an
error and quit rather than just skipping the lines. Unfortunately, we
currently do the latter. Add some failing tests demonstrating the
problem.

This will be fixed in the next commit.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:01:57 -07:00
86b452e276 diff.c: add dimming to moved line detection
Any lines inside a moved block of code are not interesting. Boundaries
of blocks are only interesting if they are next to another block of moved
code.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:59:42 -07:00
176841f0c9 diff.c: color moved lines differently, plain mode
Add the 'plain' mode for move detection of code. This omits the checking
for adjacent blocks, so it is not as useful. If you have a lot of the
same blocks moved in the same patch, the 'Zebra' would end up slow as it
is O(n^2) (n is number of same blocks). So this may be useful there and
is generally easy to add. Instead be very literal at the move detection,
do not skip over short blocks here.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:59:42 -07:00
2e2d5ac184 diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OM]  -        if (!is_authorized_user())
    [OM]  -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OM]  -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NM]  +        sensitive_stuff(spanning,
    [NM]  +                        multiple,
    [NM]  +                        lines);
    [NM]  +}

However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OMA] -        if (!is_authorized_user())
    [OMA] -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OMA] -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NMA] +        sensitive_stuff(spanning,
    [NMA] +                        multiple,
    [NMA] +                        lines);
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NMA] +}

If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.

This patch implements the first mode:
* basic alternating 'Zebra' mode
  This conveys all information needed to the user.  Defer customization to
  later patches.

First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').

Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.

A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.

It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.

While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.

For now let's have a simple solution of hardcoding the number of
moved lines to be at least 3 before coloring them. Note, that the
length is applied across all blocks to find the 'lonely' blocks
that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:59:42 -07:00
2841e8f81c convert: add "status=delayed" to filter process protocol
Some `clean` / `smudge` filters may require a significant amount of
time to process a single blob (e.g. the Git LFS smudge filter might
perform network requests). During this process the Git checkout
operation is blocked and Git needs to wait until the filter is done to
continue with the checkout.

Teach the filter process protocol, introduced in edcc8581 ("convert: add
filter.<driver>.process option", 2016-10-16), to accept the status
"delayed" as response to a filter request. Upon this response Git
continues with the checkout operation. After the checkout operation Git
calls "finish_delayed_checkout" which queries the filter for remaining
blobs. If the filter is still working on the completion, then the filter
is expected to block. If the filter has completed all remaining blobs
then an empty response is expected.

Git has a multiple code paths that checkout a blob. Support delayed
checkouts only in `clone` (in unpack-trees.c) and `checkout` operations
for now. The optimization is most effective in these code paths as all
files of the tree are processed.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:50:41 -07:00
748cffc22b Merge branch 'vs/typofixes'
Many typofixes.

* vs/typofixes:
  Spelling fixes
2017-06-30 13:45:25 -07:00
53ee6b8f1a Merge branch 'rs/apply-validate-input'
Tighten error checks for invalid "git apply" input.

* rs/apply-validate-input:
  apply: check git diffs for mutually exclusive header lines
  apply: check git diffs for invalid file modes
  apply: check git diffs for missing old filenames
2017-06-30 13:45:24 -07:00
7e46f19a10 Merge branch 'ks/status-initial-commit'
"git status" has long shown essentially the same message as "git
commit"; the message it gives while preparing for the root commit,
i.e. "Initial commit", was hard to understand for some new users.
Now it says "No commits yet" to stress more on the current status
(rather than the commit the user is preparing for, which is more in
line with the focus of "git commit").

* ks/status-initial-commit:
  status: contextually notify user about an initial commit
2017-06-30 13:45:22 -07:00
5452224710 Merge branch 'pw/rebase-i-regression-fix-tests'
Fix a recent regression to "git rebase -i" and add tests that would
have caught it and others.

* pw/rebase-i-regression-fix-tests:
  t3420: fix under GETTEXT_POISON build
  rebase: add more regression tests for console output
  rebase: add regression tests for console output
  rebase -i: add test for reflog message
  sequencer: print autostash messages to stderr
2017-06-30 13:45:21 -07:00
7663cdc86c hashmap.h: compare function has access to a data field
When using the hashmap a common need is to have access to caller provided
data in the compare function. A couple of times we abuse the keydata field
to pass in the data needed. This happens for example in patch-ids.c.

This patch changes the function signature of the compare function
to have one more void pointer available. The pointer given for each
invocation of the compare function must be defined in the init function
of the hashmap and is just passed through.

Documentation of this new feature is deferred to a later patch.
This is a rather mechanical conversion, just adding the new pass-through
parameter.  However while at it improve the naming of the fields of all
compare functions used by hashmaps by ensuring unused parameters are
prefixed with 'unused_' and naming the parameters what they are (instead
of 'unused' make it 'unused_keydata').

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 12:49:28 -07:00
3f9c637ec7 t9700: add tests for Git::unquote_path()
Check that unquote_path() handles spaces and escape sequences
properly.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 08:05:15 -07:00
b3cf1b7789 commit-template: distinguish status information unconditionally
The commit template adds the status information without
adding a new line to distinguish them in the absence
of optional parts. This results in difficulty in interpreting
it's content, specifically for inexperienced users.

Unconditionally, add new lines to separate the status message
from the other parts of the commit-template to make it more
readable.

Signed-off-by: Kaartic Sivaraam <kaarticsivaraam91196@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 07:50:21 -07:00
1757333410 t0021: write "OUT <size>" only on success
"rot13-filter.pl" always writes "OUT <size>" to the debug log at the end
of a response.

This works perfectly for the existing responses "abort", "error", and
"success". A new response "delayed", that will be introduced in a
subsequent patch, accepts the input without giving the filtered result
right away. At this point we cannot know the size of the response.
Therefore, we do not write "OUT <size>" for "delayed" responses.

To simplify the code we do not write "OUT <size>" for "abort" and
"error" responses either as their size is always zero.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-29 11:23:47 -07:00
73fc2aadc7 t1450: use egrep for regexp "alternation"
GNU grep allows "\(A\|B\)" as alternation in BRE, but this is an
extension not understood by some other implementations of grep
(Michael Kebe reported an breakage on Solaris).

Rewrite the offending test to ERE and use egrep instead.

Noticed-by: Michael Kebe <michael.kebe@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-28 10:17:50 -07:00
d70e9c5c8c apply: check git diffs for mutually exclusive header lines
A file can either be added, removed, copied, or renamed, but no two of
these actions can be done by the same patch.  Some of these combinations
provoke error messages due to missing file names, and some are only
caught by an assertion.  Check git patches already as they are parsed
and report conflicting lines on sight.

Found by Vegard Nossum using AFL.

Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-27 14:41:10 -07:00
44e5471a8d apply: check git diffs for invalid file modes
An empty string as mode specification is accepted silently by git apply,
as Vegard Nossum found out using AFL.  It's interpreted as zero.  Reject
such bogus file modes, and only accept ones consisting exclusively of
octal digits.

Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-27 10:59:38 -07:00
4269974179 apply: check git diffs for missing old filenames
2c93286a (fix "git apply --index ..." not to deref NULL) added a check
for git patches missing a +++ line, preventing a segfault.  Check for
missing --- lines as well, and add a test for each case.

Found by Vegard Nossum using AFL.

Original-patch-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-27 10:58:30 -07:00
6412757514 Spelling fixes
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-27 10:35:49 -07:00
6968ca9b25 Merge branch 'ks/t7508-indent-fix'
Cosmetic update to a test.

* ks/t7508-indent-fix:
  t7508: fix a broken indentation
2017-06-26 14:09:32 -07:00