Typing ^C to pager, which usually does not kill it, killed Git and
took the pager down as a collateral damage in certain process-tree
structure. This has been fixed.
* jk/execv-dashed-external:
execv_dashed_external: wait for child on signal death
execv_dashed_external: stop exiting with negative code
execv_dashed_external: use child_process struct
Code cleanup.
* js/exec-path-coverity-workaround:
git_exec_path: do not return the result of getenv()
git_exec_path: avoid Coverity warning about unfree()d result
Tighten a test to avoid mistaking an extended ERE regexp engine as
a PRE regexp engine.
* jk/grep-e-could-be-extended-beyond-posix:
t7810: avoid assumption about invalid regex syntax
"git <cmd> @{push}" on a detached HEAD used to segfault; it has
been corrected to error out with a message.
* km/branch-get-push-while-detached:
branch_get_push: do not segfault when HEAD is detached
"git rebase -i" with a recent update started showing an incorrect
count when squashing more than 10 commits.
* jk/rebase-i-squash-count-fix:
rebase--interactive: count squash commits above 10 correctly
"git blame --porcelain" misidentified the "previous" <commit, path>
pair (aka "source") when contents came from two or more files.
* jk/blame-fixes:
blame: output porcelain "previous" header for each file
blame: handle --no-abbrev
blame: fix alignment with --abbrev=40
"git archive" did not read the standard configuration files, and
failed to notice a file that is marked as binary via the userdiff
driver configuration.
* jk/archive-zip-userdiff-config:
archive-zip: load userdiff config
It is natural that "git gc --auto" may not attempt to pack
everything into a single pack, and there is no point in warning
when the user has configured the system to use the pack bitmap,
leading to disabling further "gc".
* dt/disable-bitmap-in-auto-gc:
repack: die on incremental + write-bitmap-index
auto gc: don't write bitmaps for incremental repacks
Leakage of lockfiles in the config subsystem has been fixed.
* nd/config-misc-fixes:
config.c: handle lock file in error case in git_config_rename_...
config.c: rename label unlock_and_out
config.c: handle error case for fstat() calls
Recent update to the default abbreviation length that auto-scales
lacked documentation update, which has been corrected.
* jc/abbrev-autoscale-config:
config.abbrev: document the new default that auto-scales
"git fast-import" sometimes mishandled while rebalancing notes
tree, which has been fixed.
* mh/fast-import-notes-fix-new:
fast-import: properly fanout notes when tree is imported
Compression setting for producing packfiles were spread across
three codepaths, one of which did not honor any configuration.
Unify these so that all of them honor core.compression and
pack.compression variables the same way.
* jc/compression-config:
compression: unify pack.compression configuration parsing
Update the procedure to generate "tags" for developer support.
* jk/make-tags-find-sources-tweak:
Makefile: exclude contrib from FIND_SOURCE_FILES
Makefile: match shell scripts in FIND_SOURCE_FILES
Makefile: exclude test cruft from FIND_SOURCE_FILES
Makefile: reformat FIND_SOURCE_FILES
Some platforms no longer understand "latin-1" that is still seen in
the wild in e-mail headers; replace them with "iso-8859-1" that is
more widely known when conversion fails from/to it.
* jc/latin-1:
utf8: accept "latin-1" as ISO-8859-1
utf8: refactor code to decide fallback encoding
The `perforce` and `perforce-server` package were moved from brew [1][2]
to cask [3]. Teach TravisCI the new location.
Perforce updates their binaries without version bumps. That made the
brew install (legitimately!) fail due to checksum mismatches. The
workaround is not necessary anymore as Cask [4] allows to disable the
checksum test for individual formulas.
[1] 1394e42de0
[2] f8da22d6b8
[3] https://github.com/caskroom/homebrew-cask/pull/29180
[4] https://caskroom.github.io/
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When looking for documentation for a specific function, you may be tempted
to run
git -C Documentation grep index_name_pos
only to find the file technical/api-in-core-index.txt, which doesn't
help for understanding the given function. It would be better to not find
these functions in the documentation, such that people directly dive into
the code instead.
In the previous patches we have documented
* index_name_pos()
* remove_index_entry_at()
* add_[file_]to_index()
in cache.h
We already have documentation for:
* add_index_entry()
* read_index()
Which leaves us with a TODO for:
* cache -> the_index macros
* refresh_index()
* discard_index()
* ie_match_stat() and ie_modified(); how they are different and when to
use which.
* write_index() that was renamed to write_locked_index
* cache_tree_invalidate_path()
* cache_tree_update()
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Do this by moving the existing documentation from
read-cache.c to cache.h.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The version of the "replace isatty() hack" that got merged a few
weeks ago did not actually reflect the latest iteration of the patch
series: v3 was sent out with these changes, as requested by the
reviewer Johannes Sixt:
- reworded the comment about "recycling handles"
- moved the reassignment of the `console` variable before the dup2()
call so that it is valid at all times
- removed the "handle = INVALID_HANDLE_VALUE" assignment, as the local
variable `handle` is not used afterwards anyway
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code cleanup to avoid using redundant refspecs while fetching with
the --tags option.
* jt/fetch-no-redundant-tag-fetch-map:
fetch: do not redundantly calculate tag refmap
* kh/tutorial-grammofix:
doc: omit needless "for"
doc: make the intent of sentence clearer
doc: add verb in front of command to run
doc: add articles (grammar)
When the http server gives an incomplete response to a smart-http
rpc call, it could lead to client waiting for a full response that
will never come. Teach the client side to notice this condition
and abort the transfer.
An improvement counterproposal has failed.
cf. <20161114194049.mktpsvgdhex2f4zv@sigill.intra.peff.net>
* dt/smart-http-detect-server-going-away:
upload-pack: optionally allow fetching any sha1
remote-curl: don't hang when a server dies before any output
A potential but unlikely buffer overflow in Windows port has been
fixed.
* mk/mingw-winansi-ttyname-termination-fix:
mingw: consider that UNICODE_STRING::Length counts bytes
"git p4" that tracks multile p4 paths imported a single changelist
that touches files in these multiple paths as one commit, followed
by many empty commits. This has been fixed.
* gv/p4-multi-path-commit-fix:
git-p4: fix multi-path changelist empty commits
Even though an fix was attempted in Git 2.9.3 days, but running
"git difftool --dir-diff" from a subdirectory never worked. This
has been fixed.
* jk/difftool-in-subdir:
difftool: rename variables for consistency
difftool: chdir as early as possible
difftool: sanitize $workdir as early as possible
difftool: fix dir-diff index creation when in a subdirectory
A lazy "git push" without refspec did not internally use a fully
specified refspec to perform 'current', 'simple', or 'upstream'
push, causing unnecessary "ambiguous ref" errors.
* jc/push-default-explicit:
push: test pushing ambiguously named branches
push: do not use potentially ambiguous default refspec
"git index-pack --stdin" needs an access to an existing repository,
but "git index-pack file.pack" to generate an .idx file that
corresponds to a packfile does not.
* jk/index-pack-wo-repo-from-stdin:
index-pack: skip collision check when not in repository
t: use nongit() function where applicable
index-pack: complain when --stdin is used outside of a repo
t5000: extract nongit function to test-lib-functions.sh
The function usage_msg_opt() has been updated to say "fatal:"
before the custom message programs give, when they want to die
with a message about wrong command line options followed by the
standard usage string.
* jk/parseopt-usage-msg-opt:
parse-options: print "fatal:" before usage_msg_opt()
A recent update to receive-pack to make it easier to drop garbage
objects made it clear that GIT_ALTERNATE_OBJECT_DIRECTORIES cannot
have a pathname with a colon in it (no surprise!), and this in turn
made it impossible to push into a repository at such a path. This
has been fixed by introducing a quoting mechanism used when
appending such a path to the colon-separated list.
* jk/quote-env-path-list-component:
t5615-alternate-env: double-quotes in file names do not work on Windows
t5547-push-quarantine: run the path separator test on Windows, too
tmp-objdir: quote paths we add to alternates
alternates: accept double-quoted paths
Unlike "git am --abort", "git cherry-pick --abort" moved HEAD back
to where cherry-pick started while picking multiple changes, when
the cherry-pick stopped to ask for help from the user, and the user
did "git reset --hard" to a different commit in order to re-attempt
the operation.
* sb/sequencer-abort-safety:
Revert "sequencer: remove useless get_dir() function"
sequencer: remove useless get_dir() function
sequencer: make sequencer abort safer
t3510: test that cherry-pick --abort does not unsafely change HEAD
am: change safe_to_abort()'s not rewinding error into a warning
am: fix filename in safe_to_abort() error message
The way to specify hotkeys to "xxdiff" that is used by "git
mergetool" has been modernized to match recent versions of xxdiff.
* da/mergetool-xxdiff-hotkey:
mergetools: fix xxdiff hotkeys
"git pull --rebase", when there is no new commits on our side since
we forked from the upstream, should be able to fast-forward without
invoking "git rebase", but it didn't.
* jc/pull-rebase-ff:
pull: fast-forward "pull --rebase=true"
A pathname that begins with "//" or "\\" on Windows is special but
path normalization logic was unaware of it.
* js/normalize-path-copy-ceil:
normalize_path_copy(): fix pushing to //server/share/dir on Windows
"git commit --allow-empty --only" (no pathspec) with dirty index
ought to be an acceptable way to create a new commit that does not
change any paths, but it was forbidden, perhaps because nobody
needed it so far.
* ak/commit-only-allow-empty:
commit: remove 'Clever' message for --only --amend
commit: make --only --allow-empty work without paths
"git difftool --dir-diff" had a minor regression when started from
a subdirectory, which has been fixed.
* da/difftool-dir-diff-fix:
difftool: fix dir-diff index creation when in a subdirectory
When diff.renames configuration is on (and with Git 2.9 and later,
it is enabled by default, which made it worse), "git stash"
misbehaved if a file is removed and another file with a very
similar content is added.
* jk/stash-disable-renames-internally:
stash: prefer plumbing over git-diff
Update the error messages from the dumb-http client when it fails
to obtain loose objects; we used to give sensible error message
only upon 404 but we now forbid unexpected redirects that needs to
be reported with something sensible.
* jk/http-walker-limit-redirect:
http-walker: complain about non-404 loose object errors
http: treat http-alternates like redirects
http: make redirects more obvious
remote-curl: rename shadowed options variable
http: always update the base URL for redirects
http: simplify update_url_from_redirect
Fix a corner case in merge-recursive regression that crept in
during 2.10 development cycle.
* jc/renormalize-merge-kill-safer-crlf:
convert: git cherry-pick -Xrenormalize did not work
merge-recursive: handle NULL in add_cacheinfo() correctly
cherry-pick: demonstrate a segmentation fault
"git p4" LFS support was broken when LFS stores an empty blob.
* ls/p4-empty-file-on-lfs:
git-p4: fix empty file processing for large file system backend GitLFS
mergetool.<tool>.trustExitCode configuration variable did not apply
to built-in tools, but now it does.
* da/mergetool-trust-exit-code:
mergetools/vimdiff: trust Vim's exit code
mergetool: honor mergetool.$tool.trustExitCode for built-in tools
The output from "git worktree list" was made in readdir() order,
and was unstable.
* nd/worktree-list-fixup:
worktree list: keep the list sorted
worktree.c: get_worktrees() takes a new flag argument
get_worktrees() must return main worktree as first item even on error
worktree: reorder an if statement
worktree.c: zero new 'struct worktree' on allocation
"git push --dry-run --recurse-submodule=on-demand" wasn't
"--dry-run" in the submodules.
* bw/push-dry-run:
push: fix --dry-run to not push submodules
push: --dry-run updates submodules when --recurse-submodules=on-demand
The code in "git push" to compute if any commit being pushed in the
superproject binds a commit in a submodule that hasn't been pushed
out was overly inefficient, making it unusable even for a small
project that does not have any submodule but have a reasonable
number of refs.
* hv/submodule-not-yet-pushed-fix:
submodule_needs_pushing(): explain the behaviour when we cannot answer
batch check whether submodule needs pushing into one call
serialize collection of refs that contain submodule changes
serialize collection of changed submodules
An empty directory in a working tree that can simply be nuked used
to interfere while merging or cherry-picking a change to create a
submodule directory there, which has been fixed..
* dt/empty-submodule-in-merge:
submodules: allow empty working-tree dirs in merge/cherry-pick
"git rev-parse --symbolic" failed with a more recent notation like
"HEAD^-1" and "HEAD^!".
* jk/rev-parse-symbolic-parents-fix:
rev-parse: fix parent shorthands with --symbolic
Update the isatty() emulation for Windows by updating the previous
hack that depended on internals of (older) MSVC runtime.
* js/mingw-isatty:
mingw: replace isatty() hack
mingw: fix colourization on Cygwin pseudo terminals
mingw: adjust is_console() to work with stdin
mingw: intercept isatty() to handle /dev/null as Git expects it
The character width table has been updated to match Unicode 9.0
* bb/unicode-9.0:
unicode_width.h: update the width tables to Unicode 9.0
update_unicode.sh: remove the plane filter
update_unicode.sh: automatically download newer definition files
update_unicode.sh: pin the uniset repo to a known good commit
update_unicode.sh: remove an unnecessary subshell level
update_unicode.sh: move it into contrib/update-unicode
The default Travis-CI configuration specifies newer P4 and GitLFS.
* ls/travis-update-p4-and-lfs:
travis-ci: update P4 to 16.2 and GitLFS to 1.5.2 in Linux build
There are some "gray areas" around when to omit braces from
a conditional or loop body. Since that seems to have
resulted in some arguments, let's be a little more clear
about our preferred style.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
request-pull uses OPTIONS_SPEC, so no need for (meanwhile incomplete)
USAGE and LONG_USAGE anymore.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following part of the description:
git bisect (bad|new) [<rev>]
git bisect (good|old) [<rev>...]
may be a bit confusing, as a reader may wonder if instead it should be:
git bisect (bad|good) [<rev>]
git bisect (old|new) [<rev>...]
Of course the difference between "[<rev>]" and "[<rev>...]" should hint
that there is a good reason for the way it is.
But we can further clarify and complete the description by adding
"<term-new>" and "<term-old>" to the "bad|new" and "good|old"
alternatives.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few of the tests want to check that "git grep -P -E" will
override -P with -E, and vice versa. To do so, we use a
regex with "\x{..}", which is valid in PCRE but not defined
by POSIX (for basic or extended regular expressions).
However, POSIX declares quite a lot of syntax, including
"\x", as "undefined". That leaves implementations free to
extend the standard if they choose. At least one, musl libc,
implements "\x" in the same way as PCRE. Our tests check
that "-E" complains about "\x", which fails with musl.
We can fix this by finding some construct which behaves
reliably on both PCRE and POSIX, but differently in each
system.
One such construct is the use of backslash inside brackets.
In PCRE, "[\d]" interprets "\d" as it would outside the
brackets, matching a digit. Whereas in POSIX, the backslash
must be treated literally, and we match either it or a
literal "d". Moreover, implementations are not free to
change this according to POSIX, so we should be able to rely
on it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you hit ^C to interrupt a git command going to a pager,
this usually leaves the pager running. But when a dashed
external is in use, the pager ends up in a funny state and
quits (but only after eating one more character from the
terminal!). This fixes it.
Explaining the reason will require a little background.
When git runs a pager, it's important for the git process to
hang around and wait for the pager to finish, even though it
has no more data to feed it. This is because git spawns the
pager as a child, and thus the git process is the session
leader on the terminal. After it dies, the pager will finish
its current read from the terminal (eating the one
character), and then get EIO trying to read again.
When you hit ^C, that sends SIGINT to git and to the pager,
and it's a similar situation. The pager ignores it, but the
git process needs to hang around until the pager is done. We
addressed that long ago in a3da882120 (pager: do
wait_for_pager on signal death, 2009-01-22).
But when you have a dashed external (or an alias pointing to
a builtin, which will re-exec git for the builtin), there's
an extra process in the mix. For instance, running:
$ git -c alias.l=log l
will end up with a process tree like:
git (parent)
\
git-log (child)
\
less (pager)
If you hit ^C, SIGINT goes to all of them. The pager ignores
it, and the child git process will end up in wait_for_pager().
But the parent git process will die, and the usual EIO
trouble happens.
So we really want the parent git process to wait_for_pager(),
but of course it doesn't know anything about the pager at
all, since it was started by the child. However, we can
have it wait on the git-log child, which in turn is waiting
on the pager. And that's what this patch does.
There are a few design decisions here worth explaining:
1. The new feature is attached to run-command's
clean_on_exit feature. Partly this is convenience,
since that feature already has a signal handler that
deals with child cleanup.
But it's also a meaningful connection. The main reason
that dashed externals use clean_on_exit is to bind the
two processes together. If somebody kills the parent
with a signal, we propagate that to the child (in this
instance with SIGINT, we do propagate but it doesn't
matter because the original signal went to the whole
process group). Likewise, we do not want the parent
to go away until the child has done so.
In a traditional Unix world, we'd probably accomplish
this binding by just having the parent execve() the
child directly. But since that doesn't work on Windows,
everything goes through run_command's more spawn-like
interface.
2. We do _not_ automatically waitpid() on any
clean_on_exit children. For dashed externals this makes
sense; we know that the parent is doing nothing but
waiting for the child to exit anyway. But with other
children, it's possible that the child, after getting
the signal, could be waiting on the parent to do
something (like closing a descriptor). If we were to
wait on such a child, we'd end up in a deadlock. So
this errs on the side of caution, and lets callers
enable the feature explicitly.
3. When we send children the cleanup signal, we send all
the signals first, before waiting on any children. This
is to avoid the case where one child might be waiting
on another one to exit, causing a deadlock. We inform
all of them that it's time to die before reaping any.
In practice, there is only ever one dashed external run
from a given process, so this doesn't matter much now.
But it future-proofs us if other callers start using
the wait_after_clean mechanism.
There's no automated test here, because it would end up racy
and unportable. But it's easy to reproduce the situation by
running the log command given above and hitting ^C.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we try to exec a git sub-command, we pass along the
status code from run_command(). But that may return -1 if we
ran into an error with pipe() or execve(). This tends to
work (and end up as 255 due to twos-complement wraparound
and truncation), but in general it's probably a good idea to
avoid negative exit codes for portability.
We can easily translate to the normal generic "128" code we
get when syscalls cause us to die.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we run a dashed external, we use the one-liner
run_command_v_opt() to do so. Let's switch to using a
child_process struct, which has two advantages:
1. We can drop all of the allocation and cleanup code for
building our custom argv array, and just rely on the
builtin argv_array (at the minor cost of doing a few
extra mallocs).
2. We have access to the complete range of child_process
options, not just the ones that the "_opt()" form can
forward.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The result of getenv() is not guaranteed by POSIX to last
beyond another call to getenv(), or setenv(), etc. We
should duplicate the string before returning to the caller
to avoid any surprises.
We already keep a cached pointer to avoid repeatedly leaking
the result of system_path(). We can use the same pointer
here to avoid allocating and leaking for each call.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Technically, it is correct that git_exec_path() returns a possibly
malloc()ed string returned from system_path(), and it is sometimes
not allocated. Cache the result in a static variable and make sure
that we call system_path() only once, which plugs a potential leak.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's possible for content currently found in one file to
have originated in two separate files, each of which may
have been modified in some single older commit. The
--porcelain output generates an incorrect "previous" header
in this case, whereas --line-porcelain gets it right. The
problem is that the porcelain output tries to omit repeated
details of commits, and treats "previous" as a property of
the commit, when it is really a property of the blamed block
of lines.
Let's look at an example. In a case like this, you might see
this output from --line-porcelain:
SOME_SHA1 1 1 1
author ...
committer ...
previous SOME_SHA1^ file_one
filename file_one
...some line content...
SOME_SHA1 2 1 1
author ...
committer ...
previous SOME_SHA1^ file_two
filename file_two
...some different content....
The "filename" fields tell us that the two lines are from
two different files. But notice that the filename also
appears in the "previous" field, which tells us where to
start a re-blame. The second content line never appeared in
file_one at all, so we would obviously need to re-blame from
file_two (or possibly even some other file, if had just been
renamed to file_two in SOME_SHA1).
So far so good. Now here's what --porcelain looks like:
SOME_SHA1 1 1 1
author ...
committer ...
previous SOME_SHA1^ file_one
filename file_one
...some line content...
SOME_SHA1 2 1 1
filename file_two
...some different content....
We've dropped the author and committer fields from the
second line, as they would just be repeats. But we can't
omit "filename", because it depends on the actual block of
blamed lines, not just the commit. This is handled by
emit_porcelain_details(), which will show the filename
either if it is the first mention of the commit _or_ if the
commit has multiple paths in it.
But we don't give "previous" the same handling. It's written
inside emit_one_suspect_detail(), which bails early if we've
already seen that commit. And so the output above is wrong;
a reader would assume that the correct place to re-blame
line two is from file_one, but that's obviously nonsense.
Let's treat "previous" the same as "filename", and show it
fresh whenever we know we are in a confusing case like this.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
You can already ask blame for full sha1s with "-l" or with
"--abbrev=40". But for consistency with other parts of Git,
we should support "--no-abbrev".
Worse, blame already accepts --no-abbrev, but it's totally
broken. When we see --no-abbrev, the abbrev variable is set
to 0, which is then used as a printf precision. For regular
sha1s, that means we print nothing at all (which is very
wrong). For boundary commits we decrement it to "-1", which
printf interprets as "no limit" (which is almost correct,
except it misses the 39-length magic explained in the
previous commit).
Let's detect --no-abbrev and behave as if --abbrev=40 was
given.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The blame command internally adds 1 to any requested sha1
abbreviation length, and then subtracts it when outputting a
boundary commit. This lets regular and boundary sha1s line
up visually, but it misses one corner case.
When the requested length is 40, we bump the value to 41.
But since we only have 40 characters, that's all we can show
(fortunately the truncation is done by a printf precision
field, so it never tries to read past the end of the
buffer). So a normal sha1 shows 40 hex characters, and a
boundary sha1 shows "^" plus 40 hex characters. The result
is misaligned.
The "-l" option to show long sha1s gets around this by
skipping the "abbrev" variable entirely and just always
using GIT_SHA1_HEXSZ. This avoids the "+1" issue, but it
does mean that boundary commits only have 39 characters
printed. This is somewhat odd, but it does look good
visually: the results are aligned and left-justified. The
alternative would be to allocate an extra column that would
contain either an extra space or the "^" boundary marker.
As this is by definition the human-readable view, it's
probably not that big a deal either way (and of course
--porcelain, etc, correctly produce correct 40-hex sha1s).
But for consistency, this patch teaches --abbrev=40 to
produce the same output as "-l" (always left-aligned, with
40-hex for normal sha1s, and "^" plus 39-hex for
boundaries).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We generate the squash commit message incrementally running
a sed script once for each commit. It parses "This is
a combination of <N> commits" from the first line of the
existing message, adds one to <N>, and uses the result as
the number of our current message.
Since f2d17068fd (i18n: rebase-interactive: mark comments of
squash for translation, 2016-06-17), the first line may be
localized, and sed uses a pretty liberal regex, looking for:
/^#.*([0-9][0-9]*)/
The "[0-9][0-9]*" tries to match double digits, but it
doesn't quite work. The first ".*" is greedy, so if you
have:
This is a combination of 10 commits.
it will eat up "This is a combination of 1", leaving "0" to
match the first "[0-9]" digit, and then skipping the
optional match of "[0-9]*".
As a result, the count resets every 10 commits, and a
15-commit squash would end up as:
# This is a combination of 5 commits.
# This is the 1st commit message:
...
# This is the commit message #2:
... and so on ..
# This is the commit message #10:
...
# This is the commit message #1:
...
# This is the commit message #2:
... etc, up to 5 ...
We can fix this by making the ".*" less greedy. Instead of
depending on ".*?" working portably, we can just limit the
match to non-digit characters, which accomplishes the same
thing.
Reported-by: Brandon Tolsch <btolsch@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the detached HEAD check from branch_get_push_1() to
branch_get_push() to avoid setting branch->push_tracking_ref when
branch is NULL.
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 4aff646d17 (archive-zip: mark text files in archives,
2015-03-05), the zip archiver will look at the userdiff
driver to decide whether a file is text or binary. This
usually doesn't need to look any further than the attributes
themselves (e.g., "-diff", etc). But if the user defines a
custom driver like "diff=foo", we need to look at
"diff.foo.binary" in the config. Prior to this patch, we
didn't actually load it.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bitmap index only works for single packs, so requesting an
incremental repack with bitmap indexes makes no sense.
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git gc --auto does an incremental repack of loose objects, we do
not expect to be able to write a bitmap; it is very likely that
objects in the new pack will have references to objects outside of the
pack. So we shouldn't try to write a bitmap, because doing so will
likely issue a warning.
This warning was making its way into gc.log. When the gc.log was
present, future auto gc runs would refuse to run.
Patch by Jeff King.
Bug report, test, and commit message by David Turner.
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We somehow forgot to update the "default is 7" in the
documentation. Also give a way to explicitly ask the auto-scaling
by setting config.abbrev to "auto".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We could rely on atexit() to clean up everything, but let's be
explicit when we can. And it's good anyway because the function is
called the second time in the same process, we're in trouble.
This function should not affect the successful case because after
commit_lock_file() is called, rollback_lock_file() becomes no-op,
as long as it is initialized.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows has carried a patch that depended on internals
of MSVC runtime, but it does not work correctly with recent MSVC
runtime. A replacement was written originally for compiling
with VC++. The patch in this message is a backport of that
replacement, and it also fixes the previous attempt to make
isatty() tell that /dev/null is *not* an interactive terminal.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Tested-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git only colours the output and uses pagination if isatty() returns 1.
MSYS2 and Cygwin emulate pseudo terminals via named pipes, meaning that
isatty() returns 0.
f7f90e0f4f (mingw: make isatty() recognize MSYS2's pseudo terminals
(/dev/pty*), 2016-04-27) fixed this for MSYS2 terminals, but not for
Cygwin.
The named pipes that Cygwin and MSYS2 use are very similar. MSYS2 PTY pipes
are called 'msys-*-pty*' and Cygwin uses 'cygwin-*-pty*'. This commit
modifies the existing check to allow both MSYS2 and Cygwin PTY pipes to be
identified as TTYs.
Note that pagination is still broken when running Git for Windows from
within Cygwin, as MSYS2's less.exe is spawned (and does not like to
interact with Cygwin's PTY).
This partially fixes https://github.com/git-for-windows/git/issues/267
Signed-off-by: Alan Davies <alan.n.davies@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When determining whether a handle corresponds to a *real* Win32 Console
(as opposed to, say, a character device such as /dev/null), we use the
GetConsoleOutputBufferInfo() function as a tell-tale.
However, that does not work for *input* handles associated with a
console. Let's just use the GetConsoleMode() function for input handles,
and since it does not work on output handles fall back to the previous
method for those.
This patch prepares for using is_console() instead of my previous
misguided attempt in cbb3f3c9b1 (mingw: intercept isatty() to handle
/dev/null as Git expects it, 2016-12-11) that broke everything on
Windows.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In typical uses of fast-import, trees are inherited from a parent
commit. In that case, the tree_entry for the branch looks like:
.versions[1].sha1 = $some_sha1
.tree = <tree structure loaded from $some_sha1>
However, when trees are imported, rather than inherited, that is not the
case. One can import a tree with a filemodify command, replacing the
root tree object.
e.g.
"M 040000 $some_sha1 \n"
In this case, the tree_entry for the branch looks like:
.versions[1].sha1 = $some_sha1
.tree = NULL
When adding new notes with the notemodify command, do_change_note_fanout
is called to get a notes count, and to do so, it loops over the
tree_entry->tree, but doesn't do anything when the tree is NULL.
In the latter case above, it means do_change_note_fanout thinks the tree
contains no notes, and new notes are added with no fanout.
Interestingly, do_change_note_fanout does check whether subdirectories
have a NULL .tree, in which case it uses load_tree(). Which means the
right behaviour happens when using the filemodify command to import
subdirectories.
This change makes do_change_note_fanount call load_tree() whenever the
tree_entry it is given has no tree loaded, making all cases handled
equally.
Signed-off-by: Mike Hommey <mh@glandium.org>
Reviewed-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two ways to unlock a file: commit, or revert. Rename it to
commit_and_out to avoid confusion.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 6b4b013f18 (mailinfo: handle in-body header continuations,
2016-09-20, v2.11.0) mailinfo.c has contained new code with an
assert of the form:
assert(call_a_function(...))
The function in question, check_header, has side effects. This
means that when NDEBUG is defined during a release build the
function call is omitted entirely, the side effects do not
take place and tests (fortunately) start failing.
Since the only time that mi->inbody_header_accum is appended to is
in check_inbody_header, and appending onto a blank
mi->inbody_header_accum always happens when is_inbody_header is
true, this guarantees a prefix that causes check_header to always
return true.
Therefore replace the assert with an if !check_header + DIE
combination to reflect this.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Jeff King <peff@peff.net>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When importing from multiple perforce paths - we may attempt to
import a changelist that contains files from two (or more) of these
depot paths. Currently, this results in multiple git commits - one
containing the changes, and the other(s) as empty commit(s). This
behavior was introduced in commit 1f90a64891 ("git-p4: reduce number
of server queries for fetches", 2015-12-19).
Reproduction Steps:
1. Have a git repo cloned from a perforce repo using multiple
depot paths (e.g. //depot/foo and //depot/bar).
2. Submit a single change to the perforce repo that makes changes
in both //depot/foo and //depot/bar.
3. Run "git p4 sync" to sync the change from #2.
Change is synced as multiple commits, one for each depot path that
was affected.
Using a set, instead of a list inside p4ChangesForPaths() ensures
that each changelist is unique to the returned list, and therefore
only a single commit is generated for each changelist.
Reported-by: James Farwell <jfarwell@vmware.com>
Signed-off-by: George Vanburgh <gvanburgh@bloomberg.net>
Reviewed-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When submitting to P4, if git-p4 came across a symlinked
directory, then during the generation of the submit diff, it would
try to open it as a normal file and fail.
Spot symlinks (of any type) and output a description of the symlink
instead.
Add a test case.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t0021.15 creates files, adds them to the index, and commits them. All
this usually happens in a test run within the same second and Git cannot
know if the files have been changed between `add` and `commit`. Thus,
Git has to run the clean filter in both operations. Sometimes these
invocations spread over two different seconds and Git can infer that the
files were not changed between `add` and `commit` based on their
modification timestamp. The test would fail as it expects the filter
invocation. Remove this expectation to make the test stable.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
You can run "git index-pack path/to/foo.pack" outside of a
repository to generate an index file, or just to verify the
contents. There's no point in doing a collision check, since
we obviously do not have any objects to collide with.
The current code will blindly look in .git/objects based on
the result of setup_git_env(). That effectively gives us the
right answer (since we won't find any objects), but it's a
waste of time, and it conflicts with our desire to
eventually get rid of the "fallback to .git" behavior of
setup_git_env().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
normalize_path_copy() is not prepared to keep the double-slash of a
//server/share/dir kind of path, but treats it like a regular POSIX
style path and transforms it to /server/share/dir.
The bug manifests when 'git push //server/share/dir master' is run,
because tmp_objdir_add_as_alternate() uses the path in normalized
form when it registers the quarantine object database via
link_alt_odb_entries(). Needless to say that the directory cannot be
accessed using the wrongly normalized path.
Fix it by skipping all of the root part, not just a potential drive
prefix. offset_1st_component takes care of this, see the
implementation in compat/mingw.c::mingw_offset_1st_component().
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many tests want to run a command outside of any git repo;
with the nongit() function this is now a one-liner. It saves
a few lines, but more importantly, it's immediately obvious
what the code is trying to accomplish.
This doesn't convert every such case in the test suite; it
just covers those that want to do a one-off command. Other
cases, such as the ones in t4035, are part of a larger
scheme of outside-repo files, and it's less confusing for
them to stay consistent with the surrounding tests.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index-pack builtin is marked as RUN_SETUP_GENTLY,
because it's perfectly fine to index a pack in the
filesystem outside of any repository. However, --stdin mode
will write the result to the object database, which does not
make sense outside of a repository. Doing so creates a bogus
".git" directory with nothing in it except the newly-created
pack and its index.
Instead, let's flag this as an error and abort.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function abstracts the idea of running a command
outside of any repository (which is slightly awkward to do
because even if you make a non-repo directory, git may keep
walking up outside of the trash directory). There are
several scripts that use the same technique, so let's make
the function available for everyone.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The general status and future of gmane is unclear at this
point, but certainly it does not seem to be carrying
gmane.comp.version-control.git at all anymore. Let's point
to public-inbox.org, which seems to be the favored archive
on the list these days (and which uses message-ids in its
URLs, making the links somewhat future-proof).
Reported-by: Chiel ten Brinke <ctenbrinke@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 39784cd362.
The function had only one caller when the "remove useless" was
written, but another topic will soon make heavy use of it and more
importantly the function will return different paths depending on
the value in opts.
Programs may use usage_msg_opt() to print a brief message
followed by the program usage, and then exit. The message
isn't prefixed at all, though, so it doesn't match our usual
error output and is easy to overlook:
$ git clone 1 2 3
Too many arguments.
usage: git clone [<options>] [--] <repo> [<dir>]
-v, --verbose be more verbose
-q, --quiet be more quiet
--progress force progress reporting
-n, --no-checkout don't create a checkout
--bare create a bare repository
[...and so on for another 31 lines...]
It looks especially bad when the message starts with an
option, like:
$ git replace -e
-e needs exactly one argument
usage: git replace [-f] <object> <replacement>
or: git replace [-f] --edit <object>
[...etc...]
Let's put our usual "fatal:" prefix in front of it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you're working on the git project, you're unlikely to
care about random bits in contrib/ (e.g., you would not want
to jump to the copy of xmalloc in the wincred credential
helper). Nobody has really complained because there are
relatively few C files in contrib.
Now that we're matching shell scripts, too, we get quite a
few more hits, especially in the obsolete contrib/examples
directory. Looking for usage() should turn up the one in
git-sh-setup, not in some long-dead version of git-clone.
Let's just exclude all of contrib. Any specific projects
there which are big enough to want tags can generate them
separately.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We feed FIND_SOURCE_FILES to ctags to help developers
navigate to particular functions, but we only feed C source
code. The same feature can be helpful when working with
shell scripts (especially the test suite). Modern versions
of ctags know how to parse shell scripts; we just need to
feed the filenames to it.
This patch specifically avoids including the individual test
scripts themselves. Those are unlikely to be of interest,
and there are a lot of them to process. It does pick up
test-lib.sh and test-lib-functions.sh.
Note that our negative pathspec already excludes the
individual scripts for the ls-files case, but we need to
loosen the `find` rule to match it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test directory may contain three types of files that
match our patterns:
1. Helper programs in t/helper.
2. Sample data files (e.g., t/t4051/hello.c).
3. Untracked cruft in trash directories and t/perf/build.
We want to match (1), but not the other two, as they just
clutter up the list.
For the ls-files method, we can drop (2) with a negative
pathspec. We do not have to care about (3), since ls-files
will not list untracked files.
For `find`, we can match both cases with `-prune` patterns.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we add to this in future commits, the formatting is going
to make it harder and harder to read. Let's write it more as
we would in a shell script, putting each logical block on
its own line.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rerunning update-unicode.sh that we fixed in the previous commits
produces these new tables.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The uniset upstream has accepted my patches that eliminate the Unicode
plane offsets from the output in '--32' mode.
Remove the corresponding filter in update_unicode.sh.
This also fixes the issue that the plane offsets were not removed from
the second uniset call.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Checking just for the unicode data files' existence is not sufficient;
we should also download them if a newer version exists on the Unicode
consortium's servers. Option -N of wget does this nicely for us.
Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The uniset upstream has added more commits that for example change the
hexadecimal output in '--32' mode to decimal. Let's pin the repo to a
commit that still outputs the width tables in the format we want.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the move into contrib/update-unicode, we no longer create the
unicode directory to have a clean working folder. Instead, the directory
of the script is used. This means that the subshell can be removed.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As it's used only by a tiny minority of the Git developer population,
this script does not belong into the main Git source directory.
Move it into contrib/ and adjust the paths to account for the new
location.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To perform the test case on Windows in a way that corresponds to the
POSIX version, inject the semicolon in a directory name.
Typically, an absolute POSIX style path, such as the one in $PWD, is
translated into a Windows style path by bash when it invokes git.exe.
However, the presence of the semicolon suppresses this translation;
but the untranslated POSIX style path is useless for git.exe.
Therefore, instead of $PWD pass the Windows style path that $(pwd)
produces.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the space between redirection and file name.
Also remove unnecessary invocations of subshells, such as
(cd submod &&
echo X >untracked
) &&
as there is no point of having the shell for functional purposes.
In case of a single Git command use the `-C` option to let Git cd into
the directory.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 722ff7f87 (receive-pack: quarantine objects until
pre-receive accepts, 2016-10-03) regressed pushes to
repositories with colon (or semi-colon in Windows in them)
because it adds the repository's main object directory to
GIT_ALTERNATE_OBJECT_DIRECTORIES. The receiver interprets
the colon as a delimiter, not as part of the path, and
index-pack is unable to find objects which it needs to
resolve deltas.
The previous commit introduced a quoting mechanism for the
alternates list; let's use it here to cover this case. We'll
avoid quoting when we can, though. This alternate setup is
also used when calling hooks, so it's possible that the user
may call older git implementations which don't understand
the quoting mechanism. By quoting only when necessary, this
setup will continue to work unless the user _also_ has a
repository whose path contains the delimiter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We read lists of alternates from objects/info/alternates
files (delimited by newline), as well as from the
GIT_ALTERNATE_OBJECT_DIRECTORIES environment variable
(delimited by colon or semi-colon, depending on the
platform).
There's no mechanism for quoting the delimiters, so it's
impossible to specify an alternate path that contains a
colon in the environment, or one that contains a newline in
a file. We've lived with that restriction for ages because
both alternates and filenames with colons are relatively
rare, and it's only a problem when the two meet. But since
722ff7f87 (receive-pack: quarantine objects until
pre-receive accepts, 2016-10-03), which builds on the
alternates system, every push causes the receiver to set
GIT_ALTERNATE_OBJECT_DIRECTORIES internally.
It would be convenient to have some way to quote the
delimiter so that we can represent arbitrary paths.
The simplest thing would be an escape character before a
quoted delimiter (e.g., "\:" as a literal colon). But that
creates a backwards compatibility problem: any path which
uses that escape character is now broken, and we've just
shifted the problem. We could choose an unlikely escape
character (e.g., something from the non-printable ASCII
range), but that's awkward to use.
Instead, let's treat names as unquoted unless they begin
with a double-quote, in which case they are interpreted via
our usual C-stylke quoting rules. This also breaks
backwards-compatibility, but in a smaller way: it only
matters if your file has a double-quote as the very _first_
character in the path (whereas an escape character is a
problem anywhere in the path). It's also consistent with
many other parts of git, which accept either a bare pathname
or a double-quoted one, and the sender can choose to quote
or not as required.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Last time I checked, I was living in the UTC+01:00 time zone. UTC+02:00
would be Central European _Summer_ Time.
Signed-off-by: Luis Ressel <aranea@aixah.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We've always supported these config keys in git-svn,
so document them so users won't have to respecify them
on every invocation.
Reported-by: Juergen Kosel <juergen.kosel@gmx.de>
Signed-off-by: Eric Wong <e@80x24.org>
Blindly checking a path component for falsiness is unwise, as
"0" is false to Perl, but a valid pathname component for SVN
(or any filesystem).
Found via random code reading.
Signed-off-by: Eric Wong <e@80x24.org>
xxdiff was using a mix of "Ctrl-<key>" and "Ctrl+<key>" hotkeys.
The dashed "-" form is not accepted by newer xxdiff versions.
Use the plus "+" form only.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Always call the list of files @files.
Always call the worktree $worktree.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make difftool chdir to the top-level of the repository as soon as it can
so that we can simplify how paths are handled. Replace construction of
absolute paths via string concatenation with relative paths wherever
possible. The bulk of the code no longer needs to use absolute paths.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The double-slash fixup on the $workdir variable was being
performed just-in-time to avoid double-slashes in symlink
targets, but the rest of the code was silently using paths with
embedded "//" in them.
A recent user-reported error message contained double-slashes.
Eliminate the issue by sanitizing inputs as soon as they arrive.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
9ec26e7977 (difftool: fix argument handling in subdirs, 2016-07-18)
corrected how path arguments are handled in a subdirectory, but
it introduced a regression in how entries outside of the
subdirectory are handled by dir-diff.
When preparing the right-side of the diff we only include the
changed paths in the temporary area.
The left side of the diff is constructed from a temporary
index that is built from the same set of changed files, but it
was being constructed from within the subdirectory. This is a
problem because the indexed paths are toplevel-relative, and
thus they were not getting added to the index.
Teach difftool to chdir to the toplevel of the repository before
preparing its temporary indexes. This ensures that all of the
toplevel-relative paths are valid.
Add test cases to more thoroughly exercise this scenario.
Reported-by: Frank Becker <fb@mooflu.com>
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git's source code calls isatty(), it really asks whether the
respective file descriptor is connected to an interactive terminal.
Windows' _isatty() function, however, determines whether the file
descriptor is associated with a character device. And NUL, Windows'
equivalent of /dev/null, is a character device.
Which means that for years, Git mistakenly detected an associated
interactive terminal when being run through the test suite, which
almost always redirects stdin, stdout and stderr to /dev/null.
This bug only became obvious, and painfully so, when the new
bisect--helper entered the `pu` branch and made the automatic build & test
time out because t6030 was waiting for an answer.
For details, see
https://msdn.microsoft.com/en-us/library/f4s0ddew.aspx
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
What was intended was perhaps "... plumbing does for you" ("you" added), but
simply omitting the word "for" is more terse and gets the intended point across
just as well, if not more so.
I originally went with the approach of writing "for you", but Junio C
Hamano suggested this approach instead.
Signed-off-by: Kristoffer Haugsbakk <kristoffer.haugsbakk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By adding the word "just", which might have been accidentally omitted.
Adding the word "just" makes it clear that the point is to *not* do an
octopus merge simply because you *can* do it. In other words, you
should have a reason for doing it beyond simply having two (seemingly)
independent commits that you need to merge into another branch, since
it's not always the best approach.
The previous sentence made it look more like it was trying to say that
you shouldn't do an octopus merge *because* you can do an octopus merge.
Although this interpretation doesn't make sense and the rest of the
paragraph makes the intended meaning clear, this adjustment should make
the intent of the sentence more immediately clear to the reader.
Signed-off-by: Kristoffer Haugsbakk <kristoffer.haugsbakk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using the command 'git clone' as a verb, use "run" as the
verb indicating the action of executing the command 'git clone'.
Signed-off-by: Kristoffer Haugsbakk <kristoffer.haugsbakk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add definite and indefinite articles in three places where they were
missing.
- Use "the" in front of a directory name
- Use "the" in front of "style of cooperation"
- Use an indefinite article in front of "CVS background"
Signed-off-by: Kristoffer Haugsbakk <kristoffer.haugsbakk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is used only once, for the removal of the
directory. It is not used for the creation of the directory nor
anywhere else.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In contrast to "git am --abort", a sequencer abort did not check
whether the current HEAD is the one that is expected. This can lead
to loss of work (when not spotted and resolved using reflog before
the garbage collector chimes in).
This behavior is now changed by mimicking "git am --abort". The
abortion is done but HEAD is not changed when the current HEAD is
not the expected HEAD.
A new file "sequencer/abort-safety" is added to save the expected
HEAD.
The new behavior is only active when --abort is invoked on multiple
picks. The problem does not occur for the single-pick case because
it is handled differently.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The behavior is now documented; more importantly, rewarding the user
with a "Wow, you are clever" praise afterwards is not an effective
way to advertise the feature--at that point the user already knows.
Signed-off-by: Andreas Krey <a.krey@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two different places where the --no-abbrev option is parsed,
and two different places where SHA-1s are abbreviated. We normally parse
--no-abbrev with setup_revisions(), but in the no-index case, "git diff"
calls diff_opt_parse() directly, and diff_opt_parse() didn't handle
--no-abbrev until now. (It did handle --abbrev, however.) We normally
abbreviate SHA-1s with find_unique_abbrev(), but commit 4f03666 ("diff:
handle sha1 abbreviations outside of repository, 2016-10-20) recently
introduced a special case when you run "git diff" outside of a
repository.
setup_revisions() does also call diff_opt_parse(), but not for --abbrev
or --no-abbrev, which it handles itself. setup_revisions() sets
rev_info->abbrev, and later copies that to diff_options->abbrev. It
handles --no-abbrev by setting abbrev to zero. (This change doesn't
touch that.)
Setting abbrev to zero was broken in the outside-of-a-repository special
case, which until now resulted in a truly zero-length SHA-1, rather than
taking zero to mean do not abbreviate. The only way to trigger this bug,
however, was by running "git diff --raw" without either the --abbrev or
--no-abbrev options, because 1) without --raw it doesn't respect abbrev
(which is bizarre, but has been that way forever), 2) we silently clamp
--abbrev=0 to MINIMUM_ABBREV, and 3) --no-abbrev wasn't handled until
now.
The outside-of-a-repository case is one of three no-index cases. The
other two are when one of the files you're comparing is outside of the
repository you're in, and the --no-index option.
Signed-off-by: Jack Bates <jack@nottheoilrig.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
9ec26e7977 (difftool: fix argument handling in subdirs, 2016-07-18)
corrected how path arguments are handled in a subdirectory, but
it introduced a regression in how entries outside of the
subdirectory are handled by dir-diff.
When preparing the right-side of the diff we only include the
changed paths in the temporary area.
The left side of the diff is constructed from a temporary
index that is built from the same set of changed files, but it
was being constructed from within the subdirectory. This is a
problem because the indexed paths are toplevel-relative, and
thus they were not getting added to the index.
Teach difftool to chdir to the toplevel of the repository before
preparing its temporary indexes. This ensures that all of the
toplevel-relative paths are valid.
Add test cases to more thoroughly exercise this scenario.
Reported-by: Frank Becker <fb@mooflu.com>
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error message tells the user that something went terribly wrong
and the --abort could not be performed. But the --abort is performed,
only without rewinding. By simply changing the error into a warning,
we indicate the user that she must not try something like
"git am --abort --force", instead she just has to check the HEAD.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some context before we talk about the removed code.
This paint_down() is part of step 6 of 58babff (shallow.c: the 8 steps
to select new commits for .git/shallow - 2013-12-05). When we fetch from
a shallow repository, we need to know if one of the new/updated refs
needs new "shallow commits" in .git/shallow (because we don't have
enough history of those refs) and which one.
The question at step 6 is, what (new) shallow commits are required in
other to maintain reachability throughout the repository _without_
cutting our history short? To answer, we mark all commits reachable from
existing refs with UNINTERESTING ("rev-list --not --all"), mark shallow
commits with BOTTOM, then for each new/updated refs, walk through the
commit graph until we either hit UNINTERESTING or BOTTOM, marking the
ref on the commit as we walk.
After all the walking is done, we check the new shallow commits. If we
have not seen any new ref marked on a new shallow commit, we know all
new/updated refs are reachable using just our history and .git/shallow.
The shallow commit in question is not needed and can be thrown away.
So, the code.
The loop here (to walk through commits) is basically
1. get one commit from the queue
2. ignore if it's SEEN or UNINTERESTING
3. mark it
4. go through all the parents and..
5a. mark it if it's never marked before
5b. put it back in the queue
What we do in this patch is drop step 5a because it is not
necessary. The commit being marked at 5a is put back on the queue, and
will be marked at step 3 at the next iteration. The only case it will
not be marked is when the commit is already marked UNINTERESTING (5a
does not check this), which will be ignored at step 2.
But we don't care about refs marking on UNINTERESTING. We care about the
marking on _shallow commits_ that are not reachable from our current
history (and having UNINTERESTING on it means it's reachable). So it's
ok for an UNINTERESTING not to be ref-marked.
Reported-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
First of all, 1 << 31 is technically undefined behaviour, so let's just
use an unsigned literal.
If i is 'signed int' and gcc doesn't know that i is positive, gcc
generates code to compute the C99-mandated values of "i / 32" and "i %
32", which is a lot more complicated than simple a simple shifts/mask.
The only caller of paint_down actually passes an "unsigned int" value,
but the prototype of paint_down causes (completely well-defined)
conversion to signed int, and gcc has no way of knowing that the
converted value is non-negative. Just make the id parameter unsigned.
In update_refstatus, the change in generated code is much smaller,
presumably because gcc is smart enough to see that i starts as 0 and is
only incremented, so it is allowed (per the UD of signed overflow) to
assume that i is always non-negative. But let's just help less smart
compilers generate good code anyway.
Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The expression info->free+size is technically undefined behaviour in
exactly the case we want to test for. Moreover, the compiler is likely
to translate the expression to
(unsigned long)info->free + size > (unsigned long)info->end
where there's at least a theoretical chance that the LHS could wrap
around 0, giving a false negative.
This might as well be written using pointer subtraction avoiding these
issues.
Signed-off-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
paint_alloc() allocates a big block of memory and splits it into
smaller, fixed size, chunks of memory whenever it's called. Each chunk
contains enough bits to present all "new refs" [1] in a fetch from a
shallow repository.
We do not check if the new "big block" is smaller than the requested
memory chunk though. If it happens, we'll happily pass back a memory
region smaller than expected. Which will lead to problems eventually.
A normal fetch may add/update a dozen new refs. Let's stay on the
"reasonably extreme" side and say we need 16k refs (or bits from
paint_alloc's perspective). Each chunk of memory would be 2k, much
smaller than the memory pool (512k).
So, normally, the under-allocation situation should never happen. A bad
guy, however, could make a fetch that adds more than 4m new/updated refs
to this code which results in a memory chunk larger than pool size.
Check this case and abort.
Noticed-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
Reviewed-by: Jeff King <peff@peff.net>
[1] Details are in commit message of 58babff (shallow.c: the 8 steps to
select new commits for .git/shallow - 2013-12-05), step 6.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We need to allocate a "big" block of memory in paint_alloc(). The exact
size does not really matter. But the pool size has no relation with
commit-slab. Stop using that macro here.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
paint_alloc() is basically malloc(), tuned for allocating a fixed number
of bits on every call without worrying about freeing any individual
allocation since all will be freed at the end. It does it by allocating
a big block of memory every time it runs out of "free memory". "slab" is
a poor choice of name, at least poorer than "pool".
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating a stash, we need to look at the diff between
the working tree and HEAD, and do so using the git-diff
porcelain. Because git-diff enables porcelain config like
renames by default, this causes at least one problem. The
--name-only format will not mention the source side of a
rename, meaning we will fail to stash a deletion that is
part of a rename.
We could fix that case by passing --no-renames, but this is
a symptom of a larger problem. We should be using the
diff-index plumbing here, which does not have renames
enabled by default, and also does not respect any
potentially confusing config options.
Reported-by: Matthew Patey <matthew.patey2167@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since commit 17966c0a6 (http: avoid disconnecting on 404s
for loose objects, 2016-07-11), we turn off curl's
FAILONERROR option and instead manually deal with failing
HTTP codes.
However, the logic to do so only recognizes HTTP 404 as a
failure. This is probably the most common result, but if we
were to get another code, the curl result remains CURLE_OK,
and we treat it as success. We still end up detecting the
failure when we try to zlib-inflate the object (which will
fail), but instead of reporting the HTTP error, we just
claim that the object is corrupt.
Instead, let's catch anything in the 300's or above as an
error (300's are redirects which are not an error at the
HTTP level, but are an indication that we've explicitly
disabled redirects, so we should treat them as such; we
certainly don't have the resulting object content).
Note that we also fill in req->errorstr, which we didn't do
before. Without FAILONERROR, curl will not have filled this
in, and it will remain a blank string. This never mattered
for the 404 case, because in the logic below we hit the
"missing_target()" branch and print nothing. But for other
errors, we'd want to say _something_, if only to fill in the
blank slot in the error message.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ew/http-walker:
list: avoid incompatibility with *BSD sys/queue.h
http-walker: reduce O(n) ops with doubly-linked list
http: avoid disconnecting on 404s for loose objects
http-walker: remove unused parameter from fetch_object
The previous commit made HTTP redirects more obvious and
tightened up the default behavior. However, there's another
way for a server to ask a git client to fetch arbitrary
content: by having an http-alternates file (or a regular
alternates file, which is used as a backup).
Similar to the HTTP redirect case, a malicious server can
claim to have refs pointing at object X, return a 404 when
the client asks for X, but point to some other URL via
http-alternates, which the client will transparently fetch.
The end result is that it looks from the user's perspective
like the objects came from the malicious server, as the
other URL is not mentioned at all.
Worse, because we feed the new URL to curl ourselves, the
usual protocol restrictions do not kick in (neither curl's
default of disallowing file://, nor the protocol
whitelisting in f4113cac0 (http: limit redirection to
protocol-whitelist, 2015-09-22).
Let's apply the same rules here as we do for HTTP redirects.
Namely:
- unless http.followRedirects is set to "always", we will
not follow remote redirects from http-alternates (or
alternates) at all
- set CURLOPT_PROTOCOLS alongside CURLOPT_REDIR_PROTOCOLS
restrict ourselves to a known-safe set and respect any
user-provided whitelist.
- mention alternate object stores on stderr so that the
user is aware another source of objects may be involved
The first item may prove to be too restrictive. The most
common use of alternates is to point to another path on the
same server. While it's possible for a single-server
redirect to be an attack, it takes a fairly obscure setup
(victim and evil repository on the same host, host speaks
dumb http, and evil repository has access to edit its own
http-alternates file).
So we could make the checks more specific, and only cover
cross-server redirects. But that means parsing the URLs
ourselves, rather than letting curl handle them. This patch
goes for the simpler approach. Given that they are only used
with dumb http, http-alternates are probably pretty rare.
And there's an escape hatch: the user can allow redirects on
a specific server by setting http.<url>.followRedirects to
"always".
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We instruct curl to always follow HTTP redirects. This is
convenient, but it creates opportunities for malicious
servers to create confusing situations. For instance,
imagine Alice is a git user with access to a private
repository on Bob's server. Mallory runs her own server and
wants to access objects from Bob's repository.
Mallory may try a few tricks that involve asking Alice to
clone from her, build on top, and then push the result:
1. Mallory may simply redirect all fetch requests to Bob's
server. Git will transparently follow those redirects
and fetch Bob's history, which Alice may believe she
got from Mallory. The subsequent push seems like it is
just feeding Mallory back her own objects, but is
actually leaking Bob's objects. There is nothing in
git's output to indicate that Bob's repository was
involved at all.
The downside (for Mallory) of this attack is that Alice
will have received Bob's entire repository, and is
likely to notice that when building on top of it.
2. If Mallory happens to know the sha1 of some object X in
Bob's repository, she can instead build her own history
that references that object. She then runs a dumb http
server, and Alice's client will fetch each object
individually. When it asks for X, Mallory redirects her
to Bob's server. The end result is that Alice obtains
objects from Bob, but they may be buried deep in
history. Alice is less likely to notice.
Both of these attacks are fairly hard to pull off. There's a
social component in getting Mallory to convince Alice to
work with her. Alice may be prompted for credentials in
accessing Bob's repository (but not always, if she is using
a credential helper that caches). Attack (1) requires a
certain amount of obliviousness on Alice's part while making
a new commit. Attack (2) requires that Mallory knows a sha1
in Bob's repository, that Bob's server supports dumb http,
and that the object in question is loose on Bob's server.
But we can probably make things a bit more obvious without
any loss of functionality. This patch does two things to
that end.
First, when we encounter a whole-repo redirect during the
initial ref discovery, we now inform the user on stderr,
making attack (1) much more obvious.
Second, the decision to follow redirects is now
configurable. The truly paranoid can set the new
http.followRedirects to false to avoid any redirection
entirely. But for a more practical default, we will disallow
redirects only after the initial ref discovery. This is
enough to thwart attacks similar to (2), while still
allowing the common use of redirects at the repository
level. Since c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28) we re-root all further requests from
the redirect destination, which should generally mean that
no further redirection is necessary.
As an escape hatch, in case there really is a server that
needs to redirect individual requests, the user can set
http.followRedirects to "true" (and this can be done on a
per-server basis via http.*.followRedirects config).
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The discover_refs() function has a local "options" variable
to hold the http_get_options we pass to http_get_strbuf().
But this shadows the global "struct options" that holds our
program-level options, which cannot be accessed from this
function.
Let's give the local one a more descriptive name so we can
tell the two apart.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a malicious server redirects the initial ref
advertisement, it may be able to leak sha1s from other,
unrelated servers that the client has access to. For
example, imagine that Alice is a git user, she has access to
a private repository on a server hosted by Bob, and Mallory
runs a malicious server and wants to find out about Bob's
private repository.
Mallory asks Alice to clone an unrelated repository from her
over HTTP. When Alice's client contacts Mallory's server for
the initial ref advertisement, the server issues an HTTP
redirect for Bob's server. Alice contacts Bob's server and
gets the ref advertisement for the private repository. If
there is anything to fetch, she then follows up by asking
the server for one or more sha1 objects. But who is the
server?
If it is still Mallory's server, then Alice will leak the
existence of those sha1s to her.
Since commit c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28), the client usually rewrites the base
URL such that all further requests will go to Bob's server.
But this is done by textually matching the URL. If we were
originally looking for "http://mallory/repo.git/info/refs",
and we got pointed at "http://bob/other.git/info/refs", then
we know that the right root is "http://bob/other.git".
If the redirect appears to change more than just the root,
we punt and continue to use the original server. E.g.,
imagine the redirect adds a URL component that Bob's server
will ignore, like "http://bob/other.git/info/refs?dummy=1".
We can solve this by aborting in this case rather than
silently continuing to use Mallory's server. In addition to
protecting from sha1 leakage, it's arguably safer and more
sane to refuse a confusing redirect like that in general.
For example, part of the motivation in c93c92f30 is
avoiding accidentally sending credentials over clear http,
just to get a response that says "try again over https". So
even in a non-malicious case, we'd prefer to err on the side
of caution.
The downside is that it's possible this will break a
legitimate but complicated server-side redirection scheme.
The setup given in the newly added test does work, but it's
convoluted enough that we don't need to care about it. A
more plausible case would be a server which redirects a
request for "info/refs?service=git-upload-pack" to just
"info/refs" (because it does not do smart HTTP, and for some
reason really dislikes query parameters). Right now we
would transparently downgrade to dumb-http, but with this
patch, we'd complain (and the user would have to set
GIT_SMART_HTTP=0 to fetch).
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function looks for a common tail between what we asked
for and where we were redirected to, but it open-codes the
comparison. We can avoid some confusing subtractions by
using strip_suffix_mem().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A pathname value in a clean/smudge filter process "key=value" pair can
contain the '=' character (introduced in edcc858). Make the user aware
of this issue in the docs, add a corresponding test case, and fix the
issue in filter process value parser of the example implementation in
contrib.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove superfluous .gitignore pattern and invalid '.' in `git commit`
calls.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If git-p4 tried to store an empty file in GitLFS then it crashed while
parsing the pointer file:
oid = re.search(r'^oid \w+:(\w+)', pointerFile, re.MULTILINE).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
This happens because GitLFS does not create a pointer file for an empty
file. Teach git-p4 this behavior to fix the problem and add a test case.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update Travis-CI dependencies to the latest available versions in
Linux build.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--only is implied when paths are present, and required
them unless --amend. But with --allow-empty it should
be allowed as well - it is the only way to create an
empty commit in the presence of staged changes.
Signed-off-by: Andreas Krey <a.krey@gmx.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next line the `actual` is overwritten again, so no need to redirect
the output of checkout into that file.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Working with a repo that used to be all CRLF. At some point it
was changed to all LF, with `text=auto` in .gitattributes.
Trying to cherry-pick a commit from before the switchover fails:
$ git cherry-pick -Xrenormalize <commit>
fatal: CRLF would be replaced by LF in [path]
Commit 65237284 "unify the "auto" handling of CRLF" introduced
a regression:
Whenever crlf_action is CRLF_TEXT_XXX and not CRLF_AUTO_XXX,
SAFE_CRLF_RENORMALIZE was feed into check_safe_crlf(). This is
wrong because here everything else than SAFE_CRLF_WARN is treated as
SAFE_CRLF_FAIL.
Call check_safe_crlf() only if checksafe is SAFE_CRLF_WARN or
SAFE_CRLF_FAIL.
Reported-by: Eevee (Lexy Munroe) <eevee@veekun.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git pull --rebase" always runs "git rebase" after fetching the
commit to serve as the new base, even when the new base is a
descendant of the current HEAD, i.e. we haven't done any work.
In such a case, we can instead fast-forward to the new base without
invoking the rebase process.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
l10n-2.11.0-rnd3.1: update ru and ca translations
* tag 'l10n-2.11.0-rnd3.1' of git://github.com/git-l10n/git-po:
l10n: ru.po: update Russian translation
l10n: ca.po: update translation
The lazy prereq for MKTEMP uses "mktemp -t" to see if
mergetool's internal mktemp call will be able to run. But
unlike the call inside mergetool, we do not ever bother to
clean up the result, and the /tmp of git developers will
slowly fill up with "foo.XXXXXX" directories as they run the
test suite over and over. Let's clean up the directory
after we've verified its creation.
Note that we don't use test_when_finished here, and instead
just make rmdir part of the &&-chain. We should only remove
something that we're confident we just created. A failure in
the middle of the chain either means there's nothing to
clean up, or we are very confused and should err on the side
of caution.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow vimdiff users to signal that they do not want to use the
result of a merge by exiting with ":cquit", which tells Vim to
exit with an error code.
This is better than the current behavior because it allows users
to directly flag that the merge is bad, using a standard Vim
feature, rather than relying on a timestamp heuristic that is
unforgiving to users that save in-progress merge files.
The original behavior can be restored by configuring
mergetool.vimdiff.trustExitCode to false.
Reported-by: Dun Peal <dunpealer@gmail.com>
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Built-in merge tools contain a hard-coded assumption about
whether or not a tool's exit code can be trusted to determine
the success or failure of a merge. Tools whose exit codes are
not trusted contain calls to check_unchanged() in their
merge_cmd() functions.
A problem with this is that the trustExitCode configuration is
not honored for built-in tools.
Teach built-in tools to honor the trustExitCode configuration.
Extend run_merge_cmd() so that it is responsible for calling
check_unchanged() when a tool's exit code cannot be trusted.
Remove check_unchanged() calls from scriptlets since they are no
longer responsible for calling it.
When no configuration is present, exit_code_trustable() is
checked to see whether the exit code should be trusted.
The default implementation returns false.
Tools whose exit codes can be trusted override
exit_code_trustable() to true.
Reported-by: Dun Peal <dunpealer@gmail.com>
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge-recursive sorts a string list using a raw qsort(), where it
feeds the "items" from one struct but the "nr" and size fields from
another struct. This isn't a bug because one list is a copy of the
other, but it's unnecessarily confusing (and also caused our recent
QSORT() cleanups via coccinelle to miss this call site).
Let's use string_list_sort() instead, which is more concise and harder
to get wrong. Note that we need to adjust our comparison function,
which gets fed only the strings now, not the string_list_items. That's
OK because we don't use the "util" field as part of our sort.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
l10n-2.11.0-rnd3
* tag 'l10n-2.11.0-rnd3' of git://github.com/git-l10n/git-po:
l10n: de.po: translate 210 new messages
l10n: fix unmatched single quote in error message
It makes it easier to write tests for. But it should also be good for
the user since locating a worktree by eye would be easier once they
notice this.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is another no-op patch, in preparation for get_worktrees() to do
optional things, like sorting.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is required by git-worktree.txt, stating that the main worktree is
the first line (especially in --porcelain mode when we can't just change
behavior at will).
There's only one case when get_worktrees() may skip main worktree, when
parse_ref() fails. Update the code so that we keep first item as main
worktree and return something sensible in this case:
- In user-friendly mode, since we're not constraint by anything,
returning "(error)" should do the job (we already show "(detached
HEAD)" which is not machine-friendly). Actually errors should be
printed on stderr by parse_ref() (*)
- In plumbing mode, we do not show neither 'bare', 'detached' or
'branch ...', which is possible by the format description if I read
it right.
Careful readers may realize that when the local variable "head_ref" in
get_main_worktree() is emptied, add_head_info() will do nothing to
wt->head_sha1. But that's ok because head_sha1 is zero-ized in the
previous patch.
(*) Well, it does not. But it's supposed to be a stop gap implementation
until we can reuse refs code to parse "ref: " stuff in HEAD, from
resolve_refs_unsafe(). Now may be the time since refs refactoring is
mostly done.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is no-op. But it helps reduce diff noise in the next patch.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
1335d76e45 ("merge: avoid "safer crlf" during recording of merge
results", 2016-07-08) tried to split make_cache_entry() call made
with CE_MATCH_REFRESH into a call to make_cache_entry() without one,
followed by a call to add_cache_entry(), refresh_cache() and another
add_cache_entry() as needed. However the conversion was botched in
that it forgot that refresh_cache() can return NULL, which was
handled correctly in make_cache_entry() but in the updated code.
This fixes https://github.com/git-for-windows/git/issues/952
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In https://github.com/git-for-windows/git/issues/952, a complicated
scenario was described that leads to a segmentation fault in
cherry-pick.
It boils down to a certain code path involving a renamed file that is
dirty, for which `refresh_cache_entry()` returns `NULL`, and that
`NULL` not being handled properly.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Translate one message introduced by commit:
* 358718064b i18n: fix unmatched single quote in error message
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
"git archive" and "git mailinfo" stopped reading from local
configuration file with a recent update.
* jc/setup-cleanup-fix:
archive: read local configuration
mailinfo: read local configuration
"git rebase -i" did not work well with core.commentchar
configuration variable for two reasons, both of which have been
fixed.
* js/rebase-i-commentchar-fix:
rebase -i: handle core.commentChar=auto
stripspace: respect repository config
rebase -i: highlight problems with core.commentchar
Using a %(HEAD) placeholder in "for-each-ref --format=" option
caused the command to segfault when on an unborn branch.
* jc/for-each-ref-head-segfault-fix:
for-each-ref: do not segv with %(HEAD) on an unborn branch
This keeps things a bit simpler when we add more fields, knowing that
default values are always zero.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach push to respect the --dry-run option when configured to
recursively push submodules 'on-demand'. This is done by passing the
--dry-run flag to the child process which performs a push for a
submodules when performing a dry-run.
In order to preserve good user experience, the additional check for
unpushed submodules is skipped during a dry-run when
--recurse-submodules=on-demand. The check is skipped because the submodule
pushes were performed as dry-runs and this check would always fail as the
submodules would still need to be pushed.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds a test to illustrate how push run with --dry-run doesn't
actually perform a dry-run when push is configured to push submodules
on-demand. Instead all submodules which need to be pushed are actually
pushed to their remotes while any updates for the superproject are
performed as a dry-run. This is a bug and not the intended behaviour of
a dry-run.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since b9605bc4f2 ("config: only read .git/config from configured
repos", 2016-09-12), we do not read from ".git/config" unless we
know we are in a repository. "git archive" however didn't do the
repository discovery and instead relied on the old behaviour.
Teach the command to run a "gentle" version of repository discovery
so that local configuration variables are honoured.
[jc: stole tests from peff]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since b9605bc4f2 ("config: only read .git/config from configured
repos", 2016-09-12), we do not read from ".git/config" unless we
know we are in a repository. "git mailinfo" however didn't do the
repository discovery and instead relied on the old behaviour. This
was mostly OK because it was merely run as a helper program by other
porcelain scripts that first chdir's up to the root of the working
tree.
Teach the command to run a "gentle" version of repository discovery
so that local configuration variables like mailinfo.scissors are
honoured.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git 2.11.0-rc2 introduced one small l10n update, and this commit fixed
the affected translations all in one batch.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
In commit 1462450 ("trailer: allow non-trailers in trailer block",
2016-10-21), functionality was added (and tested [1]) to allow
non-trailer lines in trailer blocks, as long as those blocks contain at
least one Git-generated or user-configured trailer, and consists of at
least 25% trailers. The documentation was updated to mention this new
functionality, but did not mention "user-configured trailer".
Further update the documentation to also mention "user-configured
trailer".
[1] "with non-trailer lines mixed with a configured trailer" in
t/t7513-interpret-trailers.sh
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 84c9dc2 (commit: allow core.commentChar=auto for character auto
selection, 2014-05-17) extended the core.commentChar functionality to
allow for the value 'auto', it forgot that rebase -i was already taught to
handle core.commentChar, and in turn forgot to let rebase -i handle that
new value gracefully.
Reported by Taufiq Hoven.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The way "git stripspace" reads the configuration was not quite
kosher, in that the code forgot to probe for a possibly existing
repository (note: stripspace is designed to be usable outside the
repository as well). It read .git/config only when it was run from
the top-level of the working tree by accident. A recent change
b9605bc4f2 ("config: only read .git/config from configured repos",
2016-09-12) stopped reading the repository-local configuration file
".git/config" unless the repository discovery process is done, so
that .git/config is never read even when run from the top-level,
exposing the old bug more.
When rebasing interactively with a commentChar defined in the
current repository's config, the help text at the bottom of the edit
script potentially used an incorrect comment character. This was not
only funny-looking, but also resulted in tons of warnings like this
one:
Warning: the command isn't recognized in the following line
- #
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interactive rebase does not currently play well with
core.commentchar. Let's add some tests to highlight those problems
that will be fixed in the remainder of the series.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to flip between "*" and " " prefixes depending on what
branch is checked out used in --format='%(HEAD)' did not consider
that HEAD may resolve to an unborn branch and dereferenced a NULL.
This will become a lot easier to trigger as the codepath will be
used to reimplement "git branch [--list]" in the future.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It seems a little silly to do a reachabilty check in the case where we
trust the user to access absolutely everything in the repository.
Also, it's racy in a distributed system -- perhaps one server
advertises a ref, but another has since had a force-push to that ref,
and perhaps the two HTTP requests end up directed to these different
servers.
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the event that a HTTP server closes the connection after giving a
200 but before giving any packets, we don't want to hang forever
waiting for a response that will never come. Instead, we should die
immediately.
One case where this happens is when attempting to fetch a dangling
object by its object name. In this case, the server dies before
sending any data. Prior to this patch, fetch-pack would wait for
data from the server, and remote-curl would wait for fetch-pack,
causing a deadlock.
Despite this patch, there is other possible malformed input that could
cause the same deadlock (e.g. a half-finished pktline, or a pktline but
no trailing flush). There are a few possible solutions to this:
1. Allowing remote-curl to tell fetch-pack about the EOF (so that
fetch-pack could know that no more data is coming until it says
something else). This is tricky because an out-of-band signal would
be required, or the http response would have to be re-framed inside
another layer of pkt-line or something.
2. Make remote-curl understand some of the protocol. It turns out
that in addition to understanding pkt-line, it would need to watch for
ack/nak. This is somewhat fragile, as information about the protocol
would end up in two places. Also, pkt-lines which are already at the
length limit would need special handling.
Both of these solutions would require a fair amount of work, whereas
this hack is easy and solves at least some of the problem.
Still to do: it would be good to give a better error message
than "fatal: The remote end hung up unexpectedly".
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a submodule is being merged or cherry-picked into a working
tree that already contains a corresponding empty directory, do not
record a conflict.
One situation where this bug appears is:
- Commit 1 adds a submodule
- Commit 2 removes that submodule and re-adds it into a subdirectory
(sub1 to sub1/sub1).
- Commit 3 adds an unrelated file.
Now the user checks out commit 1 (first deinitializing the submodule),
and attempts to cherry-pick commit 3. Previously, this would fail,
because the incoming submodule sub1/sub1 would falsely conflict with
the empty sub1 directory.
This patch ignores the empty sub1 directory, fixing the bug. We only
ignore the empty directory if the object being emplaced is a
submodule, which expects an empty directory.
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In general, "git gc" may delete objects that another concurrent process
is using but hasn't created a reference to. Git has some mitigations,
but they fall short of a complete solution. Document this in the
git-gc(1) man page and add a reference from the documentation of the
gc.pruneExpire config variable.
Based on a write-up by Jeff King:
http://marc.info/?l=git&m=147922960131779&w=2
Signed-off-by: Matt McCutchen <matt@mattmccutchen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we do not have commits that are involved in the update of the
superproject in our copy of submodule, we cannot tell if the remote
end needs to acquire these commits to be able to check out the
superproject tree. Explain why we answer "no there is no need/point
in pushing from our submodule repository" in this case.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We run a command for each sha1 change in a submodule. This is
unnecessary since we can simply batch all sha1's we want to check into
one command. Lets do it so we can speedup the check when many submodule
changes are in need of checking.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are iterating over each pushed ref and want to check whether it
contains changes to submodules. Instead of immediately checking each ref
lets first collect them and then do the check for all of them in one
revision walk.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To check whether a submodule needs to be pushed we need to collect all
changed submodules. Lets collect them first and then execute the
possibly expensive test whether certain revisions are already pushed
only once per submodule.
There is further potential for optimization since we can assemble one
command and only issued that instead of one call for each remote ref in
the submodule.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The try_parent_shorthands() function shows each parent via
show_rev(). We pass the correct parent sha1, but our "name"
parameter still points at the original refname. So asking
for a regular rev-parse works fine (it prints the sha1s),
but asking for the symbolic name gives nonsense like:
$ git rev-parse --symbolic HEAD^-1
HEAD
^HEAD
which is always an empty set of commits. Asking for "^!" is
likewise broken, with the added bonus that its prints ^HEAD
for _each_ parent. And "^@" just prints HEAD repeatedly.
Arguably it would be correct to just pass NULL as the name
here, and always get the parent expressed as a sha1. The
"--symbolic" documentaton claims only "as close to the
original input as possible", and we certainly fallback to
sha1s where necessary. But it's pretty easy to generate a
symbolic name on the fly from the original.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are three codepaths that use a variable whose name is
pack_compression_level to affect how objects and deltas sent to a
packfile is compressed. Unlike zlib_compression_level that controls
the loose object compression, however, this variable was static to
each of these codepaths. Two of them read the pack.compression
configuration variable, using core.compression as the default, and
one of them also allowed overriding it from the command line.
The other codepath in bulk-checkin did not pay any attention to the
configuration.
Unify the configuration parsing to git_default_config(), where we
implement the parsing of core.loosecompression and core.compression
and make the former override the latter, by moving code to parse
pack.compression and also allow core.compression to give default to
this variable.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "SECURITY" section of the gitnamespaces(7) man page described two
ways for a client to steal data from a server that wasn't intended to be
shared. Similar attacks can be performed by a server on a client, so
adapt the section to cover both directions and add it to the
git-fetch(1), git-pull(1), and git-push(1) man pages. Also add
references to this section from the documentation of server
configuration options that attempt to control data leakage but may not
be fully effective.
Signed-off-by: Matt McCutchen <matt@mattmccutchen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An error message in fetch-pack executable that was newly marked for
translation was misspelt, which has been fixed.
* rt/fetch-pack-error-message-fix:
fetch-pack.c: correct command at the beginning of an error message
Last minute fixes to two fixups merged to 'master' recently.
* js/pwd-var-vs-pwd-cmd-fix:
t0021, t5615: use $PWD instead of $(pwd) in PATH-like shell variables
Fix for a racy false-positive test failure.
* as/merge-attr-sleep:
t6026: clarify the point of "kill $(cat sleep.pid)"
t6026: ensure that long-running script really is
Revert "t6026-merge-attr: don't fail if sleep exits early"
Revert "t6026-merge-attr: ensure that the merge driver was called"
t6026-merge-attr: ensure that the merge driver was called
t6026-merge-attr: don't fail if sleep exits early
Portability update and workaround for builds on recent Mac OS X.
* ls/macos-update:
travis-ci: disable GIT_TEST_HTTPD for macOS
Makefile: set NO_OPENSSL on macOS by default
Silence a clang warning introduced by a recently graduated topic.
* js/prepare-sequencer:
sequencer: silence -Wtautological-constant-out-of-range-compare
One error message in fetch-pack.c uses 'git fetch_pack' at the beginning
which is not a git command. Use 'git fetch-pack' instead.
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The redirection of the standard error stream to a temporary file is
a leftover cruft during debugging. Remove it.
Besides, it is reported by folks on the Windows that the test is
flaky with this redirection; somebody gets confused and this
merely-redirected-to file gets marked as delete-pending by git.exe
and makes it finish with a non-zero exit status when "git checkout"
finishes. Windows folks may want to figure that one out, but for
the purpose of this test, it shouldn't become a show-stopper.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have to use $PWD instead of $(pwd) because on Windows the latter
would add a C: style path to bash's Unix-style $PATH variable, which
becomes confused by the colon after the drive letter. ($PWD is a
Unix-style path.)
In the case of GIT_ALTERNATE_OBJECT_DIRECTORIES, bash on Windows
assembles a Unix-style path list with the colon as separators. It
converts the value to a Windows-style path list with the semicolon as
path separator when it forwards the variable to git.exe. The same
confusion happens when bash's original value is contaminated with
Windows style paths.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/fetch.c redundantly calculates refmaps for tags twice. Remove
the first calculation.
This is only a code simplification and slight performance improvement -
the result is unchanged, as the redundant refmaps are subsequently
removed by the invocation to "ref_remove_duplicates" anyway.
This was introduced in commit c5a84e9 ("fetch --tags: fetch tags *in
addition to* other stuff", 2013-10-29) when modifying the effect of the
--tags parameter to "git fetch". The refmap-for-tag calculation was
copied instead of moved.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a corner-case regression in a topic that graduated during the
v2.11 cycle.
* jk/alt-odb-cleanup:
alternates: re-allow relative paths from environment
Test portability improvements and cleanups for t0021.
* jk/filter-process-fix:
t0021: fix filehandle usage on older perl
t0021: use $PERL_PATH for rot13-filter.pl
t0021: put $TEST_ROOT in $PATH
t0021: use write_script to create rot13 shell script
Test portability improvements and optimization for an
already-graduated topic.
* ls/filter-process:
t0021: compute file size with a single process instead of a pipeline
t0021: expect more variations in the output of uniq -c
When clang compiles sequencer.c, it complains:
sequencer.c:632:14: warning: comparison of constant 2 with
expression of type 'const enum todo_command' is always
true [-Wtautological-constant-out-of-range-compare]
if (command < ARRAY_SIZE(todo_command_strings))
This is because "command" is an enum that may only have two
values (0 and 1) and the array in question has two elements.
As it turns out, clang is actually wrong here, at least
according to its own bug tracker:
https://llvm.org/bugs/show_bug.cgi?id=16154
But it's still worth working around this, as the warning is
present with -Wall, meaning we fail compilation with "make
DEVELOPER=1".
Casting the enum to size_t sufficiently unconfuses clang. As
a bonus, it also catches any possible out-of-bounds access
if the enum takes on a negative value (which shouldn't
happen either, but again, this is a defensive check).
Signed-off-by: Jeff King <peff@peff.net>
Commit 670c359da (link_alt_odb_entry: handle normalize_path
errors, 2016-10-03) regressed the handling of relative paths
in the GIT_ALTERNATE_OBJECT_DIRECTORIES variable. It's not
entirely clear this was ever meant to work, but it _has_
worked for several years, so this commit restores the
original behavior.
When we get a path in GIT_ALTERNATE_OBJECT_DIRECTORIES, we
add it the path to the list of alternate object directories
as if it were found in objects/info/alternates, but with one
difference: we do not provide the link_alt_odb_entry()
function with a base for relative paths. That function
doesn't turn it into an absolute path, and we end up feeding
the relative path to the strbuf_normalize_path() function.
Most relative paths break out of the top-level directory
(e.g., "../foo.git/objects"), and thus normalizing fails.
Prior to 670c359da, we simply ignored the error, and due to
the way normalize_path_copy() was implemented it happened to
return the original path in this case. We then accessed the
alternate objects using this relative path.
By storing the relative path in the alt_odb list, the path
is relative to wherever we happen to be at the time we do an
object lookup. That means we look from $GIT_DIR in a bare
repository, and from the top of the worktree in a non-bare
repository.
If this were being designed from scratch, it would make
sense to pick a stable location (probably $GIT_DIR, or even
the object directory) and use that as the relative base,
turning the result into an absolute path. However, given
the history, at this point the minimal fix is to match the
pre-670c359da behavior.
We can do this simply by ignoring the error when we have no
relative base and using the original value (which we now
reliably have, thanks to strbuf_normalize_path()).
That still leaves us with a relative path that foils our
duplicate detection, and may act strangely if we ever
chdir() later in the process. We could solve that by storing
an absolute path based on getcwd(). That may be a good
future direction; for now we'll do just the minimum to fix
the regression.
The new t5615 script demonstrates the fix in its final three
tests. Since we didn't have any tests of the alternates
environment variable at all, it also adds some tests of
absolute paths.
Reported-by: Bryan Turner <bturner@atlassian.com>
Signed-off-by: Jeff King <peff@peff.net>
Avoid unwanted coding patterns (prodigal use of pipelines), and in
particular a useless use of cat.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jeff King <peff@peff.net>
Some versions of uniq -c write the count left-justified, other version
write it right-justified. Be prepared for both kinds.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jeff King <peff@peff.net>
The rot13-filter.pl script calls methods on implicitly
defined filehandles (STDOUT, and the result of an open()
call). Prior to perl 5.13, these methods are not
automatically loaded, and perl will complain with:
Can't locate object method "flush" via package "IO::Handle"
Let's explicitly load IO::File (which inherits from
IO::Handle). That's more than we need for just "flush", but
matches what perl has done since:
http://perl5.git.perl.org/perl.git/commit/15e6cdd91beb4cefae4b65e855d68cf64766965d
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The rot13-filter.pl script hardcodes "#!/usr/bin/perl", and
does not respect $PERL_PATH at all. That is a problem if the
system does not have perl at that path, or if it has a perl
that is too old to run a complicated script like the
rot13-filter (but PERL_PATH points to a more modern one).
We can fix this by using write_script() to create a new copy
of the script with the correct #!-line. In theory we could
move the whole script inside t0021-conversion.sh rather than
having it as an auxiliary file, but it's long enough that
it just makes things harder to read.
As a bonus, we can stop using the full path to the script in
the filter-process config we add (because the trash
directory is in our PATH). Not only is this shorter, but it
sidesteps any shell-quoting issues. The original was broken
when $TEST_DIRECTORY contained a space, because it was
interpolated in the outer script.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We create a rot13.sh script in the trash directory, but need
to call it by its full path when we have moved our cwd to
another directory. Let's just put $TEST_ROOT in our $PATH so
that the script is always found.
This is a minor convenience for rot13.sh, but will be a
major one when we switch rot13-filter.pl to a script in the
same directory, as it means we will not have to deal with
shell quoting inside the filter-process config.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This avoids us fooling around with $SHELL_PATH and the
executable bit ourselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the rule to convert "unsigned char [20]" into "struct
object_id *" in contrib/coccinelle/
* rs/cocci:
cocci: avoid self-references in object_id transformations
Overflow is defined for unsigned integers, but not for signed ones.
Wrap around explicitly for the new ring-buffer in find_unique_abbrev()
as we did in bb84735c for the ones in sha1_to_hex() and get_pathname(),
thus avoiding signed overflows and getting rid of the magic number 3.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update to the test framework made in 2.9 timeframe broke running
the tests under valgrind, which has been fixed.
* nd/test-helpers:
valgrind: support test helpers
Recent update to git-sh-setup (a library of shell functions that
are used by our in-tree scripted Porcelain commands) included
another shell library git-sh-i18n without specifying where it is,
relying on the $PATH. This has been fixed to be more explicit by
prefixing $(git --exec-path) output in front.
* ak/sh-setup-dot-source-i18n-fix:
git-sh-setup: be explicit where to dot-source git-sh-i18n from.
The user always has to say "stash@{$N}" when naming a single
element in the default location of the stash, i.e. reflogs in
refs/stash. The "git stash" command learned to accept "git stash
apply 4" as a short-hand for "git stash apply stash@{4}".
* aw/numbered-stash:
stash: allow stashes to be referenced by index only
Update "interpret-trailers" machinery and teaches it that people in
real world write all sorts of crufts in the "trailer" that was
originally designed to have the neat-o "Mail-Header: like thing"
and nothing else.
* jt/trailer-with-cruft:
trailer: support values folded to multiple lines
trailer: forbid leading whitespace in trailers
trailer: allow non-trailers in trailer block
trailer: clarify failure modes in parse_trailer
trailer: make args have their own struct
trailer: streamline trailer item create and add
trailer: use list.h for doubly-linked list
trailer: improve const correctness
The smudge/clean filter API expect an external process is spawned
to filter the contents for each path that has a filter defined. A
new type of "process" filter API has been added to allow the first
request to run the filter for a path to spawn a single process, and
all filtering need is served by this single process for multiple
paths, reducing the process creation overhead.
* ls/filter-process:
contrib/long-running-filter: add long running filter example
convert: add filter.<driver>.process option
convert: prepare filter.<driver>.process option
convert: make apply_filter() adhere to standard Git error handling
pkt-line: add functions to read/write flush terminated packet streams
pkt-line: add packet_write_gently()
pkt-line: add packet_flush_gently()
pkt-line: add packet_write_fmt_gently()
pkt-line: extract set_packet_header()
pkt-line: rename packet_write() to packet_write_fmt()
run-command: add clean_on_exit_handler
run-command: move check_pipe() from write_or_die to run_command
convert: modernize tests
convert: quote filter names in error messages
Git generally does not explicitly close file descriptors that were
open in the parent process when spawning a child process, but most
of the time the child does not want to access them. As Windows does
not allow removing or renaming a file that has a file descriptor
open, a slow-to-exit child can even break the parent process by
holding onto them. Use O_CLOEXEC flag to open files in various
codepaths.
* ls/git-open-cloexec:
read-cache: make sure file handles are not inherited by child processes
sha1_file: open window into packfiles with O_CLOEXEC
sha1_file: rename git_open_noatime() to git_open()
When the user does the lazy "git push" with no parameters with
push.default set to either "upstream", "simple" or "current",
we internally generated a refspec that has the current branch name
on the source side and used it to push.
However, the branch name (say "test") may be an ambiguous refname in
the context of the source repository---there may be a tag with the
same name, for example. This would trigger an unnecessary error
without any fault on the end-user's side.
Be explicit and give a full refname as the source side to avoid the
ambiguity. The destination side when pushing with the "current"
sent only the name of the branch and forcing the receiving end to
guess, which is the same issue. Be explicit there as well.
Reported-by: Kannan Goundan <kannan@cakoose.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When new paths were added by "git add -N" to the index, it was
enough to circumvent the check by "git commit" to refrain from
making an empty commit without "--allow-empty". The same logic
prevented "git status" to show such a path as "new file" in the
"Changes not staged for commit" section.
* nd/ita-empty-commit:
commit: don't be fooled by ita entries when creating initial commit
commit: fix empty commit creation when there's no changes but ita entries
diff: add --ita-[in]visible-in-index
diff-lib: allow ita entries treated as "not yet exist in index"
Update of the sequencer codebase to make it reusable to reimplement
"rebase -i" continues.
* js/prepare-sequencer: (27 commits)
sequencer: mark all error messages for translation
sequencer: start error messages consistently with lower case
sequencer: quote filenames in error messages
sequencer: mark action_name() for translation
sequencer: remove overzealous assumption in rebase -i mode
sequencer: teach write_message() to append an optional LF
sequencer: refactor write_message() to take a pointer/length
sequencer: roll back lock file if write_message() failed
sequencer: stop releasing the strbuf in write_message()
sequencer: left-trim lines read from the script
sequencer: support cleaning up commit messages
sequencer: support amending commits
sequencer: allow editing the commit message on a case-by-case basis
sequencer: introduce a helper to read files written by scripts
sequencer: prepare for rebase -i's commit functionality
sequencer: remember the onelines when parsing the todo file
sequencer: get rid of the subcommand field
sequencer: avoid completely different messages for different actions
sequencer: strip CR from the todo script
sequencer: completely revamp the "todo" script parsing
...
"git daemon" used fixed-length buffers to turn URL to the
repository the client asked for into the server side directory
path, using snprintf() to avoid overflowing these buffers, but
allowed possibly truncated paths to the directory. This has been
tightened to reject such a request that causes overlong path to be
required to serve.
* jk/daemon-path-ok-check-truncation:
daemon: detect and reject too-long paths
The code that we have used for the past 10+ years to cycle
4-element ring buffers turns out to be not quite portable in
theoretical world.
* rs/ring-buffer-wraparound:
hex: make wraparound of the index into ring-buffer explicit
A minor regression fix for "git submodule".
* sb/submodule-ignore-trailing-slash:
t0060: sidestep surprising path mangling results on Windows
submodule: ignore trailing slash in relative url
submodule: ignore trailing slash on superproject URL
Update "git diff --no-index" codepath not to try to peek into .git/
directory that happens to be under the current directory, when we
know we are operating outside any repository.
* jk/no-looking-at-dotgit-outside-repo:
diff: handle sha1 abbreviations outside of repository
diff_aligned_abbrev: use "struct oid"
diff_unique_abbrev: rename to diff_aligned_abbrev
find_unique_abbrev: use 4-buffer ring
test-*-cache-tree: setup git dir
read info/{attributes,exclude} only when in repository
"git push" and "git fetch" reports from what old object to what new
object each ref was updated, using abbreviated refnames, and they
attempt to align the columns for this and other pieces of
information. The way these codepaths compute how many display
columns to allocate for the object names portion of this output has
been updated to match the recent "auto scale the default
abbreviation length" change.
* jc/abbrev-auto:
transport: compute summary-width dynamically
transport: allow summary-width to be computed dynamically
fetch: pass summary_width down the callchain
transport: pass summary_width down the callchain
Updates the way approximate count of total objects is computed
while attempting to come up with a unique abbreviated object name,
which in turn needs to estimate how many hexdigits are necessary to
ensure uniqueness.
* jk/abbrev-auto:
find_unique_abbrev: move logic out of get_short_sha1()
Allow the default abbreviation length, which has historically been
7, to scale as the repository grows. The logic suggests to use 12
hexdigits for the Linux kernel, and 9 to 10 for Git itself.
* lt/abbrev-auto:
abbrev: auto size the default abbreviation
abbrev: prepare for new world order
abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizing
Reusing cached data speeds up git-svn by quite a fair bit. However, if
the YAML module is unavailable, the caches are written to disk in an
architecture-dependent manner. That leads to problems when upgrading,
say, from 32-bit to 64-bit Git for Windows.
Let's just try to read those caches back if we detect the absence of the
YAML module and the presence of the file, and delete the file if it
could not be read back correctly.
Note that the only way to catch the error when the memoized cache could
not be read back is to put the call inside an `eval { ... }` block
because it would die otherwise; the `eval` block should also return `1`
in case of success explicitly since the function reading back the cached
data does not return an appropriate value to test for success.
This fixes https://github.com/git-for-windows/git/issues/233.
[ew: import "retrieve" explictly, check unlink result]
Signed-off-by: Gavin Lambert <github@mirality.co.nz>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Eric Wong <e@80x24.org>
When generating diffs outside a repository (e.g., with "diff
--no-index"), we may write abbreviated sha1s as part of
"--raw" output or the "index" lines of "--patch" output.
Since we have no object database, we never find any
collisions, and these sha1s get whatever static abbreviation
length is configured (typically 7).
However, we do blindly look in ".git/objects" to see if any
objects exist, even though we know we are not in a
repository. This is usually harmless because such a
directory is unlikely to exist, but could be wrong in rare
circumstances.
Let's instead notice when we are not in a repository and
behave as if the object database is empty (i.e., just use
the default abbrev length). It would perhaps make sense to
be conservative and show full sha1s in that case, but
showing the default abbreviation is what we've always done
(and is certainly less ugly).
Note that this does mean that:
cd /not/a/repo
GIT_OBJECT_DIRECTORY=/some/real/objdir git diff --no-index ...
used to look for collisions in /some/real/objdir but now
does not. This could be considered either a bugfix (we do
not look at objects if we have no repository) or a
regression, but it seems unlikely that anybody would care
much either way.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we're modifying this function anyway, it's a good time
to update it to the more modern "struct oid". We can also
drop some of the magic numbers in favor of GIT_SHA1_HEXSZ,
along with some descriptive comments.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The word "align" describes how the function actually differs
from find_unique_abbrev, and will make it less confusing
when we add more diff-specific abbrevation functions that do
not do this alignment.
Since this is a globally available function, let's also move
its descriptive comment to the header file, where we
typically document function interfaces.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some code paths want to format multiple abbreviated sha1s in
the same output line. Because we use a single static buffer
for our return value, they have to either break their output
into several calls or allocate their own arrays and use
find_unique_abbrev_r().
Intead, let's mimic sha1_to_hex() and use a ring of several
buffers, so that the return value stays valid through
multiple calls. This shortens some of the callers, and makes
it harder to for them to make a silly mistake.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These test helper programs access the index, but do not ever
setup_git_directory(), meaning we just blindly looked in
".git/index". This happened to work for the purposes of our
tests (which do not run from subdirectories, nor in
non-repos), but it's a bad habit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level attribute and gitignore code will try to look
in $GIT_DIR/info for any repo-level configuration files,
even if we have not actually determined that we are in a
repository (e.g., running "git grep --no-index"). In such a
case they end up looking for ".git/info/attributes", etc.
This is generally harmless, as such a file is unlikely to
exist outside of a repository, but it's still conceptually
the wrong thing to do.
Let's detect this situation explicitly and skip reading the
file (i.e., the same behavior we'd get if we were in a
repository and the file did not exist).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There still are a few topics that need to go in before -rc0 which
would make the shape of the upcoming release clearer, but here is
the final batch before it happens.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An empty string used as a pathspec element has always meant
'everything matches', but it is too easy to write a script that
finds a path to remove in $path and run 'git rm "$paht"', which
ends up removing everything. Start warning about this use of an
empty string used for 'everything matches' and ask users to use a
more explicit '.' for that instead.
The hope is that existing users will not mind this change, and
eventually the warning can be turned into a hard error, upgrading
the deprecation into removal of this (mis)feature.
* ex/deprecate-empty-pathspec-as-match-all:
pathspec: warn on empty strings as pathspec
Some AsciiDoc formatter mishandles a displayed illustration with
tabs in it. Adjust a few of them in merge-base documentation to
work around them.
* po/fix-doc-merge-base-illustration:
doc: fix the 'revert a faulty merge' ASCII art tab spacing
doc: fix merge-base ASCII art tab spacing
The Travis CI configuration we ship ran the tests with --verbose
option but this risks non-TAP output that happens to be "ok" to be
misinterpreted as TAP signalling a test that passed. This resulted
in unnecessary failure. This has been corrected by introducing a
new mode to run our tests in the test harness to send the verbose
output separately to the log file.
* jk/tap-verbose-fix:
test-lib: bail out when "-v" used under "prove"
travis: use --verbose-log test option
test-lib: add --verbose-log option
test-lib: handle TEST_OUTPUT_DIRECTORY with spaces
"git send-email" attempts to pick up valid e-mails from the
trailers, but people in real world write non-addresses there, like
"Cc: Stable <add@re.ss> # 4.8+", which broke the output depending
on the availability and vintage of Mail::Address perl module.
* mm/send-email-cc-cruft-after-address:
Git.pm: add comment pointing to t9000
t9000-addresses: update expected results after fix
parse_mailboxes: accept extra text after <...> address
Shorten description of auto-following in "git tag" by removing a
mention of historical remotes layout which is not relevant to the
main topic.
* yk/git-tag-remove-mention-of-old-layout-in-doc:
doc: remove reference to the traditional layout in git-tag.txt
A new version of git-gui, now at its 0.21.0 tag.
* pt/gitgui-updates: (22 commits)
git-gui: set version 0.21
git-gui: Mark 'All' in remote.tcl for translation
git-gui i18n: Updated Bulgarian translation (565,0f,0u)
git-gui: avoid persisting modified author identity
git-gui: handle the encoding of Git's output correctly
git-gui: unicode file name support on windows
git-gui: Update Russian translation
git-gui: maintain backwards compatibility for merge syntax
git-gui i18n: mark string in lib/error.tcl for translation
git-gui: fix incorrect use of Tcl append command
git-gui i18n: mark "usage:" strings for translation
git-gui i18n: internationalize use of colon punctuation
git-gui: ensure the file in the diff pane is in the list of selected files
git-gui: support for $FILENAMES in tool definitions
git-gui: fix initial git gui message encoding
git-gui/po/glossary/txt-to-pot.sh: use the $( ... ) construct for command substitution
git-gui (Windows): use git-gui.exe in `Create Desktop Shortcut`
git-gui: fix detection of Cygwin
Amend tab ordering and text widget border and highlighting.
Allow keyboard control to work in the staging widgets.
...
A recently graduated topic regressed "git rev-list --header"
output, breaking "gitweb". This has been fixed.
* jk/diff-submodule-diff-inline:
rev-list: use hdr_termination instead of a always using a newline
A hot-fix for a test added by a recent topic that went to both
'master' and 'maint' already.
* tg/add-chmod+x-fix:
t3700: fix broken test under !SANITY
Recent git allows submodule.<name>.branch to use a special token
"." instead of the branch name; the documentation has been updated
to describe it.
* bw/submodule-branch-dot-doc:
submodules doc: update documentation for "." used for submodule branches
More i18n.
* va/i18n:
i18n: diff: mark warnings for translation
i18n: credential-cache--daemon: mark advice for translation
i18n: convert mark error messages for translation
i18n: apply: mark error message for translation
i18n: apply: mark error messages for translation
i18n: apply: mark info messages for translation
i18n: apply: mark plural string for translation
When fetching from a remote that has many tags that are irrelevant
to branches we are following, we used to waste way too many cycles
when checking if the object pointed at by a tag (that we are not
going to fetch!) exists in our repository too carefully.
* jk/fetch-quick-tag-following:
fetch: use "quick" has_sha1_file for tag following
"git rebase" immediately after "git clone" failed to find the fork
point from the upstream.
* jk/merge-base-fork-point-without-reflog:
merge-base: handle --fork-point without reflog
Code clean-up and performance improvement to reduce use of
timestamp-ordered commit-list by replacing it with a priority
queue.
* jk/upload-pack-use-prio-queue:
upload-pack: use priority queue in reachable() check
In addition to purely abbreviated commit object names, "gitweb"
learned to turn "git describe" output (e.g. v2.9.3-599-g2376d31787)
into clickable links in its output.
* ab/gitweb-abbrev-links:
gitweb: link to "git describe"'d commits in log messages
gitweb: link to 7-char+ SHA-1s, not only 8-char+
gitweb: fix a typo in a comment
In a worktree connected to a repository elsewhere, created via "git
worktree", "git checkout" attempts to protect users from confusion
by refusing to check out a branch that is already checked out in
another worktree. However, this also prevented checking out a
branch, which is designated as the primary branch of a bare
reopsitory, in a worktree that is connected to the bare
repository. The check has been corrected to allow it.
* dk/worktree-dup-checkout-with-bare-is-ok:
worktree: allow the main brach of a bare repository to be checked out
The GPG verification status shown in "%G?" pretty format specifier
was not rich enough to differentiate a signature made by an expired
key, a signature made by a revoked key, etc. New output letters
have been assigned to express them.
* mg/gpg-richer-status:
gpg-interface: use more status letters
A new credential helper that talks via "libsecret" with
implementations of XDG Secret Service API has been added to
contrib/credential/.
* mm/credential-libsecret:
contrib: add credential helper for libsecret
"git ls-files" learned "--recurse-submodules" option that can be
used to get a listing of tracked files across submodules (i.e. this
only works with "--cached" option, not for listing untracked or
ignored files). This would be a useful tool to sit on the upstream
side of a pipe that is read with xargs to work on all working tree
files from the top-level superproject.
* bw/ls-files-recurse-submodules:
ls-files: add pathspec matching for submodules
ls-files: pass through safe options for --recurse-submodules
ls-files: optionally recurse into submodules
git: make super-prefix option
The require_clean_work_tree() helper was recreated in C when "git
pull" was rewritten from shell; the helper is now made available to
other callers in preparation for upcoming "rebase -i" work.
* js/libify-require-clean-work-tree:
wt-status: begin error messages with lower-case
wt-status: teach has_{unstaged,uncommitted}_changes() about submodules
wt-status: export also the has_un{staged,committed}_changes() functions
wt-status: make the require_clean_work_tree() function reusable
pull: make code more similar to the shell script again
pull: drop confusing prefix parameter of die_on_unclean_work_tree()
"git diff/log --ws-error-highlight=<kind>" lacked the corresponding
configuration variable to set it by default.
* jc/ws-error-highlight:
diff: introduce diff.wsErrorHighlight option
diff.c: move ws-error-highlight parsing helpers up
diff.c: refactor parse_ws_error_highlight()
t4015: split out the "setup" part of ws-error-highlight test
Instead of referencing "stash@{n}" explicitly, make it possible to
simply reference as "n". Most users only reference stashes by their
position in the stash stack (what I refer to as the "index" here).
The syntax for the typical stash (stash@{n}) is slightly annoying and
easy to forget, and sometimes difficult to escape properly in a
script. Because of this the capability to do things with the stash by
simply referencing the index is desirable.
This patch includes the superior implementation provided by Øsse Walle
(thanks for that), with a slight change to fix a broken test in the test
suite. I also merged the test scripts as suggested by Jeff King, and
un-wrapped the documentation as suggested by Junio Hamano.
Signed-off-by: Aaron M Watson <watsona4@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
When an MSYS program (such as the bash that drives the test suite)
invokes git on Windows, absolute Unix style paths are transformed into
Windows native absolute paths (drive letter form). However, this
transformation also includes some simplifications that are not just
straight-forward textual substitutions:
- When the path ends in "/.", then the dot is stripped, but not the
directory separator.
- When the path contains "..", then it is optimized away if possible,
e.g., "/c/dir/foo/../bar" becomes "c:/dir/bar".
These additional transformations violate the assumptions of some
submodule path tests. We can avoid them when the input is already a
Windows native path, because then MSYS leaves the path unmolested.
Convert the uses of $PWD to $(pwd); the latter returns a native Windows
path.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes "convert: add filter.<driver>.process option" (edcc8581) on
Windows.
Consider the case of a file that requires filtering and is present in
branch A but not in branch B. If A is the current HEAD and we checkout B
then the following happens:
1. ce_compare_data() opens the file
2. index_fd() detects that the file requires to run a clean filter and
calls index_stream_convert_blob()
4. index_stream_convert_blob() calls convert_to_git_filter_fd()
5. convert_to_git_filter_fd() calls apply_filter() which creates a
new long running filter process (in case it is the first file
of this kind to be filtered)
6. The new filter process inherits all file handles. This is the
default on Linux/OSX and is explicitly defined in the
`CreateProcessW` call in `mingw.c` on Windows.
7. ce_compare_data() closes the file
8. Git unlinks the file as it is not present in B
The unlink operation does not work on Windows because the filter process
has still an open handle to the file. On Linux/OSX the unlink operation
succeeds but the file descriptors still leak into the child process.
Fix this problem by opening files in read-cache with the O_CLOEXEC flag
to ensure that the file descriptor does not remain open in a newly
spawned process similar to 05d1ed6148 ("mingw: ensure temporary file
handles are not inherited by child processes", 2016-08-22).
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All processes that the Git main process spawns inherit the open file
descriptors of the main process. These leaked file descriptors can
cause problems.
Use the O_CLOEXEC flag similar to 05d1ed61 to fix the leaked file
descriptors.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is meant to be used when reading from files in the
object store, and the original objective was to avoid smudging atime
of loose object files too often, hence its name. Because we'll be
extending its role in the next commit to also arrange the file
descriptors they return auto-closed in the child processes, rename
it to lose "noatime" part that is too specific.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ita entries are dropped at tree generation phase. If the entire index
consists of just ita entries, the result would be a a commit with no
entries, which should be caught unless --allow-empty is specified. The
test "!!active_nr" is not sufficient to catch this.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If i-t-a entries are present and there is no change between the index
and HEAD i-t-a entries, index_differs_from() still returns "dirty, new
entries" (aka, the resulting commit is not empty), but cache-tree will
skip i-t-a entries and produce the exact same tree of current
commit.
index_differs_from() is supposed to catch this so we can abort
git-commit (unless --no-empty is specified). Update it to optionally
ignore i-t-a entries when doing a diff between the index and HEAD so
that it would return "no change" in this case and abort commit.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option --ita-invisible-in-index exposes the "ita_invisible_in_index"
diff flag to outside to allow easier experimentation with this new mode.
The "plan" is to make --ita-invisible-in-index default to keep consistent
behavior with 'status' and 'commit', but a bunch other commands like
'apply', 'merge', 'reset'.... need to be taken into consideration as well.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When comparing the index and the working tree to show which paths are
new, and comparing the tree recorded in the HEAD and the index to see if
committing the contents recorded in the index would result in an empty
commit, we would want the former comparison to say "these are new paths"
and the latter to say "there is no change" for paths that are marked as
intent-to-add.
We made a similar attempt at d95d728a ("diff-lib.c: adjust position of
i-t-a entries in diff", 2015-03-16), which redefined the semantics of
these two comparison modes globally, which was a disaster and had to be
reverted at 78cc1a54 ("Revert "diff-lib.c: adjust position of i-t-a
entries in diff"", 2015-06-23).
To make sure we do not repeat the same mistake, introduce a new internal
diffopt option so that this different semantics can be asked for only by
callers that ask it, while making sure other unaudited callers will get
the same comparison result.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now all that is left to do is to actually iterate over the refs
and measure the display width needed to show their abbreviation.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now we have identified three callchains that have a set of refs that
they want to show their <old, new> object names in an aligned output,
we can replace their reference to the constant TRANSPORT_SUMMARY_WIDTH
with a helper function call to transport_summary_width() that takes
the set of ref as a parameter. This step does not yet iterate over
the refs and compute, which is left as an exercise to the readers.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The leaf function on the "fetch" side that uses TRANSPORT_SUMMARY_WIDTH
constant is builtin/fetch.c::format_display() and it has two distinct
callchains. The one that reports the primary result of fetch originates
at store_updated_refs(); the other one that reports the pruning of
the remote-tracking refs originates at prune_refs().
Teach these two places to pass summary_width down the callchain,
just like we did for the "push" side in the previous commit.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callchain that originates at transport_print_push_status()
eventually hits a single leaf function, print_ref_status(), that is
used to show from what old object to what new object a ref got
updated, and the width of the part that shows old and new object
names used a constant TRANSPORT_SUMMARY_WIDTH.
Teach the callchain to pass the width down from the top instead.
This allows a future enhancement to compute the necessary display
width before calling down this callchain.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, interpret-trailers requires that a trailer be only on 1 line.
For example:
a: first line
second line
would be interpreted as one trailer line followed by one non-trailer line.
Make interpret-trailers support RFC 822-style folding, treating those
lines as one logical trailer.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, interpret-trailers allows leading whitespace in trailer
lines. This leads to false positives, especially for quoted lines or
bullet lists.
Forbid leading whitespace in trailers.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, interpret-trailers requires all lines of a trailer block to
be trailers (or comments) - if not it would not identify that block as a
trailer block, and thus create its own trailer block, inserting a blank
line. For example:
echo -e "\nSigned-off-by: x\nnot trailer" |
git interpret-trailers --trailer "c: d"
would result in:
Signed-off-by: x
not trailer
c: d
Relax the definition of a trailer block to require that the trailers (i)
are all trailers, or (ii) contain at least one Git-generated trailer and
consists of at least 25% trailers.
Signed-off-by: x
not trailer
c: d
(i) is the existing functionality. (ii) allows arbitrary lines to be
included in trailer blocks, like those in [1], and still allow
interpret-trailers to be used.
[1]
e7d316a02f
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The parse_trailer function has a few modes of operation, all depending
on whether the separator is present in its input, and if yes, the
separator's position. Some of these modes are failure modes, and these
failure modes are handled differently depending on whether the trailer
line was sourced from a file or from a command-line argument.
Extract a function to find the separator, allowing the invokers of
parse_trailer to determine how to handle the failure modes instead of
making parse_trailer do it. In this function, also take in the list of
separators, so that we can distinguish between command line arguments
(which allow '=' as separator) and file input (which does not allow '='
as separator).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There was actually only one error message that was not yet marked for
translation.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Quite a few error messages touched by this developer during the work to
speed up rebase -i started with an upper case letter, violating our
current conventions. Instead of sneaking in this fix (and forgetting
quite a few error messages), let's just have one wholesale patch fixing
all of the error messages in the sequencer.
While at it, the funny "error: Error wrapping up..." was changed to a
less funny, but more helpful, "error: failed to finalize...".
Pointed out by Junio Hamano.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes the code consistent by fixing quite a couple of error messages.
Suggested by Jakub Narębski.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The definition of this function goes back all the way to 043a449
(sequencer: factor code out of revert builtin, 2012-01-11), long before a
serious effort was made to translate all the error messages.
It is slightly out of the context of the current patch series (whose
purpose it is to re-implement the performance critical parts of the
interactive rebase in C) to make the error messages in the sequencer
translatable, but what the heck. We'll just do it while we're looking at
this part of the code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sequencer was introduced to make the cherry-pick and revert
functionality available as library function, with the original idea
being to extend the sequencer to also implement the rebase -i
functionality.
The test to ensure that all of the commands in the script are identical
to the overall operation does not mesh well with that.
Therefore let's disable the test in rebase -i mode.
While at it, error out early if the "instruction sheet" (i.e. the todo
script) could not be parsed.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit prepares for future callers that will have a pointer/length
to some text to be written that lacks an LF, yet an LF is desired.
Instead of requiring the caller to append an LF to the buffer (and
potentially allocate memory to do so), the write_message() function
learns to append an LF at the end of the file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, we required an strbuf. But that limits the use case too much.
In the upcoming patch series (for which the current patch series prepares
the sequencer), we will want to write content to a file for which we have
a pointer and a length, not an strbuf.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no need to wait until the atexit() handler kicks in at the end.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Nothing in the name "write_message()" suggests that the function
releases the strbuf passed to it. So let's release the strbuf in the
caller instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Interactive rebase's scripts may be indented; we need to handle this
case, too, now that we prepare the sequencer to process interactive
rebases.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The run_git_commit() function already knows how to amend commits, and
with this new option, it can also clean up commit messages (i.e. strip
out commented lines). This is needed to implement rebase -i's 'fixup'
and 'squash' commands as sequencer commands.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This teaches the run_git_commit() function to take an argument that will
allow us to implement "todo" commands that need to amend the commit
messages ("fixup", "squash" and "reword").
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the upcoming commits, we will implement more and more of rebase -i's
functionality inside the sequencer. One particular feature of the
commands to come is that some of them allow editing the commit message
while others don't, i.e. we cannot define in the replay_opts whether the
commit message should be edited or not.
Let's add a new parameter to the run_git_commit() function. Previously,
it was the duty of the caller to ensure that the opts->edit setting
indicates whether to let the user edit the commit message or not,
indicating that it is an "all or nothing" setting, i.e. that the
sequencer wants to let the user edit *all* commit message, or none at
all. In the upcoming rebase -i mode, it will depend on the particular
command that is currently executed, though.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we are slowly teaching the sequencer to perform the hard work for
the interactive rebase, we need to read files that were written by
shell scripts.
These files typically contain a single line and are invariably ended
by a line feed (and possibly a carriage return before that). Let's use
a helper to read such files and to remove the line ending.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In interactive rebases, we commit a little bit differently than the
sequencer did so far: we heed the "author-script", the "message" and the
"amend" files in the .git/rebase-merge/ subdirectory.
Likewise, we may want to edit the commit message *even* when providing a
file containing the suggested commit message. Therefore we change the
code to not even provide a default message when we do not want any, and
to call the editor explicitly.
Also, in "interactive rebase" mode we want to skip reading the options
in the state directory of the cherry-pick/revert commands.
Finally, as interactive rebase's GPG settings are configured differently
from how cherry-pick (and therefore sequencer) handles them, we will
leave support for that to the next commit.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `git-rebase-todo` file contains a list of commands. Most of those
commands have the form
<verb> <sha1> <oneline>
The <oneline> is displayed primarily for the user's convenience, as
rebase -i really interprets only the <verb> <sha1> part. However, there
are *some* places in interactive rebase where the <oneline> is used to
display messages, e.g. for reporting at which commit we stopped.
So let's just remember it when parsing the todo file; we keep a copy of
the entire todo file anyway (to write out the new `done` and
`git-rebase-todo` file just before processing each command), so all we
need to do is remember the begin offsets and lengths.
As we will have to parse and remember the command-line for `exec` commands
later, we do not call the field "oneline" but rather "arg" (and will reuse
that for exec's command-line).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The subcommands are used exactly once, at the very beginning of
sequencer_pick_revisions(), to determine what to do. This is an
unnecessary level of indirection: we can simply call the correct
function to begin with. So let's do that.
While at it, ensure that the subcommands return an error code so that
they do not have to die() all over the place (bad practice for library
functions...).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not unheard of that editors on Windows write CR/LF even if the
file originally had only LF. This is particularly awkward for exec lines
of a rebase -i todo sheet. Take for example the insn "exec echo": The
shell script parser splits at the LF and leaves the CR attached to
"echo", which leads to the unknown command "echo\r".
Work around that by stripping CR when reading the todo commands, as we
already do for LF.
This happens to fix t9903.14 and .15 in MSYS1 environments (with the
rebase--helper patches based on this patch series): the todo script
constructed in such a setup contains CR/LF thanks to MSYS1 runtime's
cleverness.
Based on a report and a patch by Johannes Sixt.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we came up with the "sequencer" idea, we really wanted to have
kind of a plumbing equivalent of the interactive rebase. Hence the
choice of words: the "todo" script, a "pick", etc.
However, when it came time to implement the entire shebang, somehow this
idea got lost and the sequencer was used as working horse for
cherry-pick and revert instead. So as not to interfere with the
interactive rebase, it even uses a separate directory to store its
state.
Furthermore, it also is stupidly strict about the "todo" script it
accepts: while it parses commands in a way that was *designed* to be
similar to the interactive rebase, it then goes on to *error out* if the
commands disagree with the overall action (cherry-pick or revert).
Finally, the sequencer code chose to deviate from the interactive rebase
code insofar that when it comes to writing the file with the remaining
commands, it *reformats* the "todo" script instead of just writing the
part of the parsed script that were not yet processed. This is not only
unnecessary churn, but might well lose information that is valuable to
the user (i.e. comments after the commands).
Let's just bite the bullet and rewrite the entire parser; the code now
becomes not only more elegant: it allows us to go on and teach the
sequencer how to parse *true* "todo" scripts as used by the interactive
rebase itself. In a way, the sequencer is about to grow up to do its
older brother's job. Better.
In particular, we choose to maintain the list of commands in an array
instead of a linked list: this is flexible enough to allow us later on to
even implement rebase -i's reordering of fixup!/squash! commits very
easily (and with a very nice speed bonus, at least on Windows).
While at it, do not stop at the first problem, but list *all* of the
problems. This will help the user when the sequencer will do `rebase
-i`'s work by allowing to address all issues in one go rather than going
back and forth until the todo list is valid.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Not only does this DRY up the code (providing a better documentation what
the code is about, as well as allowing to change the behavior in a single
place), it also makes it substantially shorter to use the same
functionality in functions to be introduced when we teach the sequencer to
process interactive-rebase's git-rebase-todo file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Over the next commits, we will work on improving the sequencer to the
point where it can process the todo script of an interactive rebase. To
that end, we will need to teach the sequencer to read interactive
rebase's todo file. In preparation, we consolidate all places where
that todo file is needed to call a function that we will later extend.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sequencer is our attempt to lib-ify cherry-pick. Yet it behaves
like a one-shot command when it reads its configuration: memory is
allocated and released only when the command exits.
This is kind of okay for git-cherry-pick, which *is* a one-shot
command. All the work to make the sequencer its work horse was
done to allow using the functionality as a library function, though,
including proper clean-up after use.
To remedy that, take custody of the option values in question,
allocating and duping literal constants as needed and freeing them
at end.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve type safety by making arguments (whether from configuration or
from the command line) have their own "struct arg_item" type, separate
from the "struct trailer_item" type used to represent the trailers in
the buffer being manipulated.
This change also prepares "struct trailer_item" to be further
differentiated from "struct arg_item" in future patches.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, creation and addition (to a list) of trailer items are spread
across multiple functions. Streamline this by only having 2 functions:
one to parse the user-supplied string, and one to add the parsed
information to a list.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the existing handwritten implementation of a doubly-linked list
in trailer.c with the functions and macros from list.h. This
significantly simplifies the code.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When adding support for prefixing output of log and other commands using
--line-prefix, commit 660e113ce1 ("graph: add support for
--line-prefix on all graph-aware output", 2016-08-31) accidentally
broke rev-list --header output.
In order to make the output appear with a line-prefix, the flow was
changed to always use the graph subsystem for display. Unfortunately
the graph flow in rev-list did not use info->hdr_termination as it was
assumed that graph output would never need to putput NULs.
Since we now always use the graph code in order to handle the case of
line-prefix, simply replace putchar('\n') with
putchar(info->hdr_termination) which will correct this issue.
Add a test for the --header case to make sure we don't break it in the
future.
Reported-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-gui 0.21.0
* tag 'gitgui-0.21.0' of git://repo.or.cz/git-gui: (22 commits)
git-gui: set version 0.21
git-gui: Mark 'All' in remote.tcl for translation
git-gui i18n: Updated Bulgarian translation (565,0f,0u)
git-gui: avoid persisting modified author identity
git-gui: handle the encoding of Git's output correctly
git-gui: unicode file name support on windows
git-gui: Update Russian translation
git-gui: maintain backwards compatibility for merge syntax
git-gui i18n: mark string in lib/error.tcl for translation
git-gui: fix incorrect use of Tcl append command
git-gui i18n: mark "usage:" strings for translation
git-gui i18n: internationalize use of colon punctuation
git-gui: ensure the file in the diff pane is in the list of selected files
git-gui: support for $FILENAMES in tool definitions
git-gui: fix initial git gui message encoding
git-gui/po/glossary/txt-to-pot.sh: use the $( ... ) construct for command substitution
git-gui (Windows): use git-gui.exe in `Create Desktop Shortcut`
git-gui: fix detection of Cygwin
Amend tab ordering and text widget border and highlighting.
Allow keyboard control to work in the staging widgets.
...
Mark rename_limit_warning and degrade_cc_to_c_warning and
rename_limit_warning for translation.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark error messages about CRLF for translation.
Update test to reflect changes.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "submodule.<name>.path" stored in .gitmodules is never copied
to .git/config and such a key in .git/config has no meaning, but
the documentation described it and submodule.<name>.url next to
each other as if both belong to .git/config. This has been fixed.
* sb/submodule-config-doc-drop-path:
documentation: improve submodule.<name>.{url, path} description
"git mergetool" learned to honor "-O<orderfile>" to control the
order of paths to present to the end user.
* da/mergetool-diff-order:
mergetool: honor -O<orderfile>
mergetool: honor diff.orderFile
mergetool: move main program flow into a main() function
mergetool: add copyright
Codepaths involved in interacting alternate object store have
been cleaned up.
* jk/alt-odb-cleanup:
alternates: use fspathcmp to detect duplicates
sha1_file: always allow relative paths to alternates
count-objects: report alternates via verbose mode
fill_sha1_file: write into a strbuf
alternates: store scratch buffer as strbuf
fill_sha1_file: write "boring" characters
alternates: use a separate scratch space
alternates: encapsulate alt->base munging
alternates: provide helper for allocating alternate
alternates: provide helper for adding to alternates list
link_alt_odb_entry: refactor string handling
link_alt_odb_entry: handle normalize_path errors
t5613: clarify "too deep" recursion tests
t5613: do not chdir in main process
t5613: whitespace/style cleanups
t5613: use test_must_fail
t5613: drop test_valid_repo function
t5613: drop reachable_via function
A stray symbolic link in $GIT_DIR/refs/ directory could make name
resolution loop forever, which has been corrected.
* jk/ref-symlink-loop:
files_read_raw_ref: prevent infinite retry loops in general
files_read_raw_ref: avoid infinite loop on broken symlinks
In order for the receiving end of "git push" to inspect the
received history and decide to reject the push, the objects sent
from the sending end need to be made available to the hook and
the mechanism for the connectivity check, and this was done
traditionally by storing the objects in the receiving repository
and letting "git gc" to expire it. Instead, store the newly
received objects in a temporary area, and make them available by
reusing the alternate object store mechanism to them only while we
decide if we accept the check, and once we decide, either migrate
them to the repository or purge them immediately.
* jk/quarantine-received-objects:
tmp-objdir: do not migrate files starting with '.'
tmp-objdir: put quarantine information in the environment
receive-pack: quarantine objects until pre-receive accepts
tmp-objdir: introduce API for temporary object directories
check_connected: accept an env argument
Documentation for "git commit" was updated to clarify that "commit
-p <paths>" adds to the current contents of the index to come up
with what to commit.
* nd/commit-p-doc:
git-commit.txt: clarify --patch mode with pathspec
"git clone" of a local repository can be done at the filesystem
level, but the codepath did not check errors while copying and
adjusting the file that lists alternate object stores.
* jk/clone-copy-alternates-fix:
clone: detect errors in normalize_path_copy
http.emptyauth configuration is a way to allow an empty username to
pass when attempting to authenticate using mechanisms like
Kerberos. We took an unspecified (NULL) username and sent ":"
(i.e. no username, no password) to CURLOPT_USERPWD, but did not do
the same when the username is explicitly set to an empty string.
* dt/http-empty-auth:
http: http.emptyauth should allow empty (not just NULL) usernames
In a couple of commits, we will teach the sequencer to handle the
nitty gritty of the interactive rebase, which keeps its state in a
different directory.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We really do not need the *pointer to a* pointer to the options in
the read_populate_opts() function.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change is not completely faithful: instead of initializing all fields
to 0, we choose to initialize command and subcommand to -1 (instead of
defaulting to REPLAY_REVERT and REPLAY_NONE, respectively). Practically,
it makes no difference at all, but future-proofs the code to require
explicit assignments for both fields.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a simple pass-thru filter as example implementation for the Git
filter protocol version 2. See Documentation/gitattributes.txt, section
"Filter Protocol" for more info.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's clean/smudge mechanism invokes an external filter process for
every single blob that is affected by a filter. If Git filters a lot of
blobs then the startup time of the external filter processes can become
a significant part of the overall Git execution time.
In a preliminary performance test this developer used a clean/smudge
filter written in golang to filter 12,000 files. This process took 364s
with the existing filter mechanism and 5s with the new mechanism. See
details here: https://github.com/github/git-lfs/pull/1382
This patch adds the `filter.<driver>.process` string option which, if
used, keeps the external filter process running and processes all blobs
with the packet format (pkt-line) based protocol over standard input and
standard output. The full protocol is explained in detail in
`Documentation/gitattributes.txt`.
A few key decisions:
* The long running filter process is referred to as filter protocol
version 2 because the existing single shot filter invocation is
considered version 1.
* Git sends a welcome message and expects a response right after the
external filter process has started. This ensures that Git will not
hang if a version 1 filter is incorrectly used with the
filter.<driver>.process option for version 2 filters. In addition,
Git can detect this kind of error and warn the user.
* The status of a filter operation (e.g. "success" or "error) is set
before the actual response and (if necessary!) re-set after the
response. The advantage of this two step status response is that if
the filter detects an error early, then the filter can communicate
this and Git does not even need to create structures to read the
response.
* All status responses are pkt-line lists terminated with a flush
packet. This allows us to send other status fields with the same
protocol in the future.
Helped-by: Martin-Louis Bright <mlbright@gmail.com>
Reviewed-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the existing 'single shot filter mechanism' and prepare the
new 'long running filter mechanism'.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
apply_filter() returns a boolean that tells the caller if it
"did convert or did not convert". The variable `ret` was used throughout
the function to track errors whereas `1` denoted success and `0`
failure. This is unusual for the Git source where `0` denotes success.
Rename the variable and flip its value to make the function easier
readable for Git developers.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
write_packetized_from_fd() and write_packetized_from_buf() write a
stream of packets. All content packets use the maximal packet size
except for the last one. After the last content packet a `flush` control
packet is written.
read_packetized_to_strbuf() reads arbitrary sized packets until it
detects a `flush` packet.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
packet_write_fmt_gently() uses format_packet() which lets the caller
only send string data via "%s". That means it cannot be used for
arbitrary data that may contain NULs.
Add packet_write_gently() which writes arbitrary data and does not die
in case of an error. The function is used by other pkt-line functions in
a subsequent patch.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
packet_flush() would die in case of a write error even though for some
callers an error would be acceptable. Add packet_flush_gently() which
writes a pkt-line flush packet like packet_flush() but does not die in
case of an error. The function is used in a subsequent patch.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
packet_write_fmt() would die in case of a write error even though for
some callers an error would be acceptable. Add packet_write_fmt_gently()
which writes a formatted pkt-line like packet_write_fmt() but does not
die in case of an error. The function is used in a subsequent patch.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extracted set_packet_header() function converts an integer to a 4 byte
hex string. Make this function locally available so that other pkt-line
functions could use it.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
packet_write() should be called packet_write_fmt() because it is a
printf-like function that takes a format string as first parameter.
packet_write_fmt() should be used for text strings only. Arbitrary
binary data should use a new packet_write() function that is introduced
in a subsequent patch.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some processes might want to perform cleanup tasks before Git kills them
due to the 'clean_on_exit' flag. Let's give them an interface for doing
this. The feature is used in a subsequent patch.
Please note, that the cleanup callback is not executed if Git dies of a
signal. The reason is that only "async-signal-safe" functions would be
allowed to be call in that case. Since we cannot control what functions
the callback will use, we will not support the case. See 507d7804 for
more details.
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move check_pipe() to run_command and make it public. This is necessary
to call the function from pkt-line in a subsequent patch.
While at it, make async_exit() static to run_command.c as it is no
longer used from outside.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use `test_config` to set the config, check that files are empty with
`test_must_be_empty`, compare files with `test_cmp`, and remove spaces
after ">" and "<".
Please note that the "rot13" filter configured in "setup" keeps using
`git config` instead of `test_config` because subsequent tests might
depend on it.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git filter driver commands with spaces (e.g. `filter.sh foo`) are hard
to read in error messages. Quote them to improve the readability.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the log formatting function to know about "git describe" output
such as "v2.8.0-4-g867ad08", in addition to just plain "867ad08".
There are still many valid refnames that we don't link to
e.g. v2.10.0-rc1~2^2~1 is also a valid way to refer to
v2.8.0-4-g867ad08, but I'm not supporting that with this commit,
similarly it's trivially possible to create some refnames like
"æ/var-gf6727b0" or which won't be picked up by this regex.
There's surely room for improvement here, but I just wanted to address
the very common case of sticking "git describe" output into commit
messages without trying to link to all possible refnames, that's going
to be a rather futile exercise given that this is free text, and it
would be prohibitively expensive to look up whether the references in
question exist in our repository.
There was on-list discussion about how we could do better than this
patch. Junio suggested to update parse_commits() to call a new
"gitweb--helper" command which would pass each of the revision
candidates through "rev-parse --verify --quiet". That would cut down
on our false positives (e.g. we'll link to "deadbeef"), and also allow
us to be more aggressive in selecting candidate revisions.
That may be too expensive to work in practice, or it may
not. Investigating that would be a good follow-up to this patch.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the minimum length of an abbreviated object identifier in the
commit message gitweb tries to turn into link from 8 hexchars to 7.
This arbitrary minimum length of 8 was introduced in bfe2191 ("gitweb:
SHA-1 in commit log message links to "object" view", 2006-12-10), but
the default abbreviation length is 7, and has been for a long time.
It's still possible to reference SHA-1s down to 4 characters in length,
see v1.7.4-1-gdce9648's MINIMUM_ABBREV, but I can't see how to make
git actually produce that, so I doubt anyone is putting that into log
messages in practice, but people definitely do put 7 character SHA-1s
into log messages.
I think it's fairly dubious to link to things matching [0-9a-fA-F]
here as opposed to just [0-9a-f], that dates back to the initial
version of gitweb from 161332a ("first working version",
2005-08-07). Git will accept all-caps SHA-1s, but didn't ever produce
them as far as I can tell.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a typo'd MIME type in a comment. The Content-Type is
application/xhtml+xml, not application/xhtm+xml.
Fixes up code originally added in 53c4031 ("gitweb: Strip
non-printable characters from syntax highlighter output", 2011-09-16).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change "const char *" to "char *" in struct trailer_item and in the
return value of apply_command (since those strings are owned strings).
Change "struct conf_info *" to "const struct conf_info *" (since that
struct is not modified).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark error messages for translation passed to error() and die()
functions.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-svn internals were previously not aware of repository
layout differences for users of the "git worktree" command.
Introduce this awareness by using "git rev-parse --git-path"
instead of relying on outdated uses of GIT_DIR and friends.
Thanks-to: Duy Nguyen <pclouds@gmail.com>
Reported-by: Mathieu Arnold <mat@freebsd.org>
Signed-off-by: Eric Wong <e@80x24.org>
Reducing the scope of where we change the record separator ($/)
avoids bugs in calls which rely on the input record separator
further down, such as the 'chomp' usage in command_oneline.
This is necessary for a future change to git-svn, but exists in
Git.pm since it seems useful for gitweb and our other Perl
scripts, too.
Signed-off-by: Eric Wong <e@80x24.org>
According to gpg2's doc/DETAILS:
For each signature only one of the codes GOODSIG, BADSIG,
EXPSIG, EXPKEYSIG, REVKEYSIG or ERRSIG will be emitted.
gpg1 ("classic") behaves the same (although doc/DETAILS differs).
Currently, we parse gpg's status output for GOODSIG, BADSIG and
trust information and translate that into status codes G, B, U, N
for the %G? format specifier.
git-verify-* returns success in the GOODSIG case only. This is
somewhat in disagreement with gpg, which considers the first 5 of
the 6 above as VALIDSIG, but we err on the very safe side.
Introduce additional status codes E, X, Y, R for ERRSIG, EXPSIG,
EXPKEYSIG, and REVKEYSIG so that a user of %G? gets more information
about the absence of a 'G' on first glance.
Requested-by: Alex <agrambot@gmail.com>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The concerned message was marked for translation by 0c99171
("get_short_sha1: mark ambiguity error for translation", 2016-09-26).
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Like a lot of old commit-traversal code, this keeps a
commit_list in commit-date order, and and inserts parents
into the list. This means each insertion is potentially
linear, and the whole thing is quadratic (though the exact
runtime depends on the relationship between the commit dates
and the parent topology).
These days we have a priority queue, which can do the same
thing with a much better worst-case time.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is based on the existing gnome-keyring helper, but instead of
libgnome-keyring (which was specific to GNOME and is deprecated), it
uses libsecret which can support other implementations of XDG Secret
Service API.
Passes t0303-credential-external.sh.
Signed-off-by: Mantas Mikulėnas <grawity@gmail.com>
Reviewed-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach mergetool to pass "-O<orderfile>" down to `git diff` when
specified on the command-line.
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: David Aguilar <davvid@gmail.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach mergetool to get the list of files to edit via `diff` so that we
gain support for diff.orderFile.
Suggested-by: Luis Gutierrez <luisgutz@gmail.com>
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: David Aguilar <davvid@gmail.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it easier to follow the program's flow by isolating all
logic into functions. Isolate the main execution code path into
a single unit instead of having prompt_after_failed_merge()
interrupt it partyway through.
The use of a main() function is borrowing a convention from C,
Python, Perl, and many other languages. This helps readers more
familiar with other languages understand the purpose of each
function when diving into the codebase with fresh eyes.
Signed-off-by: David Aguilar <davvid@gmail.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is a common mistake to say "git blame --reverse OLD path",
expecting that the command line is dwimmed as if asking how lines
in path in an old revision OLD have survived up to the current
commit.
* jc/blame-reverse:
blame: dwim "blame --reverse OLD" as "blame --reverse OLD.."
blame: improve diagnosis for "--reverse NEW"
The existing "git fetch --depth=<n>" option was hard to use
correctly when making the history of an existing shallow clone
deeper. A new option, "--deepen=<n>", has been added to make this
easier to use. "git clone" also learned "--shallow-since=<date>"
and "--shallow-exclude=<tag>" options to make it easier to specify
"I am interested only in the recent N months worth of history" and
"Give me only the history since that version".
* nd/shallow-deepen: (27 commits)
fetch, upload-pack: --deepen=N extends shallow boundary by N commits
upload-pack: add get_reachable_list()
upload-pack: split check_unreachable() in two, prep for get_reachable_list()
t5500, t5539: tests for shallow depth excluding a ref
clone: define shallow clone boundary with --shallow-exclude
fetch: define shallow boundary with --shallow-exclude
upload-pack: support define shallow boundary by excluding revisions
refs: add expand_ref()
t5500, t5539: tests for shallow depth since a specific date
clone: define shallow clone boundary based on time with --shallow-since
fetch: define shallow boundary with --shallow-since
upload-pack: add deepen-since to cut shallow repos based on time
shallow.c: implement a generic shallow boundary finder based on rev-list
fetch-pack: use a separate flag for fetch in deepening mode
fetch-pack.c: mark strings for translating
fetch-pack: use a common function for verbose printing
fetch-pack: use skip_prefix() instead of starts_with()
upload-pack: move rev-list code out of check_non_tip()
upload-pack: make check_non_tip() clean things up on error
upload-pack: tighten number parsing at "deepen" lines
...
The command-line completion script (in contrib/) learned to
complete "git cmd ^mas<HT>" to complete the negative end of
reference to "git cmd ^master".
* cp/completion-negative-refs:
completion: support excluding refs
The ./configure script generated from configure.ac was taught how
to detect support of SSL by libcurl better.
* dp/autoconf-curl-ssl:
./configure.ac: detect SSL in libcurl using curl-config
When we started cURL to talk to imap server when a new enough
version of cURL library is available, we forgot to explicitly add
imap(s):// before the destination. To some folks, that didn't work
and the library tried to make HTTP(s) requests instead.
* ak/curl-imap-send-explicit-scheme:
imap-send: Tell cURL to use imap:// or imaps://
"git pack-objects" in a repository with many packfiles used to
spend a lot of time looking for/at objects in them; the accesses to
the packfiles are now optimized by checking the most-recently-used
packfile first.
* jk/pack-objects-optim-mru:
pack-objects: use mru list when iterating over packs
pack-objects: break delta cycles before delta-search phase
sha1_file: make packed_object_info public
provide an initializer for "struct object_info"
We call "qsort(array, nelem, sizeof(array[0]), fn)", and most of
the time third parameter is redundant. A new QSORT() macro lets us
omit it.
* rs/qsort:
show-branch: use QSORT
use QSORT, part 2
coccicheck: use --all-includes by default
remove unnecessary check before QSORT
use QSORT
add QSORT
This avoids "." and "..", as we already do, but also leaves
room for index-pack to store extra data in the quarantine
area (e.g., for passing back any analysis to be read by the
pre-receive hook).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The presence of the GIT_QUARANTINE_PATH variable lets any
called programs know that they're operating in a temporary
object directory (and where that directory is).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a client pushes objects to us, index-pack checks the
objects themselves and then installs them into place. If we
then reject the push due to a pre-receive hook, we cannot
just delete the packfile; other processes may be depending
on it. We have to do a normal reachability check at this
point via `git gc`.
But such objects may hang around for weeks due to the
gc.pruneExpire grace period. And worse, during that time
they may be exploded from the pack into inefficient loose
objects.
Instead, this patch teaches receive-pack to put the new
objects into a "quarantine" temporary directory. We make
these objects available to the connectivity check and to the
pre-receive hook, and then install them into place only if
it is successful (and otherwise remove them as tempfiles).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once objects are added to the object database by a process,
they cannot easily be deleted, as we don't know what other
processes may have started referencing them. We have to
clean them up with git-gc, which will apply the usual
reachability and grace-period checks.
This patch provides an alternative: it helps callers create
a temporary directory inside the object directory, and a
temporary environment which can be passed to sub-programs to
ask them to write there (the original object directory
remains accessible as an alternate of the temporary one).
See tmp-objdir.h for details on the API.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This lets callers influence the environment seen by
rev-list, which will be useful when we start providing
quarantined objects.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On a case-insensitive filesystem, we should realize that
"a/objects" and "A/objects" are the same path. We already
use fspathcmp() to check against the main object directory,
but until recently we couldn't use it for comparing against
other alternates (because their paths were not
NUL-terminated strings). But now we can, so let's do so.
Note that we also need to adjust count-objects to load the
config, so that it can see the setting of core.ignorecase
(this is required by the test, but is also a general bugfix
for users of count-objects).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We recursively expand alternates repositories, so that if A
borrows from B which borrows from C, A can see all objects.
For the root object database, we allow relative paths, so A
can point to B as "../B/objects". However, we currently do
not allow relative paths when recursing, so B must use an
absolute path to reach C.
That is an ancient protection from c2f493a (Transitively
read alternatives, 2006-05-07) that tries to avoid adding
the same alternate through two different paths. Since
5bdf0a8 (sha1_file: normalize alt_odb path before comparing
and storing, 2011-09-07), we use a normalized absolute path
for each alt_odb entry.
This means that in most cases the protection is no longer
necessary; we will detect the duplicate no matter how we got
there (but see below). And it's a good idea to get rid of
it, as it creates an unnecessary complication when setting
up recursive alternates (B has to know that A is going to
borrow from it and make sure to use an absolute path).
Note that our normalization doesn't actually look at the
filesystem, so it can still be fooled by crossing symbolic
links. But that's also true of absolute paths, so it's not a
good reason to disallow only relative paths (it's
potentially a reason to switch to real_path(), but that's a
separate and non-trivial change).
We adjust the test script here to demonstrate that this now
works, and add new tests to show that the normalization does
indeed suppress duplicates.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no way to get the list of alternates that git
computes internally; our tests only infer it based on which
objects are available. In addition to testing, knowing this
list may be helpful for somebody debugging their alternates
setup.
Let's add it to the "count-objects -v" output. We could give
it a separate flag, but there's not really any need.
"count-objects -v" is already a debugging catch-all for the
object database, its output is easily extensible to new data
items, and printing the alternates is not expensive (we
already had to find them to count the objects).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's currently the responsibility of the caller to give
fill_sha1_file() enough bytes to write into, leading them to
manually compute the required lengths. Instead, let's just
write into a strbuf so that it's impossible to get this
wrong.
The alt_odb caller already has a strbuf, so this makes
things strictly simpler. The other caller, sha1_file_name(),
uses a static PATH_MAX buffer and dies when it would
overflow. We can convert this to a static strbuf, which
means our allocation cost is amortized (and as a bonus, we
no longer have to worry about PATH_MAX being too short for
normal use).
This does introduce some small overhead in fill_sha1_file(),
as each strbuf_addchar() will check whether it needs to
grow. However, between the optimization in fec501d
(strbuf_addch: avoid calling strbuf_grow, 2015-04-16) and
the fact that this is not generally called in a tight loop
(after all, the next step is typically to access the file!)
this probably doesn't matter. And even if it did, the right
place to micro-optimize is inside fill_sha1_file(), by
calling a single strbuf_grow() there.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We pre-size the scratch buffer to hold a loose object
filename of the form "xx/yyyy...", which leads to allocation
code that is hard to verify. We have to use some magic
numbers during the initial allocation, and then writers must
blindly assume that the buffer is big enough. Using a strbuf
makes it more clear that we cannot overflow.
Unfortunately, we do still need some magic numbers to grow
our strbuf before calling fill_sha1_path(), but the strbuf
growth is much closer to the point of use. This makes it
easier to see that it's correct, and opens the possibility
of pushing it even further down if fill_sha1_path() learns
to work on strbufs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function forms a sha1 as "xx/yyyy...", but skips over
the slot for the slash rather than writing it, leaving it to
the caller to do so. It also does not bother to put in a
trailing NUL, even though every caller would want it (we're
forming a path which by definition is not a directory, so
the only thing to do with it is feed it to a system call).
Let's make the lives of our callers easier by just writing
out the internal "/" and the NUL.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The alternate_object_database struct uses a single buffer
both for storing the path to the alternate, and as a scratch
buffer for forming object names. This is efficient (since
otherwise we'd end up storing the path twice), but it makes
life hard for callers who just want to know the path to the
alternate. They have to remember to stop reading after
"alt->name - alt->base" bytes, and to subtract one for the
trailing '/'.
It would be much simpler if they could simply access a
NUL-terminated path string. We could encapsulate this in a
function which puts a NUL in the scratch buffer and returns
the string, but that opens up questions about the lifetime
of the result. The first time another caller uses the
alternate, the scratch buffer may get other data tacked onto
it.
Let's instead just store the root path separately from the
scratch buffer. There aren't enough alternates being stored
for the duplicated data to matter for performance, and this
keeps things simple and safe for the callers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The alternate_object_database struct holds a path to the
alternate objects, but we also use that buffer as scratch
space for forming loose object filenames. Let's pull that
logic into a helper function so that we can more easily
modify it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allocating a struct alternate_object_database is tricky, as
we must over-allocate the buffer to provide scratch space,
and then put in particular '/' and NUL markers.
Let's encapsulate this in a function so that the complexity
doesn't leak into callers (and so that we can modify it
later).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The submodule code wants to temporarily add an alternate
object store to our in-memory alt_odb list, but does it
manually. Let's provide a helper so it can reuse the code in
link_alt_odb_entry().
While we're adding our new add_to_alternates_memory(), let's
document add_to_alternates_file(), as the two are related.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The string handling in link_alt_odb_entry() is mostly an
artifact of the original version, which took the path as a
ptr/len combo, and did not have a NUL-terminated string
until we created one in the alternate_object_database
struct. But since 5bdf0a8 (sha1_file: normalize alt_odb
path before comparing and storing, 2011-09-07), the first
thing we do is put the path into a strbuf, which gives us
some easy opportunities for cleanup.
In particular:
- we call strlen(pathbuf.buf), which is silly; we can look
at pathbuf.len.
- even though we have a strbuf, we don't maintain its
"len" field when chomping extra slashes from the
end, and instead keep a separate "pfxlen" variable. We
can fix this and then drop "pfxlen" entirely.
- we don't check whether the path is usable until after we
allocate the new struct, making extra cleanup work for
ourselves. Since we have a NUL-terminated string, we can
bump the "is it usable" checks higher in the function.
While we're at it, we can move that logic to its own
helper, which makes the flow of link_alt_odb_entry()
easier to follow.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we add a new alternate to the list, we try to normalize
out any redundant "..", etc. However, we do not look at the
return value of normalize_path_copy(), and will happily
continue with a path that could not be normalized. Worse,
the normalizing process is done in-place, so we are left
with whatever half-finished working state the normalizing
function was in.
Fortunately, this cannot cause us to read past the end of
our buffer, as that working state will always leave the
NUL from the original path in place. And we do tend to
notice problems when we check is_directory() on the path.
But you can see the nonsense that we feed to is_directory
with an entry like:
this/../../is/../../way/../../too/../../deep/../../to/../../resolve
in your objects/info/alternates, which yields:
error: object directory
/to/e/deep/too/way//ects/this/../../is/../../way/../../too/../../deep/../../to/../../resolve
does not exist; check .git/objects/info/alternates.
We can easily fix this just by checking the return value.
But that makes it hard to generate a good error message,
since we're normalizing in-place and our input value has
been overwritten by cruft.
Instead, let's provide a strbuf helper that does an in-place
normalize, but restores the original contents on error. This
uses a second buffer under the hood, which is slightly less
efficient, but this is not a performance-critical code path.
The strbuf helper can also properly set the "len" parameter
of the strbuf before returning. Just doing:
normalize_path_copy(buf.buf, buf.buf);
will shorten the string, but leave buf.len at the original
length. That may be confusing to later code which uses the
strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests are just trying to show that we allow recursion
up to a certain depth, but not past it. But the counting is
a bit non-intuitive, and rather than test at the edge of the
breakage, we test "OK" cases in the middle of the chain.
Let's explain what's going on, and explicitly test the
switch between "OK" and "too deep".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is similar to the previous patch, though no user reported a bug and
I could not find a regressive behavior.
However it is a good thing to be strict on the output and for that we
always omit a trailing slash.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before 63e95beb0 (2016-04-15, submodule: port resolve_relative_url from
shell to C), it did not matter if the superprojects URL had a trailing
slash or not. It was just chopped off as one of the first steps
(The "remoteurl=${remoteurl%/}" near the beginning of
resolve_relative_url(), which was removed in said commit).
When porting this to the C version, an off-by-one error was introduced
and we did not check the actual last character to be a slash, but the
NULL delimiter.
Reintroduce the behavior from before 63e95beb0, to ignore the trailing
slash.
Reported-by: <venv21@gmail.com>
Helped-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pathspecs can be a bit tricky when trying to apply them to submodules.
The main challenge is that the pathspecs will be with respect to the
superproject and not with respect to paths in the submodule. The
approach this patch takes is to pass in the identical pathspec from the
superproject to the submodule in addition to the submodule-prefix, which
is the path from the root of the superproject to the submodule, and then
we can compare an entry in the submodule prepended with the
submodule-prefix to the pathspec in order to determine if there is a
match.
This patch also permits the pathspec logic to perform a prefix match against
submodules since a pathspec could refer to a file inside of a submodule.
Due to limitations in the wildmatch logic, a prefix match is only done
literally. If any wildcard character is encountered we'll simply punt
and produce a false positive match. More accurate matching will be done
once inside the submodule. This is due to the superproject not knowing
what files could exist in the submodule.
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass through some known-safe options when recursing into submodules.
(--cached, -v, -t, -z, --debug, --eol)
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow ls-files to recognize submodules in order to retrieve a list of
files from a repository's submodules. This is done by forking off a
process to recursively call ls-files on all submodules. Use top-level
--super-prefix option to pass a path to the submodule which it can
use to prepend to output or pathspec matching logic.
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a super-prefix environment variable 'GIT_INTERNAL_SUPER_PREFIX'
which can be used to specify a path from above a repository down to its
root. When such a super-prefix is specified, the paths reported by Git
are prefixed with it to make them relative to that directory "above".
The paths given by the user on the command line
(e.g. "git subcmd --output-file=path/to/a/file" and pathspecs) are taken
relative to the directory "above" to match.
The immediate use of this option is by commands which have a
--recurse-submodule option in order to give context to submodules about
how they were invoked. This option is currently only allowed for
builtins which support a super-prefix.
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous code still followed the old git-pull.sh code which did not
adhere to our new convention.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes we are *actually* interested in those changes... For
example when an interactive rebase wants to continue with a staged
submodule update.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function used by "git pull" to stop the user when the working
tree has changes is useful in other places.
Let's move it into a more prominent (and into an actually reusable)
spot: wt-status.[ch].
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When converting the pull command to a builtin, the
require_clean_work_tree() function was renamed and the pull-specific
parts hard-coded.
This makes it impossible to reuse the code, so let's modify the code to
make it more similar to the original shell script again.
Note: when the hint "Please commit or stash them" was introduced first,
Git did not have the convention of continuing error messages in lower
case, but now we do have that convention, therefore we reintroduce this
hint down-cased, obeying said convention.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cmd_pull(), when verifying that there are no changes preventing a
rebasing pull, we diligently pass the prefix parameter to the
die_on_unclean_work_tree() function which in turn diligently passes it
to the has_unstaged_changes() and has_uncommitted_changes() functions.
The casual reader might now be curious (as this developer was) whether
that means that calling `git pull --rebase` in a subdirectory will
ignore unstaged changes in other parts of the working directory. And be
puzzled that `git pull --rebase` (correctly) complains about those
changes outside of the current directory.
The puzzle is easily resolved: while we take pains to pass around the
prefix and even pass it to init_revisions(), the fact that no paths are
passed to init_revisions() ensures that the prefix is simply ignored.
That, combined with the fact that we will *always* want a *full* working
directory check before running a rebasing pull, is reason enough to
simply do away with the actual prefix parameter and to pass NULL
instead, as if we were running this from the top-level working directory
anyway.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code that parses the format parameter of for-each-ref command
has seen a micro-optimization.
* sg/ref-filter-parse-optim:
ref-filter: strip format option after a field name only once while parsing
Code clean-up with help from coccinelle tool continues.
* rs/cocci:
coccicheck: make transformation for strbuf_addf(sb, "...") more precise
use strbuf_add_unique_abbrev() for adding short hashes, part 2
use strbuf_addstr() instead of strbuf_addf() with "%s", part 2
gitignore: ignore output files of coccicheck make target
When "%C(auto)" appears at the very beginning of the pretty format
string, it did not need to issue the reset sequence, but it did.
* rs/c-auto-resets-attributes:
pretty: avoid adding reset for %C(auto) if output is empty
In recent versions of cURL, GSSAPI credential delegation is
disabled by default due to CVE-2011-2192; introduce a configuration
to selectively allow enabling this.
* ps/http-gssapi-cred-delegation:
http: control GSSAPI credential delegation
The "graph" API used in "git log --graph" miscounted the number of
output columns consumed so far when drawing a padding line, which
has been fixed; this did not affect any existing code as nobody
tried to write anything after the padding on such a line, though.
* jk/graph-padding-fix:
graph: fix extra spaces in graph_padding_line
"git log rev^..rev" is an often-used revision range specification
to show what was done on a side branch merged at rev. This has
gained a short-hand "rev^-1". In general "rev^-$n" is the same as
"^rev^$n rev", i.e. what has happened on other branches while the
history leading to nth parent was looking the other way.
* vn/revision-shorthand-for-side-branch-log:
revision: new rev^-n shorthand for rev^n..rev
Almost everybody uses DEFAULT_ABBREV to refer to the default
setting for the abbreviation, but "git blame" peeked into
underlying variable bypassing the macro for no good reason.
* jc/blame-abbrev:
blame: use DEFAULT_ABBREV macro
When given an abbreviated object name that is not (or more
realistically, "no longer") unique, we gave a fatal error
"ambiguous argument". This error is now accompanied by hints that
lists the objects that begins with the given prefix. During the
course of development of this new feature, numerous minor bugs were
uncovered and corrected, the most notable one of which is that we
gave "short SHA1 xxxx is ambiguous." twice without good reason.
* jk/ambiguous-short-object-names:
get_short_sha1: make default disambiguation configurable
get_short_sha1: list ambiguous objects on error
for_each_abbrev: drop duplicate objects
sha1_array: let callbacks interrupt iteration
get_short_sha1: mark ambiguity error for translation
get_short_sha1: NUL-terminate hex prefix
get_short_sha1: refactor init of disambiguation code
get_short_sha1: parse tags when looking for treeish
get_sha1: propagate flags to child functions
get_sha1: avoid repeating ourselves via ONLY_TO_DIE
get_sha1: detect buggy calls with multiple disambiguators
Commit 7e71adc77f fixes a problem with git-gui failing to pick up the
original author identity during a commit --amend operation. However, the
new author details then become persistent for the remainder of the session.
This commit fixes this by ensuring the environment variables are reset
and the author information reset once the commit is completed.
The relevant changes were reworked to reduce global variables.
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Assumes file names in git tree objects are UTF-8 encoded.
On most unix systems, the system encoding (and thus the TCL system
encoding) will be UTF-8, so file names will be displayed correctly.
On Windows, it is impossible to set the system encoding to UTF-8. Changing
the TCL system encoding (via 'encoding system ...', e.g. in the startup
code) is explicitly discouraged by the TCL docs.
Change git-gui functions dealing with file names to always convert
from and to UTF-8.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
With the preparatory steps, it has become trivial to teach the
system a new diff.wsErrorHighlight configuration that gives the
default value for --ws-error-highlight command line option.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These need to be usable from git_diff_ui_config() code to help
parsing a configuration variable, so move them up.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the function to parse_ws_error_highlight_opt(), because it is
meant to parse a command line option, and then refactor the meat of
the function into a helper function that reports the parsed result
which is typically a small unsigned int (these are OR'ed bitmask
after all), or a negative offset that indicates where in the input
string a parse error happened.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We'd want to run this same set of test twice, once with the option
and another time with an equivalent configuration setting. Split
out the step that prepares the test data and expected output and
move the test for the command line option into a separate test.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b5f325c updated to use the newer merge syntax but continue to
support older versions of git.
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
The get_short_sha1() is only about reading short sha1s; we
do call it in a loop to check "is this long enough" for each
object, but otherwise it should not need to know about
things like our default_abbrev setting.
So instead of asking it to set default_automatic_abbrev as a
side-effect, let's just have find_unique_abbrev() pick the
right place to start its loop. This requires a separate
approximate_object_count() function, but that naturally
belongs with the rest of sha1_file.c.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix wrong use of append command in strings marked for translation.
According to Tcl/Tk Documentation [1],
append varName ?value value value ...?
appends all value arguments to the current value of variable varName.
This means that
append "[appname] ([reponame]): " [mc "File Viewer"]
is setting a variable named "[appname] ([reponame]): " to the output of
[mc "File Viewer"], rather than returning the concatenation of both
expressions as one might expect.
The format for some strings enables, for instance, a French translator
to translate like "%s (%s) : Create Branch" (space before colon).
Conversely, strings already translated will be marked as fuzzy and the
translator must update them herself.
For some cases, use alternative way for concatenation instead of using
strcat procedure defined in git-gui.sh.
Reference: 31bb1d1 ("git-gui: Paper bag fix missing translated strings",
2007-09-14) fixes the same issue slightly differently.
[1] http://www.tcl.tk/man/tcl/TclCmd/append.htm
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Internationalize use of colon punctuation ':' in options window, windows
titles, database statistics window. Some languages might use a different
style, for instance French uses "User Name :" (space before colon).
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
It is very confusing that the file which diff is displayed is marked as
selected, but it is not in fact selected (that means the array of selected
files does not include the file in question).
Fixing this also improves the use of $FILENAMES in custom defined tools: one
does not have to click the file in the list to make it selected.
Signed-off-by: Alex Riesen <alexander.riesen@cetitec.com>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
This adds a FILENAMES environment variable, which contains the repository
pathnames of all selected files the list.
The variable contains the names separated by LF (\n, \x0a).
If the file names contain LF characters, the tool command might be unable to
unambiguously split the value of $FILENAME into the separate names.
Note that the file marked and diffed immediately after starting the GUI up,
is not actually selected. One must click on it once to really select it.
Signed-off-by: Alex Riesen <alexander.riesen@cetitec.com>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
This fix refers https://github.com/git-for-windows/git/issues/664
After `git merge --squash` git creates .git/SQUASH_MSG (UTF-8 encoded)
which contains squashed commits. When run `git gui` it copies SQUASH_MSG
to PREPARE_COMMIT_MSG, but without honoring UTF-8. This leads to encoding
problems on `git gui` commit prompt.
The same applies on git cherry-pick conflict, where MERGE_MSG is created
and then is copied to PREPARE_COMMIT_MSG.
In both cases PREPARE_COMMIT_MSG must be configured to store data in UTF-8.
Signed-off-by: yaras <yaras6@gmail.com>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.
The backquoted form is the traditional method for command
substitution, and is supported by POSIX. However, all but the
simplest uses become complicated quickly. In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.
The patch was generated by:
for _f in $(find . -name "*.sh")
do
perl -i -pe 'BEGIN{undef $/;} s/`(.+?)`/\$(\1)/smg' "${_f}"
done
and then carefully proof-read.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Our usual style when working with subdirectories is to chdir
inside a subshell or to use "git -C", which means we do not
have to constantly return to the main test directory. Let's
convert this old test, which does not follow that style.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our normal test style these days puts the opening quote of
the body on the description line, and indents the body with
a single tab. This ancient test did not follow this.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Besides being our normal style, this correctly checks for an
error exit() versus signal death.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function makes sure that "git fsck" does not report any
errors. But "--full" has been the default since f29cd39
(fsck: default to "git fsck --full", 2009-10-20), and we can
use the exit code (instead of counting the lines) since
e2b4f63 (fsck: exit with non-zero status upon errors,
2007-03-05).
So we can just use "git fsck", which is shorter and more
flexible (e.g., we can use "git -C").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function was never used since its inception in dd05ea1
(test case for transitive info/alternates, 2006-05-07).
Which is just as well, since it mutates the repo state in a
way that would invalidate further tests, without cleaning up
after itself. Let's get rid of it so that nobody is tempted
to use it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codepath in "git fsck" to detect malformed tree objects has
been updated not to die but keep going after detecting them.
* dt/tree-fsck:
fsck: handle bad trees like other errors
tree-walk: be more specific about corrupt tree errors
An author name, that spelled a backslash-quoted double quote in the
human readable part "My \"double quoted\" name", was not unquoted
correctly while applying a patch from a piece of e-mail.
* kd/mailinfo-quoted-string:
mailinfo: unescape quoted-pair in header fields
t5100-mailinfo: replace common path prefix with variable
The original command line syntax for "git merge", which was "git
merge <msg> HEAD <parent>...", has been deprecated for quite some
time, and "git gui" was the last in-tree user of the syntax. This
is finally fixed, so that we can move forward with the deprecation.
* rs/git-gui-use-modern-git-merge-syntax:
git-gui: stop using deprecated merge syntax
Codepaths that read from an on-disk loose object were too loose in
validating what they are reading is a proper object file and
sometimes read past the data they read from the disk, which has
been corrected. H/t to Gustavo Grieco for reporting.
* jc/verify-loose-object-header:
unpack_sha1_header(): detect malformed object header
streaming: make sure to notice corrupt object
"git init" tried to record core.worktree in the repository's
'config' file when GIT_WORK_TREE environment variable was set and
it was different from where GIT_DIR appears as ".git" at its top,
but the logic was faulty when .git is a "gitdir:" file that points
at the real place, causing trouble in working trees that are
managed by "git worktree". This has been corrected.
* nd/init-core-worktree-in-multi-worktree-world:
init: kill git_link variable
init: do not set unnecessary core.worktree
init: kill set_git_dir_init()
init: call set_git_dir_init() from within init_db()
init: correct re-initialization from a linked worktree
"gitweb" can spawn "highlight" to show blob contents with
(programming) language-specific syntax highlighting, but only
when the language is known. "highlight" can however be told
to make the guess itself by giving it "--force" option, which
has been enabled.
* ik/gitweb-force-highlight:
gitweb: use highlight's shebang detection
gitweb: remove unused guess_file_syntax() parameter
In fairly early days we somehow decided to abbreviate object names
down to 7-hexdigits, but as projects grow, it is becoming more and
more likely to see such a short object names made in earlier days
and recorded in the log messages no longer unique.
Currently the Linux kernel project needs 11 to 12 hexdigits, while
Git itself needs 10 hexdigits to uniquely identify the objects they
have, while many smaller projects may still be fine with the
original 7-hexdigit default. One-size does not fit all projects.
Introduce a mechanism, where we estimate the number of objects in
the repository upon the first request to abbreviate an object name
with the default setting and come up with a sane default for the
repository. Based on the expectation that we would see collision in
a repository with 2^(2N) objects when using object names shortened
to first N bits, use sufficient number of hexdigits to cover the
number of objects in the repository. Each hexdigit (4-bits) we add
to the shortened name allows us to have four times (2-bits) as many
objects in the repository.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code that sets custom abbreviation length, in response to
command line argument, often does something like this:
if (skip_prefix(arg, "--abbrev=", &arg))
abbrev = atoi(arg);
else if (!strcmp("--abbrev", &arg))
abbrev = DEFAULT_ABBREV;
/* make the value sane */
if (abbrev < 0 || 40 < abbrev)
abbrev = ... some sane value ...
However, it is pointless to sanity-check and tweak the value
obtained from DEFAULT_ABBREV. We are going to allow it to be
initially set to -1 to signal that the default abbreviation length
must be auto sized upon the first request to abbreviate, based on
the number of objects in the repository, and when that happens,
rejecting or tweaking a negative value to a "saner" one will
negatively interfere with the auto sizing. The codepaths for
git rev-parse --short <object>
git diff --raw --abbrev
do exactly that; allow them to pass possibly negative abbrevs
intact, that will come from DEFAULT_ABBREV in the future.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We'll be introducing a new way to decide the default abbreviation
length by initialising DEFAULT_ABBREV to -1 to signal the first call
to "find unique abbreviation" codepath to compute a reasonable value
based on the number of objects we have to avoid collisions.
We have long relied on DEFAULT_ABBREV being a positive concrete
value that is used as the abbreviation length when no extra
configuration or command line option has overridden it. Some
codepaths wants to use such a positive concrete default value
even before making their first request to actually trigger the
computation for the auto sized default.
Introduce FALLBACK_DEFAULT_ABBREV and use it to the code that
attempts to align the report from "git fetch". For now, this
macro is also used to initialize the default_abbrev variable,
but the auto-sizing code will use -1 and then use the value of
FALLBACK_DEFAULT_ABBREV as the starting point of auto-sizing.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shorten the code by using QSORT instead of calling qsort(3) directly,
as the former determines the element size automatically and checks if
there are at least two elements to sort already.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling `Repository>Create Desktop Shortcut`, Git GUI assumes
that it is okay to call `wish.exe` directly on Windows. However, in
Git for Windows 2.x' context, that leaves several crucial environment
variables uninitialized, resulting in a shortcut that does not work.
To fix those environment variable woes, Git for Windows comes with a
convenient `git-gui.exe`, so let's just use it when it is available.
This fixes https://github.com/git-for-windows/git/issues/448
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
MSys2 might *look* like Cygwin, but it is *not* Cygwin... Unless it
is run with `MSYSTEM=MSYS`, that is.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Tab order follows widget creation order (and Z-order) so amend this to
match the layout more logically.
For keyboard selection a highlight around the selected text widget is
useful. Customized on Windows themed Tk to follow the native theme more
closely with a custom EntryFrame style.
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Keyboard focus was restricted to the commit message widget and users were
forced to use the mouse to select files in the workdir widget and only then
could use a key combination to stage the file.
It is now possible to use key navigation (Ctrl-Tab, arrow keys and Ctrl-T
or Ctrl-U) to stage and unstage files.
Suggested by @koppor in git-for-window/git issue #859
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Convert two more qsort(3) calls to QSORT to reduce code size and for
better safety and consistency.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a make variable, SPATCH_FLAGS, for specifying flags for spatch, and
set it to --all-includes by default. This option lets it consider
header files which would otherwise be ignored. That's important for
some rules that rely on type information. It doubles the duration of
coccicheck, however.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Delegation of credentials is disabled by default in libcurl since
version 7.21.7 due to security vulnerability CVE-2011-2192. Which
makes troubles with GSS/kerberos authentication when delegation
of credentials is required. This can be changed with option
CURLOPT_GSSAPI_DELEGATION in libcurl with set expected parameter
since libcurl version 7.22.0.
This patch provides new configuration variable http.delegation
which corresponds to curl parameter "--delegation" (see man 1 curl).
The following values are supported:
* none (default).
* policy
* always
Signed-off-by: Petr Stodulka <pstodulk@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree", even though it used the default_abbrev setting that
ought to be affected by core.abbrev configuration variable, ignored
the variable setting. The command has been taught to read the
default set of configuration variables to correct this.
* jc/worktree-config:
worktree: honor configuration variables
In the codepath that comes up with the hostname to be used in an
e-mail when the user didn't tell us, we looked at ai_canonname
field in struct addrinfo without making sure it is not NULL first.
* jk/ident-ai-canonname-could-be-null:
ident: handle NULL ai_canonname
When "git fetch" tries to find where the history of the repository
it runs in has diverged from what the other side has, it has a
mechanism to avoid digging too deep into irrelevant side branches.
This however did not work well over the "smart-http" transport due
to a design bug, which has been fixed.
* jt/fetch-pack-in-vain-count-with-stateless:
fetch-pack: do not reset in_vain on non-novel acks
A low-level function verify_packfile() was meant to show errors
that were detected without dying itself, but under some conditions
it didn't and died instead, which has been fixed.
* jk/verify-packfile-gently:
verify_packfile: check pack validity before accessing data
When "git format-patch --stdout" output is placed as an in-body
header and it uses the RFC2822 header folding, "git am" failed to
put the header line back into a single logical line. The
underlying "git mailinfo" was taught to handle this properly.
* jt/mailinfo-fold-in-body-headers:
mailinfo: handle in-body header continuations
mailinfo: make is_scissors_line take plain char *
mailinfo: separate in-body header processing
Add a semantic patch for removing checks similar to the one that QSORT
already does internally and apply it to the code base.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the semantic patch contrib/coccinelle/qsort.cocci to the code
base, replacing calls of qsort(3) with QSORT. The resulting code is
shorter and supports empty arrays with NULL pointers.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
size of the array elements and supports the convention of initializing
empty arrays with a NULL pointer, which we use in some places.
Calling qsort(3) directly with a NULL pointer is undefined -- even with
an element count of zero -- and allows the compiler to optimize away any
following NULL checks. Using the macro avoids such surprises.
Add a semantic patch as well to demonstrate the macro's usage and to
automate the transformation of trivial cases.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying when fsck hits a malformed tree object, log the error
like any other and continue. Now fsck can tell the user which tree is
bad, too.
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the tree-walker runs into an error, it just calls
die(), and the message is always "corrupt tree file".
However, we are actually covering several cases here; let's
give the user a hint about what happened.
Let's also avoid using the word "corrupt", which makes it
seem like the data bit-rotted on disk. Our sha1 check would
already have found that. These errors are ones of data that
is malformed in the first place.
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log rev^..rev" is commonly used to show all work done on and merged
from a side branch. This patch introduces a shorthand "rev^-" for this
and additionally allows "rev^-$n" to mean "reachable from rev, excluding
what is reachable from the nth parent of rev". For example, for a
two-parent merge, you can use rev^-2 to get the set of commits which were
made to the main branch while the topic branch was prepared.
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we find ambiguous short sha1s, we may get a
disambiguation rule from our caller's context. But if we
don't, we fall back to treating all sha1s the same, even
though most projects will tend to refer only to commits by
their short sha1s.
This patch introduces a configuration option that lets the
user pick a different fallback (e.g., only commits). It's
possible that we may want to make this the default, but it's
a good idea to start as a config option for two reasons:
1. It lets people experiment with this and see if it's a
good idea (i.e., the "tend to" above is an assumption;
we don't really know if this will break some obscure
cases).
2. Even if we do flip the default, it gives people an
escape hatch if it causes problems (you can sometimes
override it by asking for "1234^{tree}", but not all
combinations are possible).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit e8adf23 (xdl_change_compact(): introduce the concept
of a change group, 2016-08-22) added a "struct group" type
to xdiff/xdiffi.c. But the POSIX system header "grp.h"
already defines "struct group" (it is part of the getgrnam
interface). This happens to work because the new type is
local to xdiffi.c, and the xdiff code includes a relatively
small set of system headers. But it will break compilation
if xdiff ever switches to using git-compat-util.h. It can
also probably cause confusion with tools that look at the
whole code base, like coccinelle or ctags.
Let's resolve by giving the xdiff variant a scoped name,
which is closer to other xdiff types anyway (e.g.,
xdlfile_t, though note that xdiff is fond if typedefs when
Git usually is not).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though latin-1 is still seen in e-mail headers, some platforms
only install ISO-8859-1. "iconv -f ISO-8859-1" succeeds, while
"iconv -f latin-1" fails on such a system.
Using the same fallback_encoding() mechanism factored out in the
previous step, teach ourselves that "ISO-8859-1" has a better chance
of being accepted than "latin-1".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codepath we use to call iconv_open() has a provision to use a
fallback encoding when it fails, hoping that "UTF-8" being spelled
differently could be the reason why the library function did not
like the encoding names we gave it. Essentially, we turn what we
have observed to be used as variants of "UTF-8" (e.g. "utf8") into
the most official spelling and use that as a fallback.
We do the same thing for input and output encoding. Introduce a
helper function to do just one side and call that twice.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clone --recurse-submodules" lost the progress eye-candy in
recent update, which has been corrected.
* jk/clone-recursive-progress:
clone: pass --progress decision to recursive submodules
Documentation around tools to import from CVS was fairly outdated.
* jk/doc-cvs-update:
docs/cvs-migration: mention cvsimport caveats
docs/cvs-migration: update link to cvsps homepage
docs/cvsimport: prefer cvs-fast-export to parsecvs
When "git rebase -i" is given a broken instruction, it told the
user to fix it with "--edit-todo", but didn't say what the step
after that was (i.e. "--continue").
* rt/rebase-i-broken-insn-advise:
rebase -i: improve advice on bad instruction lines
The procedure to build Git on Mac OS X for Travis CI hardcoded the
internal directory structure we assumed HomeBrew uses, which was a
no-no. The procedure has been updated to ask HomeBrew things we
need to know to fix this.
* ls/travis-homebrew-path-fix:
travis-ci: ask homebrew for its path instead of hardcoding it
"git add --chmod=+x <pathspec>" added recently only toggled the
executable bit for paths that are either new or modified. This has
been corrected to flip the executable bit for all paths that match
the given pathspec.
* tg/add-chmod+x-fix:
t3700-add: do not check working tree file mode without POSIXPERM
t3700-add: create subdirectory gently
add: modify already added files when --chmod is given
read-cache: introduce chmod_index_entry
update-index: add test for chmod flags
Some codepaths in "git diff" used regexec(3) on a buffer that was
mmap(2)ed, which may not have a terminating NUL, leading to a read
beyond the end of the mapped region. This was fixed by introducing
a regexec_buf() helper that takes a <ptr,len> pair with REG_STARTEND
extension.
* js/regexec-buf:
regex: use regexec_buf()
regex: add regexec_buf() that can work on a non NUL-terminated string
regex: -G<pattern> feeds a non NUL-terminated string to regexec() and fails
"git checkout <word>" does not follow the usual disambiguation
rules when the <word> can be both a rev and a path, to allow
checking out a branch 'foo' in a project that happens to have a
file 'foo' in the working tree without having to disambiguate.
This was poorly documented and the check was incorrect when the
command was run from a subdirectory.
* nd/checkout-disambiguation:
checkout: fix ambiguity check in subdir
checkout.txt: document a common case that ignores ambiguation rules
checkout: add some spaces between code and comment
Even more i18n.
* va/i18n-more:
i18n: stash: mark messages for translation
i18n: notes-merge: mark die messages for translation
i18n: ident: mark hint for translation
i18n: i18n: diff: mark die messages for translation
i18n: connect: mark die messages for translation
i18n: commit: mark message for translation
In some projects, it is common to use "[RFC PATCH]" as the subject
prefix for a patch meant for discussion rather than application. A
new option "--rfc" was a short-hand for "--subject-prefix=RFC PATCH"
to help the participants of such projects.
* jt/format-patch-rfc:
format-patch: add "--rfc" for the common case of [RFC PATCH]
A shell script example in check-ref-format documentation has been
fixed.
* ep/doc-check-ref-format-example:
git-check-ref-format.txt: fixup documentation
Output from "git diff" can be made easier to read by selecting
which lines are common and which lines are added/deleted
intelligently when the lines before and after the changed section
are the same. A command line option is added to help with the
experiment to find a good heuristics.
* mh/diff-indent-heuristic:
blame: honor the diff heuristic options and config
parse-options: add parse_opt_unknown_cb()
diff: improve positioning of add/delete blocks in diffs
xdl_change_compact(): introduce the concept of a change group
recs_match(): take two xrecord_t pointers as arguments
is_blank_line(): take a single xrecord_t as argument
xdl_change_compact(): only use heuristic if group can't be matched
xdl_change_compact(): fix compaction heuristic to adjust ixo
The pretty-format specifier "%C(auto)" used by the "log" family of
commands to enable coloring of the output is taught to also issue a
color-reset sequence to the output.
* rs/c-auto-resets-attributes:
pretty: let %C(auto) reset all attributes
Documentation for individual configuration variables to control use
of color (like `color.grep`) said that their default value is
'false', instead of saying their default is taken from `color.ui`.
When we updated the default value for color.ui from 'false' to
'auto' quite a while ago, all of them broke. This has been
corrected.
* mm/config-color-ui-default-to-auto:
Documentation/config: default for color.* is color.ui
Code cleanup.
* rs/cocci:
use strbuf_addstr() for adding constant strings to a strbuf, part 2
add coccicheck make target
contrib/coccinelle: fix semantic patch for oid_to_hex_r()
When the user gives us an ambiguous short sha1, we print an
error and refuse to resolve it. In some cases, the next step
is for them to feed us more characters (e.g., if they were
retyping or cut-and-pasting from a full sha1). But in other
cases, that might be all they have. For example, an old
commit message may have used a 7-character hex that was
unique at the time, but is now ambiguous. Git doesn't
provide any information about the ambiguous objects it
found, so it's hard for the user to find out which one they
probably meant.
This patch teaches get_short_sha1() to list the sha1s of the
objects it found, along with a few bits of information that
may help the user decide which one they meant. Here's what
it looks like on git.git:
$ git rev-parse b2e1
error: short SHA1 b2e1 is ambiguous
hint: The candidates are:
hint: b2e1196 tag v2.8.0-rc1
hint: b2e11d1 tree
hint: b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
hint: b2e1759 blob
hint: b2e18954 blob
hint: b2e1895c blob
fatal: ambiguous argument 'b2e1': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
We show the tagname for tags, and the date and subject for
commits. For trees and blobs, in theory we could dig in the
history to find the paths at which they were present. But
that's very expensive (on the order of 30s for the kernel),
and it's not likely to be all that helpful. Most short
references are to commits, so the useful information is
typically going to be that the object in question _isn't_ a
commit. So it's silly to spend a lot of CPU preemptively
digging up the path; the user can do it themselves if they
really need to.
And of course it's somewhat ironic that we abbreviate the
sha1s in the disambiguation hint. But full sha1s would cause
annoying line wrapping for the commit lines, and presumably
the user is going to just re-issue their command immediately
with the corrected sha1.
We also restrict the list to those that match any
disambiguation hint. E.g.:
$ git rev-parse b2e1:foo
error: short SHA1 b2e1 is ambiguous
hint: The candidates are:
hint: b2e1196 tag v2.8.0-rc1
hint: b2e11d1 tree
hint: b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
fatal: Invalid object name 'b2e1'.
does not bother reporting the blobs, because they cannot
work as a treeish.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If an object appears multiple times in the object database
(e.g., in both loose and packed form, or in two separate
packs), the disambiguation machinery may see it more than
once. The get_short_sha1() function handles this already,
but for_each_abbrev() blindly fires the callback for each
instance it finds.
We can fix this by collecting the output in a sha1 array and
de-duplicating it. As a bonus, the sort done for the
de-duplication means that our output will be stable,
regardless of the order in which the objects are found.
Note that the old code normalized the callback's output to
0/1 to store in the 1-bit ds->ambiguous flag (which both
halted the iteration and was returned from the
for_each_abbrev function). Now that we are using sha1_array,
we can return the real value. In practice, it doesn't matter
as the sole caller only ever returns 0.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callbacks for iterating a sha1_array must have a void
return. This is unlike our usual for_each semantics, where
a callback may interrupt iteration and have its value
propagated. Let's switch it to the usual form, which will
enable its use in more places (e.g., where we are replacing
an existing iteration with a different data structure).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a human-readable message, and there's no reason it
should not be translated. While we're at it, let's drop the
period from the end, which is not our usual style.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We store the hex prefix in a 40-byte buffer with the prefix
itself followed by 40-minus-len "x" characters. These x's
serve no purpose, and the lack of NUL termination makes the
prefix string annoying to use. Let's just terminate it.
Note that this is in contrast to the binary prefix, which
_must_ be zero-padded, because we look at the whole thing
during a binary search to find the first potential match in
each pack index. The loose-object hex search cannot use the
same trick because it has to do a linear walk through the
unsorted results of readdir() (and even if it could, you'd
want zeroes instead of x's).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The disambiguation machinery has two callers: get_short_sha1
and for_each_abbrev. Both need to repeat much of the same
setup: declaring buffers, sanity-checking lengths, preparing
the prefixes, etc. Let's pull that into a single init
function so we can avoid repeating ourselves.
Pulling the buffers into the "struct disambiguate_state"
isn't strictly necessary, but it does make things simpler
for the callers, who no longer have to worry about sizing
them correctly (i.e., it's an implicit requirement that
the caller provide 20- and 40-byte buffers).
And while we're touching this code, we can convert any
magic-number sizes to the more modern GIT_SHA1_* constants.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The treeish disambiguation function tries to peel tags, but
it does so by calling:
deref_tag(lookup_object(sha1), ...);
This will only work if we have previously looked at the tag
and created a "struct tag" for it. Since parsing revision
arguments typically happens before anything else, this is
usually not the case, and we would fail to peel the tag (we
are lucky that deref_tag() gracefully handles the NULL and
does not segfault).
Instead, we can use parse_object(). Note that this is the
same fix done by 94d75d1 (get_short_sha1(): correctly
disambiguate type-limited abbreviation, 2013-07-01), but
that commit fixed only the committish disambiguator, and
left the bug in the treeish one.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_sha1() function is actually implementation by many
sub-functions, but we do not always pass our flags around to
all of those functions. As a result, we may forget that our
caller asked us to resolve with GET_SHA1_QUIETLY and output
messages. The two triggerable cases are:
1. Resolving treeish:path will resolve the "treeish"
portion using GET_SHA1_TREEISH, dropping all other
flags.
2. The peel_onion() function did not take flags at all
but recurses to get_sha1_1(), which does.
The solution for both is to bitwise-OR their new flags with
the existing ones (after dropping any mutually exclusive
disambiguation flags).
This bug can trigger with "git rev-parse --quiet", which
asks for quiet resolution. But it can also happen in a more
vanilla code path when we do a follow-up ONLY_TO_DIE
invocation of get_sha1(), and that's what the tests check.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the revision code cannot parse an argument like
"HEAD:foo", it will call maybe_die_on_misspelt_object_name(),
which re-runs get_sha1() with an extra ONLY_TO_DIE flag. We
then spend more effort to generate a better error message.
Unfortunately, a side effect is that our second call may
repeat the same error messages from the original get_sha1()
call. You can see this with:
$ git show 0017
error: short SHA1 0017 is ambiguous.
error: short SHA1 0017 is ambiguous.
fatal: ambiguous argument '0017': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
where the second "error:" line comes from the ONLY_TO_DIE
call.
To fix this, we can make ONLY_TO_DIE imply QUIETLY. This is
a little odd, because the whole point of ONLY_TO_DIE is to
output error messages. But what we want to do is tell the
rest of the get_sha1() code (particularly get_sha1_1()) that
the _regular_ messages should be quiet, but the only-to-die
ones should not.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_sha1() family of functions takes a flags field, but
some of the flags are mutually exclusive. In particular, we
can only handle one disambiguating function, and the flags
quietly override each other. Let's instead detect these as
programming bugs.
Technically some of the flags are supersets of the others,
so treating COMMITTISH|TREEISH as just COMMITTISH is not
wrong, but it's a good sign the caller is confused. And
certainly asking for BLOB|TREE does not work.
We can do the check easily with some bit-twiddling, and as a
bonus, the bit-mask of disambiguators will come in handy in
a future patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark strings for translation in lib/index.tcl that were seemingly
left behind by 700e560 ("git-gui: Mark forgotten strings for
translation.", 2008-09-04) which marks string in do_revert_selection
procedure.
These strings are passed to unstage_help and add_helper procedures.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "highlight" binary can, in some cases, determine the language type
by the means of file contents, for example the shebang in the first line
for some scripting languages. Make use of this autodetection for files
which syntax is not known by gitweb. In that case, pass the blob
contents to "highlight --force"; the parameter is needed to make it
always generate HTML output (which includes HTML-escaping).
Although we now run highlight on files which do not end up highlighted,
performance is virtually unaffected because when we call highlight, it
is used for escaping HTML. In the case that highlight is used, gitweb
calls sanitize() instead of esc_html(), and the latter is significantly
slower (it does more, being roughly a superset of sanitize()). Simple
benchmark comparing performance of 'blob' view of files without syntax
highlighting in gitweb before and after this change indicates ±1%
difference in request time for all file types. Benchmark was performed
on local instance on Debian, using Apache/2.4.23 web server and CGI.
Document the feature and improve syntax highlight documentation, add
test to ensure gitweb doesn't crash when language detection is used.
Signed-off-by: Ian Kelling <ian@iankelling.org>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function needs_work_tree_config() that is called from
create_default_files() is supposed to be fed the path to ".git" that
looks as if it is at the top of the working tree, and decide if that
location matches the actual worktree being used. This comparison allows
"git init" to decide if core.worktree needs to be recorded in the
working tree.
In the current code, however, we feed the return value from
get_git_dir(), which can be totally different from what the function
expects when "gitdir" file is involved. Instead of giving the path to
the ".git" at the top of the working tree, we end up feeding the actual
path that the file points at.
This original location of ".git" however is only known to init_db().
Make init_db() save it and have it passed to create_default_files() as a
new parameter, which passes the correct location down to
needs_work_tree_config() to fix this.
Noticed-by: Max Nordlund <max.nordlund@sqore.com>
Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a pure code move, necessary to kill the global variable git_link
later (and also helps a bit in the next patch).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The next commit requires that set_git_dir_init() must be called before
init_db(). Let's make sure nobody can do otherwise.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git init' is called from a linked worktree, we treat '.git'
dir (which is $GIT_COMMON_DIR/worktrees/something) as the main
'.git' (i.e. $GIT_COMMON_DIR) and populate the whole repository skeleton
in there. It does not harm anything (*) but it is still wrong.
Since 'git init' calls set_git_dir() at preparation time, which
indirectly calls get_common_dir() and correctly detects multiple
worktree setup, all git_path_buf() calls in create_default_files() will
return correct paths in both single and multiple worktree setups. The
only thing left is copy_templates(), which targets $GIT_DIR, not
$GIT_COMMON_DIR.
Fix that with get_git_common_dir(). This function will return $GIT_DIR
in single-worktree setup, so we don't have to make a special case for
multiple-worktree here.
(*) It does in fact, thanks to another bug. More on that later.
Noticed-by: Max Nordlund <max.nordlund@sqore.com>
Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a static initializer for struct checkout and use it throughout the
code base. It's shorter, avoids a memset(3) call and makes sure the
base_dir member is initialized to a valid (empty) string.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning with "--recursive", we'd generally expect
submodules to show progress reports if the main clone did,
too.
In older versions of git, this mostly worked out of the
box. Since we show progress by default when stderr is a tty,
and since the child clones inherit the parent stderr, then
both processes would come to the same decision by default.
If the parent clone was asked for "--quiet", we passed down
"--quiet" to the child. However, if stderr was not a tty and
the user specified "--progress", we did not propagate this
to the child.
That's a minor bug, but things got much worse when we
switched recently to submodule--helper's update_clone
command. With that change, the stderr of the child clones
are always connected to a pipe, and we never output
progress at all.
This patch teaches git-submodule and git-submodule--helper
how to pass down an explicit "--progress" flag when cloning.
The clone command then decides to propagate that flag based
on the cloning decision made earlier (which takes into
account isatty(2) of the parent process, existing --progress
or --quiet flags, etc). Since the child processes always run
without a tty on stderr, we don't have to worry about
passing an explicit "--no-progress"; it's the default for
them.
This fixes the recent loss of progress during recursive
clones. And as a bonus, it makes:
git clone --recursive --progress ... 2>&1 | cat
work by triggering progress explicitly in the children.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git gc --aggressive" used to limit the delta-chain length to 250,
which is way too deep for gaining additional space savings and is
detrimental for runtime performance. The limit has been reduced to
50.
* jk/reduce-gc-aggressive-depth:
gc: default aggressive depth to 50
Even when "git pull --rebase=preserve" (and the underlying "git
rebase --preserve") can complete without creating any new commit
(i.e. fast-forwards), it still insisted on having a usable ident
information (read: user.email is set correctly), which was less
than nice. As the underlying commands used inside "git rebase"
would fail with a more meaningful error message and advice text
when the bogus ident matters, this extra check was removed.
* jk/rebase-i-drop-ident-check:
rebase-interactive: drop early check for valid ident
More i18n.
* va/i18n:
i18n: update-index: mark warnings for translation
i18n: show-branch: mark plural strings for translation
i18n: show-branch: mark error messages for translation
i18n: receive-pack: mark messages for translation
notes: spell first word of error messages in lowercase
i18n: notes: mark error messages for translation
i18n: merge-recursive: mark verbose message for translation
i18n: merge-recursive: mark error messages for translation
i18n: config: mark error message for translation
i18n: branch: mark option description for translation
i18n: blame: mark error messages for translation
"git format-patch --base=..." feature that was recently added
showed the base commit information after "-- " e-mail signature
line, which turned out to be inconvenient. The base information
has been moved above the signature line.
* jt/format-patch-base-info-above-sig:
format-patch: show base info before email signature
Performance tests done via "t/perf" did not use the same set of
build configuration if the user relied on autoconf generated
configuration.
* ks/perf-build-with-autoconf:
t/perf/run: copy config.mak.autogen & friends to build area
"git diff -W" output needs to extend the context backward to
include the header line of the current function and also forward to
include the body of the entire current function up to the header
line of the next one. This process may have to merge to adjacent
hunks, but the code forgot to do so in some cases.
* rs/xdiff-merge-overlapping-hunks-for-W-context:
xdiff: fix merging of hunks with -W context and -u context
There were numerous corner cases in which the configuration files
are read and used or not read at all depending on the directory a
Git command was run, leading to inconsistent behaviour. The code
to set-up repository access at the beginning of a Git process has
been updated to fix them.
* jk/setup-sequence-update:
t1007: factor out repeated setup
init: reset cached config when entering new repo
init: expand comments explaining config trickery
config: only read .git/config from configured repos
test-config: setup git directory
t1302: use "git -C"
pager: handle early config
pager: use callbacks instead of configset
pager: make pager_program a file-local static
pager: stop loading git_default_config()
pager: remove obsolete comment
diff: always try to set up the repository
diff: handle --no-index prefixes consistently
diff: skip implicit no-index check when given --no-index
patch-id: use RUN_SETUP_GENTLY
hash-object: always try to set up the git repository
The http transport (with curl-multi option, which is the default
these days) failed to remove curl-easy handle from a curlm session,
which led to unnecessary API failures.
* ew/http-do-not-forget-to-call-curl-multi-remove-handle:
http: always remove curl easy from curlm session on release
http: consolidate #ifdefs for curl_multi_remove_handle
http: warn on curl_multi_add_handle failures
Some codepaths in "git pack-objects" were not ready to use an
existing pack bitmap; now they are and as the result they have
become faster.
* ks/pack-objects-bitmap:
pack-objects: use reachability bitmap index when generating non-stdout pack
pack-objects: respect --local/--honor-pack-keep/--incremental when bitmap is in use
"git log --cherry-pick" used to include merge commits as candidates
to be matched up with other commits, resulting a lot of wasted time.
The patch-id generation logic has been updated to ignore merges to
avoid the wastage.
* jk/patch-ids-no-merges:
patch-ids: refuse to compute patch-id for merge commit
patch-ids: turn off rename detection
Recently we updated the code to manage the in-core cache that holds
objects that have recently been used to reconstitute other objects
that are stored as deltas against them, but the update used an
incorrect API function to manage the list of these objects. This
has been fixed.
* jk/delta-base-cache:
add_delta_base_cache: use list_for_each_safe
Even though "git hash-objects", which is a tool to take an
on-filesystem data stream and put it into the Git object store,
allowed to perform the "outside-world-to-Git" conversions (e.g.
end-of-line conversions and application of the clean-filter), and
it had the feature on by default from very early days, its reverse
operation "git cat-file", which takes an object from the Git object
store and externalize for the consumption by the outside world,
lacked an equivalent mechanism to run the "Git-to-outside-world"
conversion. The command learned the "--filters" option to do so.
* js/cat-file-filters:
cat-file: support --textconv/--filters in batch mode
cat-file --textconv/--filters: allow specifying the path separately
cat-file: introduce the --filters option
cat-file: fix a grammo in the man page
JGit can show a fake ref "capabilities^{}" to "git fetch" when it
does not advertise any refs, but "git fetch" was not prepared to
see such an advertisement. When the other side disconnects without
giving any ref advertisement, we used to say "there may not be a
repository at that URL", but we may have seen other advertisement
like "shallow" and ".have" in which case we definitely know that a
repository is there. The code to detect this case has also been
updated.
* jt/accept-capability-advertisement-when-fetching-from-void:
connect: advertized capability is not a ref
connect: tighten check for unexpected early hang up
tests: move test_lazy_prereq JGIT to test-lib.sh
Mailinfo currently handles multi-line headers, but it does not handle
multi-line in-body headers. Teach it to handle such headers, for
example, for this input:
From: author <author@example.com>
Date: Fri, 9 Jun 2006 00:44:16 -0700
Subject: a very long
broken line
Subject: another very long
broken line
interpret the in-body subject to be "another very long broken line"
instead of "another very long".
An existing test (t/t5100/msg0015) has an indented line immediately
after an in-body header - it has been modified to reflect the new
functionality.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While marking individual messages for translation, consolidate some
messages "option 'foo' requires a value" that is used for many
options into one by introducing a helper function to die with the
message with the option name embedded in it, and ask the translators
to localize that single message instead.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an alias for --subject-prefix='RFC PATCH', which is used
commonly in some development communities to deserve such a
short-hand.
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The is_scissors_line takes a struct strbuf * when a char * would
suffice. Make it take char *.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The check_header function contains logic specific to in-body headers,
although it is invoked during both the processing of actual headers and
in-body headers. Separate out the in-body header part into its own
function.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "unsigned char sha1[20]" to "struct object_id" conversion
continues. Notable changes in this round includes that ce->sha1,
i.e. the object name recorded in the cache_entry, turns into an
object_id.
It had merge conflicts with a few topics in flight (Christian's
"apply.c split", Dscho's "cat-file --filters" and Jeff Hostetler's
"status --porcelain-v2"). Extra sets of eyes double-checking for
mismerges are highly appreciated.
* bc/object-id:
builtin/reset: convert to use struct object_id
builtin/commit-tree: convert to struct object_id
builtin/am: convert to struct object_id
refs: add an update_ref_oid function.
sha1_name: convert get_sha1_mb to struct object_id
builtin/update-index: convert file to struct object_id
notes: convert init_notes to use struct object_id
builtin/rm: convert to use struct object_id
builtin/blame: convert file to use struct object_id
Convert read_mmblob to take struct object_id.
notes-merge: convert struct notes_merge_pair to struct object_id
builtin/checkout: convert some static functions to struct object_id
streaming: make stream_blob_to_fd take struct object_id
builtin: convert textconv_object to use struct object_id
builtin/cat-file: convert some static functions to struct object_id
builtin/cat-file: convert struct expand_data to use struct object_id
builtin/log: convert some static functions to use struct object_id
builtin/blame: convert struct origin to use struct object_id
builtin/apply: convert static functions to struct object_id
cache: convert struct cache_entry to use struct object_id
The ref-store abstraction was introduced to the refs API so that we
can plug in different backends to store references.
* mh/ref-store: (38 commits)
refs: implement iteration over only per-worktree refs
refs: make lock generic
refs: add method to rename refs
refs: add methods to init refs db
refs: make delete_refs() virtual
refs: add method for initial ref transaction commit
refs: add methods for reflog
refs: add method iterator_begin
files_ref_iterator_begin(): take a ref_store argument
split_symref_update(): add a files_ref_store argument
lock_ref_sha1_basic(): add a files_ref_store argument
lock_ref_for_update(): add a files_ref_store argument
commit_ref_update(): add a files_ref_store argument
lock_raw_ref(): add a files_ref_store argument
repack_without_refs(): add a files_ref_store argument
refs: make peel_ref() virtual
refs: make create_symref() virtual
refs: make pack_refs() virtual
refs: make verify_refname_available() virtual
refs: make read_raw_ref() virtual
...
"git am" has been taught to make an internal call to "git apply"'s
innards without spawning the latter as a separate process.
* cc/apply-am: (41 commits)
builtin/am: use apply API in run_apply()
apply: learn to use a different index file
apply: pass apply state to build_fake_ancestor()
apply: refactor `git apply` option parsing
apply: change error_routine when silent
usage: add get_error_routine() and get_warn_routine()
usage: add set_warn_routine()
apply: don't print on stdout in verbosity_silent mode
apply: make it possible to silently apply
apply: use error_errno() where possible
apply: make some parsing functions static again
apply: move libified code from builtin/apply.c to apply.{c,h}
apply: rename and move opt constants to apply.h
builtin/apply: rename option parsing functions
builtin/apply: make create_one_file() return -1 on error
builtin/apply: make try_create_file() return -1 on error
builtin/apply: make write_out_results() return -1 on error
builtin/apply: make write_out_one_result() return -1 on error
builtin/apply: make create_file() return -1 on error
builtin/apply: make add_index_file() return -1 on error
...
Mark messages passed to die() in die_initial_contact().
Update test to reflect changes.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark message commit_utf8_warn for translation.
Update tests to reflect changes.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach "git blame" and "git annotate" the --compaction-heuristic and
--indent-heuristic options that are now supported by "git diff".
Also teach them to honor the `diff.compactionHeuristic` and
`diff.indentHeuristic` configuration options.
It would be conceivable to introduce separate configuration options for
"blame" and "annotate"; for example `blame.compactionHeuristic` and
`blame.indentHeuristic`. But it would be confusing to users if blame
output is inconsistent with diff output, so it makes more sense for them
to respect the same configuration.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new callback function, parse_opt_unknown_cb(), which returns -2 to
indicate that the corresponding option is unknown. This can be used to
add "-h" documentation for an option that will be handled externally to
parse_options().
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some groups of added/deleted lines in diffs can be slid up or down,
because lines at the edges of the group are not unique. Picking good
shifts for such groups is not a matter of correctness but definitely has
a big effect on aesthetics. For example, consider the following two
diffs. The first is what standard Git emits:
--- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
+++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
@@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) {
}
if (!$smtp_server) {
+ $smtp_server = $repo->config('sendemail.smtpserver');
+}
+if (!$smtp_server) {
foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
if (-x $_) {
$smtp_server = $_;
The following diff is equivalent, but is obviously preferable from an
aesthetic point of view:
--- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
+++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
@@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) {
$initial_reply_to =~ s/(^\s+|\s+$)//g;
}
+if (!$smtp_server) {
+ $smtp_server = $repo->config('sendemail.smtpserver');
+}
if (!$smtp_server) {
foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
if (-x $_) {
This patch teaches Git to pick better positions for such "diff sliders"
using heuristics that take the positions of nearby blank lines and the
indentation of nearby lines into account.
The existing Git code basically always shifts such "sliders" as far down
in the file as possible. The only exception is when the slider can be
aligned with a group of changed lines in the other file, in which case
Git favors depicting the change as one add+delete block rather than one
add and a slightly offset delete block. This naive algorithm often
yields ugly diffs.
Commit d634d61ed6 improved the situation somewhat by preferring to
position add/delete groups to make their last line a blank line, when
that is possible. This heuristic does more good than harm, but (1) it
can only help if there are blank lines in the right places, and (2)
always picks the last blank line, even if there are others that might be
better. The end result is that it makes perhaps 1/3 as many errors as
the default Git algorithm, but that still leaves a lot of ugly diffs.
This commit implements a new and much better heuristic for picking
optimal "slider" positions using the following approach: First observe
that each hypothetical positioning of a diff slider introduces two
splits: one between the context lines preceding the group and the first
added/deleted line, and the other between the last added/deleted line
and the first line of context following it. It tries to find the
positioning that creates the least bad splits.
Splits are evaluated based only on the presence and locations of nearby
blank lines, and the indentation of lines near the split. Basically, it
prefers to introduce splits adjacent to blank lines, between lines that
are indented less, and between lines with the same level of indentation.
In more detail:
1. It measures the following characteristics of a proposed splitting
position in a `struct split_measurement`:
* the number of blank lines above the proposed split
* whether the line directly after the split is blank
* the number of blank lines following that line
* the indentation of the nearest non-blank line above the split
* the indentation of the line directly below the split
* the indentation of the nearest non-blank line after that line
2. It combines the measured attributes using a bunch of
empirically-optimized weighting factors to derive a `struct
split_score` that measures the "badness" of splitting the text at
that position.
3. It combines the `split_score` for the top and the bottom of the
slider at each of its possible positions, and selects the position
that has the best `split_score`.
I determined the initial set of weighting factors by collecting a corpus
of Git histories from 29 open-source software projects in various
programming languages. I generated many diffs from this corpus, and
determined the best positioning "by eye" for about 6600 diff sliders. I
used about half of the repositories in the corpus (corresponding to
about 2/3 of the sliders) as a training set, and optimized the weights
against this corpus using a crude automated search of the parameter
space to get the best agreement with the manually-determined values.
Then I tested the resulting heuristic against the full corpus. The
results are summarized in the following table, in column `indent-1`:
| repository | count | Git 2.9.0 | compaction | compaction-fixed | indent-1 | indent-2 |
| --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- |
| afnetworking | 109 | 89 (81.7%) | 37 (33.9%) | 37 (33.9%) | 2 (1.8%) | 2 (1.8%) |
| alamofire | 30 | 18 (60.0%) | 14 (46.7%) | 15 (50.0%) | 0 (0.0%) | 0 (0.0%) |
| angular | 184 | 127 (69.0%) | 39 (21.2%) | 23 (12.5%) | 5 (2.7%) | 5 (2.7%) |
| animate | 313 | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) |
| ant | 380 | 356 (93.7%) | 152 (40.0%) | 148 (38.9%) | 15 (3.9%) | 15 (3.9%) | *
| bugzilla | 306 | 263 (85.9%) | 109 (35.6%) | 99 (32.4%) | 14 (4.6%) | 15 (4.9%) | *
| corefx | 126 | 91 (72.2%) | 22 (17.5%) | 21 (16.7%) | 6 (4.8%) | 6 (4.8%) |
| couchdb | 78 | 44 (56.4%) | 26 (33.3%) | 28 (35.9%) | 6 (7.7%) | 6 (7.7%) | *
| cpython | 937 | 158 (16.9%) | 50 (5.3%) | 49 (5.2%) | 5 (0.5%) | 5 (0.5%) | *
| discourse | 160 | 95 (59.4%) | 42 (26.2%) | 36 (22.5%) | 18 (11.2%) | 13 (8.1%) |
| docker | 307 | 194 (63.2%) | 198 (64.5%) | 253 (82.4%) | 8 (2.6%) | 8 (2.6%) | *
| electron | 163 | 132 (81.0%) | 38 (23.3%) | 39 (23.9%) | 6 (3.7%) | 6 (3.7%) |
| git | 536 | 470 (87.7%) | 73 (13.6%) | 78 (14.6%) | 16 (3.0%) | 16 (3.0%) | *
| gitflow | 127 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| ionic | 133 | 89 (66.9%) | 29 (21.8%) | 38 (28.6%) | 1 (0.8%) | 1 (0.8%) |
| ipython | 482 | 362 (75.1%) | 167 (34.6%) | 169 (35.1%) | 11 (2.3%) | 11 (2.3%) | *
| junit | 161 | 147 (91.3%) | 67 (41.6%) | 66 (41.0%) | 1 (0.6%) | 1 (0.6%) | *
| lighttable | 15 | 5 (33.3%) | 0 (0.0%) | 2 (13.3%) | 0 (0.0%) | 0 (0.0%) |
| magit | 88 | 75 (85.2%) | 11 (12.5%) | 9 (10.2%) | 1 (1.1%) | 0 (0.0%) |
| neural-style | 28 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| nodejs | 781 | 649 (83.1%) | 118 (15.1%) | 111 (14.2%) | 4 (0.5%) | 5 (0.6%) | *
| phpmyadmin | 491 | 481 (98.0%) | 75 (15.3%) | 48 (9.8%) | 2 (0.4%) | 2 (0.4%) | *
| react-native | 168 | 130 (77.4%) | 79 (47.0%) | 81 (48.2%) | 0 (0.0%) | 0 (0.0%) |
| rust | 171 | 128 (74.9%) | 30 (17.5%) | 27 (15.8%) | 16 (9.4%) | 14 (8.2%) |
| spark | 186 | 149 (80.1%) | 52 (28.0%) | 52 (28.0%) | 2 (1.1%) | 2 (1.1%) |
| tensorflow | 115 | 66 (57.4%) | 48 (41.7%) | 48 (41.7%) | 5 (4.3%) | 5 (4.3%) |
| test-more | 19 | 15 (78.9%) | 2 (10.5%) | 2 (10.5%) | 1 (5.3%) | 1 (5.3%) | *
| test-unit | 51 | 34 (66.7%) | 14 (27.5%) | 8 (15.7%) | 2 (3.9%) | 2 (3.9%) | *
| xmonad | 23 | 22 (95.7%) | 2 (8.7%) | 2 (8.7%) | 1 (4.3%) | 1 (4.3%) | *
| --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- |
| totals | 6668 | 4391 (65.9%) | 1496 (22.4%) | 1491 (22.4%) | 150 (2.2%) | 144 (2.2%) |
| totals (training set) | 4552 | 3195 (70.2%) | 1053 (23.1%) | 1061 (23.3%) | 86 (1.9%) | 88 (1.9%) |
| totals (test set) | 2116 | 1196 (56.5%) | 443 (20.9%) | 430 (20.3%) | 64 (3.0%) | 56 (2.6%) |
In this table, the numbers are the count and percentage of human-rated
sliders that the corresponding algorithm got *wrong*. The columns are
* "repository" - the name of the repository used. I used the diffs
between successive non-merge commits on the HEAD branch of the
corresponding repository.
* "count" - the number of sliders that were human-rated. I chose most,
but not all, sliders to rate from those among which the various
algorithms gave different answers.
* "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0.
* "compaction" - the heuristic used by `git diff --compaction-heuristic`
in Git 2.9.0.
* "compaction-fixed" - the heuristic used by `git diff
--compaction-heuristic` after the fixes from earlier in this patch
series. Note that the results are not dramatically different than
those for "compaction". Both produce non-ideal diffs only about 1/3 as
often as the default `git diff`.
* "indent-1" - the new `--indent-heuristic` algorithm, using the first
set of weighting factors, determined as described above.
* "indent-2" - the new `--indent-heuristic` algorithm, using the final
set of weighting factors, determined as described below.
* `*` - indicates that repo was part of training set used to determine
the first set of weighting factors.
The fact that the heuristic performed nearly as well on the test set as
on the training set in column "indent-1" is a good indication that the
heuristic was not over-trained. Given that fact, I ran a second round of
optimization, using the entire corpus as the training set. The resulting
set of weights gave the results in column "indent-2". These are the
weights included in this patch.
The final result gives consistently and significantly better results
across the whole corpus than either `git diff` or `git diff
--compaction-heuristic`. It makes only about 1/30 as many errors as the
former and about 1/10 as many errors as the latter. (And a good fraction
of the remaining errors are for diffs that involve weirdly-formatted
code, sometimes apparently machine-generated.)
The tools that were used to do this optimization and analysis, along
with the human-generated data values, are recorded in a separate project
[1].
This patch adds a new command-line option `--indent-heuristic`, and a
new configuration setting `diff.indentHeuristic`, that activate this
heuristic. This interface is only meant for testing purposes, and should
be finalized before including this change in any release.
[1] https://github.com/mhagger/diff-slider-tools
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git commit-tree" stopped reading commit.gpgsign configuration
variable that was meant for Porcelain "git commit" in Git 2.9; we
forgot to update "git gui" to look at the configuration to match
this change.
* js/git-gui-commit-gpgsign:
git-gui: respect commit.gpgsign again
Lifts calls to exit(2) and die() higher in the callchain in
sequencer.c files so that more helper functions in it can be used
by callers that want to handle error conditions themselves.
* js/sequencer-wo-die:
sequencer: ensure to release the lock when we could not read the index
sequencer: lib'ify checkout_fast_forward()
sequencer: lib'ify fast_forward_to()
sequencer: lib'ify save_opts()
sequencer: lib'ify save_todo()
sequencer: lib'ify save_head()
sequencer: lib'ify create_seq_dir()
sequencer: lib'ify read_populate_opts()
sequencer: lib'ify read_populate_todo()
sequencer: lib'ify read_and_refresh_cache()
sequencer: lib'ify prepare_revs()
sequencer: lib'ify walk_revs_populate_todo()
sequencer: lib'ify do_pick_commit()
sequencer: lib'ify do_recursive_merge()
sequencer: lib'ify write_message()
sequencer: do not die() in do_pick_commit()
sequencer: lib'ify sequencer_pick_revisions()
"git fetch http::/site/path" did not die correctly and segfaulted
instead.
* jk/fix-remote-curl-url-wo-proto:
remote-curl: handle URLs without protocol
Message cleanup.
* ah/misc-message-fixes:
unpack-trees: do not capitalize "working"
git-merge-octopus: do not capitalize "octopus"
git-rebase--interactive: fix English grammar
cat-file: put spaces around pipes in usage string
am: put spaces around pipe in usage string
Update Japanese translation for "git-gui".
* sy/git-gui-i18n-ja:
git-gui: update Japanese information
git-gui: update Japanese translation
git-gui: add Japanese language code
git-gui: apply po template to Japanese translation
git-gui: consistently use the same word for "blame" in Japanese
git-gui: consistently use the same word for "remote" in Japanese
"git pack-objects --include-tag" was taught that when we know that
we are sending an object C, we want a tag B that directly points at
C but also a tag A that points at the tag B. We used to miss the
intermediate tag B in some cases.
* jk/pack-tag-of-tag:
pack-objects: walk tag chains for --include-tag
t5305: simplify packname handling
t5305: use "git -C"
t5305: drop "dry-run" of unpack-objects
t5305: move cleanup into test block
Mark plural string for translation using Q_().
Although we already know that the plural sentence is always used in the
English source, other languages have complex plural rules they must
comply according to the value of MAX_REVS.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Spell the first word of messages in lowercase, following the usual
style.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark messages refuse_unconfigured_deny_msg and
refuse_unconfigured_deny_delete_current_msg for translation.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
That's the usual style.
Update one test to reflect these changes.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Spell the first word of such error messages in lowercase,
following the usual style.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark error messages for translation passed to die() function.
Change "Cannot" to lowercase following the usual style.
Reflect changes to test by using test_i18ngrep.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sort the linked list of packs directly using llist_mergesort() instead
of building an array, calling qsort(3) and fixing up the list pointers.
This is shorter and less complicated.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a series of 3 CRLF tests that do exactly the same
(long) setup sequence. Let's pull it out into a common setup
test, which is shorter, more efficient, and will make it
easier to add new tests.
Note that we don't have to worry about cleaning up any of
the setup which was previously per-test; we call pop_repo
after the CRLF tests, which cleans up everything.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After we copy the templates into place, we re-read the
config in case we copied in a default config file. But since
git_config() is backed by a cache these days, it's possible
that the call will not actually touch the filesystem at all;
we need to tell it that something has changed behind the
scenes.
Note that we also need to reset the shared_repository
config. At first glance, it seems like this should probably
just be folded into git_config_clear(). But unfortunately
that is not quite right. The shared repository value may
come from config, _or_ it may have been set manually. So
only the caller who knows whether or not they set it is the
one who can clear it (and indeed, if you _do_ put it into
git_config_clear(), then many tests fail, as we have to
clear the config cache any time we set a new config
variable).
There are three tests here. The first two actually pass
already, though it's largely luck: they just don't happen to
actually read any config before we enter the new repo.
But the third one does fail without this patch; we look at
core.sharedrepository while creating the directory, but need
to make sure the value from the template config overrides
it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-init may copy "config" from the templates directory and
then re-read it. There are some comments explaining what's
going on here, but they are not grouped very well with the
matching code. Let's rearrange and expand them.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git_config() runs, it looks in the system, user-wide,
and repo-level config files. It gets the latter by calling
git_pathdup(), which in turn calls get_git_dir(). If we
haven't set up the git repository yet, this may simply
return ".git", and we will look at ".git/config". This
seems like it would be helpful (presumably we haven't set up
the repository yet, so it tries to find it), but it turns
out to be a bad idea for a few reasons:
- it's not sufficient, and therefore hides bugs in a
confusing way. Config will be respected if commands are
run from the top-level of the working tree, but not from
a subdirectory.
- it's not always true that we haven't set up the
repository _yet_; we may not want to do it at all. For
instance, if you run "git init /some/path" from inside
another repository, it should not load config from the
existing repository.
- there might be a path ".git/config", but it is not the
actual repository we would find via setup_git_directory().
This may happen, e.g., if you are storing a git
repository inside another git repository, but have
munged one of the files in such a way that the
inner repository is not valid (e.g., by removing HEAD).
We have at least two bugs of the second type in git-init,
introduced by ae5f677 (lazily load core.sharedrepository,
2016-03-11). It causes init to use git_configset(), which
loads all of the config, including values from the current
repo (if any). This shows up in two ways:
1. If we happen to be in an existing repository directory,
we'll read and respect core.sharedrepository from it,
even though it should have no bearing on the new
repository. A new test in t1301 covers this.
2. Similarly, if we're in an existing repo that sets
core.logallrefupdates, that will cause init to fail to
set it in a newly created repository (because it thinks
that the user's templates already did so). A new test
in t0001 covers this.
We also need to adjust an existing test in t1302, which
gives another example of why this patch is an improvement.
That test creates an embedded repository with a bogus
core.repositoryformatversion of "99". It wants to make sure
that we actually stop at the bogus repo rather than
continuing upward to find the outer repo. So it checks that
"git config core.repositoryformatversion" returns 99. But
that only works because we blindly read ".git/config", even
though we _know_ we're in a repository whose vintage we do
not understand.
After this patch, we avoid reading config from the unknown
vintage repository at all, which is a safer choice. But we
need to tweak the test, since core.repositoryformatversion
will not return 99; it will claim that it could not find the
variable at all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The t1308 test script uses our test-config helper to read
repository-level config, but never actually sets up the
repository. This works because git_config() blindly reads
".git/config" even if we have not configured a repository.
This means that test-config won't work from a subdirectory,
though since it's just a helper for the test scripts, that's
not a big deal.
More important is that the behavior of git_config() is going
to change, and we want to make sure that t1308 continues to
work. We can just use setup_git_directory(), and not the
gentle form; there's no point in being flexible, as it's
just a helper for the tests.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pager code is often run early in the git.c startup,
before we have actually found the repository. When we ask
git_config() to look for values like core.pager, it doesn't
know where to find the repo-level config, and will blindly
examine ".git/config" if it exists. That's why t7006 shows
that many pager-related features happen to work from the
top-level of a repository, but not from a subdirectory.
This patch pulls that ".git/config" hack explicitly into the
pager code. There are two reasons for this:
1. We'd like to clean up the git_config() behavior, as
looking at ".git/config" when we do not have a
configured repository is often the wrong thing to do.
But we'd prefer not to break the pager config any worse
than it already is.
2. It's one very tiny step on the road to ultimately
making the pager config work consistently. If we
eventually get an equivalent of setup_git_directory()
that _just_ finds the directory and doesn't chdir() or
set up any global state, we could plug it in here
(instead of blindly looking at ".git/config").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the cached configset interface is more pleasant to
use, it is not appropriate for "early" config like pager
setup, which must sometimes do tricky things like reading
from ".git/config" even when we have not set up the
repository.
As a preparatory step to handling these cases better, let's
switch back to using the callback interface, which gives us
more control.
Note that this is essentially a revert of 586f414 (pager.c:
replace `git_config()` with `git_config_get_value()`,
2014-08-07), but with some minor style fixups and
modernizations.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This variable is only ever used by the routines in pager.c,
and other parts of the code should always use those routines
(like git_pager()) to make decisions about which pager to
use. Let's reduce its scope to prevent accidents.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git_pager(), we really only care about getting the value
of core.pager. But to do so, we use the git_default_config()
callback, which loads many other values. Ordinarily it
isn't a big deal to load this config an extra time, as it
simply overwrites the values from the previous run. But it's
a bad idea here, for two reasons:
1. The pager setup may be called very early in the
program, before we have found the git repository. As a
result, we may fail to read the correct repo-level
config file. This is a problem for core.pager, too,
but we should at least try to minimize the pollution to
other configured values.
2. Because we call setup_pager() from git.c, basically
every builtin command _may_ end up reading this config
and getting an implicit git_default_config() setup.
Which doesn't sound like a terrible thing, except that
we don't do it consistently; it triggers only when
stdout is a tty. So if a command forgets to load the
default config itself (but depends on it anyway), it
may appear to work, and then mysteriously fail when the
pager is not in use.
We can improve this by loading _just_ the core.pager config
from git_pager().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comment at the top of pager.c claims that we've split
the code out so that Windows can do something different.
This dates back to f67b45f (Introduce trivial new pager.c
helper infrastructure, 2006-02-28), because the original
implementation used fork(). Later, we ended up sticking the
Windows #ifdefs into this file anyway. And then even later,
in ea27a18 (spawn pager via run_command interface,
2008-07-22) we unified the implementations.
So these days this comment is really saying nothing at all.
Let's drop it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we see an explicit "--no-index", we do not bother calling
setup_git_directory_gently() at all. This means that we may
miss out on reading repo-specific config.
It's arguable whether this is correct or not. If we were
designing from scratch, making "git diff --no-index"
completely ignore the repository makes some sense. But we
are nowhere near scratch, so let's look at the existing
behavior:
1. If you're in the top-level of a repository and run an
explicit "diff --no-index", the config subsystem falls
back to reading ".git/config", and we will respect repo
config.
2. If you're in a subdirectory of a repository, then we
still try to read ".git/config", but it generally
doesn't exist. So "diff --no-index" there does not
respect repo config.
3. If you have $GIT_DIR set in the environment, we read
and respect $GIT_DIR/config,
4. If you run "git diff /tmp/foo /tmp/bar" to get an
implicit no-index, we _do_ run the repository setup,
and set $GIT_DIR (or respect an existing $GIT_DIR
variable). We find the repo config no matter where we
started, and respect it.
So we already respect the repository config in a number of
common cases, and case (2) is the only one that does not.
And at least one of our tests, t4034, depends on case (1)
behaving as it does now (though it is just incidental, not
an explicit test for this behavior).
So let's bring case (2) in line with the others by always
running the repository setup, even with an explicit
"--no-index". We shouldn't need to change anything else, as the
implicit case already handles the prefix.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we see an explicit "git diff --no-index ../foo ../bar",
then we do not set up the git repository at all (we already
know we are in --no-index mode, so do not have to check "are
we in a repository?"), and hence have no "prefix" within the
repository. A patch generated by this command will have the
filenames "a/../foo" and "b/../bar", no matter which
directory we are in with respect to any repository.
However, in the implicit case, where we notice that the
files are outside the repository, we will have chdir()'d to
the top-level of the repository. We then feed the prefix
back to the diff machinery. As a result, running the same
diff from a subdirectory will result in paths that look like
"a/subdir/../../foo".
Besides being unnecessarily long, this may also be confusing
to the user: they don't care about the subdir or the
repository at all; it's just where they happened to be when
running the command. We should treat this the same as the
explicit --no-index case.
One way to address this would be to chdir() back to the
original path before running our diff. However, that's a bit
hacky, as we would also need to adjust $GIT_DIR, which could
be a relative path from our top-level.
Instead, we can reuse the diff machinery's RELATIVE_NAME
option, which automatically strips off the prefix. Note that
this _also_ restricts the diff to this relative prefix, but
that's OK for our purposes: we queue our own diff pairs
manually, and do not rely on that part of the diff code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can invoke no-index mode in two ways: by an explicit
request from the user, or implicitly by noticing that we
have two paths, and at least one is outside the repository.
If the user already told us --no-index, there is no need for
us to do the implicit test at all. However, we currently
do, and downgrade our "explicit" to DIFF_NO_INDEX_IMPLICIT.
This doesn't have any user-visible behavior, though it's not
immediately obvious why. We only trigger the implicit check
when we have exactly two non-option arguments. And the only
code that cares about implicit versus explicit is an error
message that we show when we _don't_ have two non-option
arguments.
However, it's worth fixing anyway. Besides being slightly
more efficient, it makes the code easier to follow, which
will help when we modify it in future patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Patch-id does not require a repository because it is just
processing the incoming diff on stdin, but it may look at
git config for keys like patchid.stable.
Even though we do not setup_git_directory(), this works from
the top-level of a repository because we blindly look at
".git/config" in this case. But as the included test
demonstrates, it does not work from a subdirectory.
We can fix it by using RUN_SETUP_GENTLY. We do not take any
filenames from the user on the command line, so there's no
need to adjust them via prefix_filename().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "hash-object" is run without "-w", we don't need to be
in a git repository at all; we can just hash the object and
write its sha1 to stdout. However, if we _are_ in a git
repository, we would want to know that so we can follow the
normal rules for respecting config, .gitattributes, etc.
This happens to work at the top-level of a git repository
because we blindly read ".git/config", but as the included
test shows, it does not work when you are in a subdirectory.
The solution is to just do a "gentle" setup in this case. We
already take care to use prefix_filename() on any filename
arguments we get (to handle the "-w" case), so we don't need
to do anything extra to handle the side effects of repo
setup.
An alternative would be to specify RUN_SETUP_GENTLY for this
command in git.c, and then die if "-w" is set but we are not
in a repository. However, the error messages generated at
the time of setup_git_directory() are more detailed, so it's
better to find out which mode we are in, and then call the
appropriate function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update a few tests that used to use GIT_CURL_VERBOSE to use the
newer GIT_TRACE_CURL.
* ep/use-git-trace-curl-in-tests:
t5551-http-fetch-smart.sh: use the GIT_TRACE_CURL environment var
t5550-http-fetch-dumb.sh: use the GIT_TRACE_CURL environment var
test-lib.sh: preserve GIT_TRACE_CURL from the environment
t5541-http-push-smart.sh: use the GIT_TRACE_CURL environment var
A test spawned a short-lived background process, which sometimes
prevented the test directory from getting removed at the end of the
script on some platforms.
* js/t6026-clean-up:
t6026-merge-attr: clean up background process at end of test case
"git symbolic-ref -d HEAD" happily removes the symbolic ref, but
the resulting repository becomes an invalid one. Teach the command
to forbid removal of HEAD.
* jc/forbid-symbolic-ref-d-HEAD:
symbolic-ref -d: do not allow removal of HEAD
Having a submodule whose ".git" repository is somehow corrupt
caused a few commands that recurse into submodules loop forever.
* jc/submodule-anchor-git-dir:
submodule: avoid auto-discovery in prepare_submodule_repo_env()
The test framework left the number of tests and success/failure
count in the t/test-results directory, keyed by the name of the
test script plus the process ID. The latter however turned out not
to serve any useful purpose. The process ID part of the filename
has been removed.
* jk/test-lib-drop-pid-from-results:
test-lib: drop PID from test-results/*.count
Extract a small helper out of the function that reads the authors
script file "git am" internally uses.
* jc/am-read-author-file:
am: refactor read_author_script()
The "git diff --submodule={short,log}" mechanism has been enhanced
to allow "--submodule=diff" to show the patch between the submodule
commits bound to the superproject.
* jk/diff-submodule-diff-inline:
diff: teach diff to display submodule difference with an inline diff
submodule: refactor show_submodule_summary with helper function
submodule: convert show_submodule_summary to use struct object_id *
allow do_submodule_path to work even if submodule isn't checked out
diff: prepare for additional submodule formats
graph: add support for --line-prefix on all graph-aware output
diff.c: remove output_prefix_length field
cache: add empty_tree_oid object and helper function
Starting from 6b8fda2d (pack-objects: use bitmaps when packing objects)
if a repository has bitmap index, pack-objects can nicely speedup
"Counting objects" graph traversal phase. That however was done only for
case when resultant pack is sent to stdout, not written into a file.
The reason here is for on-disk repack by default we want:
- to produce good pack (with bitmap index not-yet-packed objects are
emitted to pack in suboptimal order).
- to use more robust pack-generation codepath (avoiding possible
bugs in bitmap code and possible bitmap index corruption).
Jeff King further explains:
The reason for this split is that pack-objects tries to determine how
"careful" it should be based on whether we are packing to disk or to
stdout. Packing to disk implies "git repack", and that we will likely
delete the old packs after finishing. We want to be more careful (so
as not to carry forward a corruption, and to generate a more optimal
pack), and we presumably run less frequently and can afford extra CPU.
Whereas packing to stdout implies serving a remote via "git fetch" or
"git push". This happens more frequently (e.g., a server handling many
fetching clients), and we assume the receiving end takes more
responsibility for verifying the data.
But this isn't always the case. One might want to generate on-disk
packfiles for a specialized object transfer. Just using "--stdout" and
writing to a file is not optimal, as it will not generate the matching
pack index.
So it would be useful to have some way of overriding this heuristic:
to tell pack-objects that even though it should generate on-disk
files, it is still OK to use the reachability bitmaps to do the
traversal.
So we can teach pack-objects to use bitmap index for initial object
counting phase when generating resultant pack file too:
- if we take care to not let it be activated under git-repack:
See above about repack robustness and not forward-carrying corruption.
- if we know bitmap index generation is not enabled for resultant pack:
The current code has singleton bitmap_git, so it cannot work
simultaneously with two bitmap indices.
We also want to avoid (at least with current implementation)
generating bitmaps off of bitmaps. The reason here is: when generating
a pack, not-yet-packed objects will be emitted into pack in
suboptimal order and added to tail of the bitmap as "extended entries".
When the resultant pack + some new objects in associated repository
are in turn used to generate another pack with bitmap, the situation
repeats: new objects are again not emitted optimally and just added to
bitmap tail - not in recency order.
So the pack badness can grow over time when at each step we have
bitmapped pack + some other objects. That's why we want to avoid
generating bitmaps off of bitmaps, not to let pack badness grow.
- if we keep pack reuse enabled still only for "send-to-stdout" case:
Because pack-to-file needs to generate index for destination pack, and
currently on pack reuse raw entries are directly written out to the
destination pack by write_reused_pack(), bypassing needed for pack index
generation bookkeeping done by regular codepath in write_one() and
friends.
( In the future we might teach pack-reuse code about cases when index
also needs to be generated for resultant pack and remove
pack-reuse-only-for-stdout limitation )
This way for pack-objects -> file we get nice speedup:
erp5.git[1] (~230MB) extracted from ~ 5GB lab.nexedi.com backup
repository managed by git-backup[2] via
time echo 0186ac99 | git pack-objects --revs erp5pack
before: 37.2s
after: 26.2s
And for `git repack -adb` packed git.git
time echo 5c589a73 | git pack-objects --revs gitpack
before: 7.1s
after: 3.6s
i.e. it can be 30% - 50% speedup for pack extraction.
git-backup extracts many packs on repositories restoration. That was my
initial motivation for the patch.
[1] https://lab.nexedi.com/nexedi/erp5
[2] https://lab.nexedi.com/kirr/git-backup
NOTE
Jeff also suggests that pack.useBitmaps was probably a mistake to
introduce originally. This way we are not adding another config point,
but instead just always default to-file pack-objects not to use bitmap
index: Tools which need to generate on-disk packs with using bitmap, can
pass --use-bitmap-index explicitly. And git-repack does never pass
--use-bitmap-index, so this way we can be sure regular on-disk repacking
remains robust.
NOTE2
`git pack-objects --stdout >file.pack` + `git index-pack file.pack` is much slower
than `git pack-objects file.pack`. Extracting erp5.git pack from
lab.nexedi.com backup repository:
$ time echo 0186ac99 | git pack-objects --stdout --revs >erp5pack-stdout.pack
real 0m22.309s
user 0m21.148s
sys 0m0.932s
$ time git index-pack erp5pack-stdout.pack
real 0m50.873s <-- more than 2 times slower than time to generate pack itself!
user 0m49.300s
sys 0m1.360s
So the time for
`pack-object --stdout >file.pack` + `index-pack file.pack` is 72s,
while
`pack-objects file.pack` which does both pack and index is 27s.
And even
`pack-objects --no-use-bitmap-index file.pack` is 37s.
Jeff explains:
The packfile does not carry the sha1 of the objects. A receiving
index-pack has to compute them itself, including inflating and applying
all of the deltas.
that's why for `git-backup restore` we want to teach `git pack-objects
file.pack` to use bitmaps instead of using `git pack-objects --stdout
>file.pack` + `git index-pack file.pack`.
NOTE3
The speedup is now tracked via t/perf/p5310-pack-bitmaps.sh
Test 56dfeb62 this tree
--------------------------------------------------------------------------------
5310.2: repack to disk 8.98(8.05+0.29) 9.05(8.08+0.33) +0.8%
5310.3: simulated clone 2.02(2.27+0.09) 2.01(2.25+0.08) -0.5%
5310.4: simulated fetch 0.81(1.07+0.02) 0.81(1.05+0.04) +0.0%
5310.5: pack to file 7.58(7.04+0.28) 7.60(7.04+0.30) +0.3%
5310.6: pack to file (bitmap) 7.55(7.02+0.28) 3.25(2.82+0.18) -57.0%
5310.8: clone (partial bitmap) 1.83(2.26+0.12) 1.82(2.22+0.14) -0.5%
5310.9: pack to file (partial bitmap) 6.86(6.58+0.30) 2.87(2.74+0.20) -58.2%
More context:
http://marc.info/?t=146792101400001&r=1&w=2http://public-inbox.org/git/20160707190917.20011-1-kirr@nexedi.com/T/#t
Cc: Vicent Marti <tanoku@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 6b8fda2d (pack-objects: use bitmaps when packing objects) there
are two codepaths in pack-objects: with & without using bitmap
reachability index.
However add_object_entry_from_bitmap(), despite its non-bitmapped
counterpart add_object_entry(), in no way does check for whether --local
or --honor-pack-keep or --incremental should be respected. In
non-bitmapped codepath this is handled in want_object_in_pack(), but
bitmapped codepath has simply no such checking at all.
The bitmapped codepath however was allowing to pass in all those options
and with bitmap indices still being used under such conditions -
potentially giving wrong output (e.g. including objects from non-local or
.keep'ed pack).
We can easily fix this by noting the following: when an object comes to
add_object_entry_from_bitmap() it can come for two reasons:
1. entries coming from main pack covered by bitmap index, and
2. object coming from, possibly alternate, loose or other packs.
"2" can be already handled by want_object_in_pack() and to cover
"1" we can teach want_object_in_pack() to expect that *found_pack can be
non-NULL, meaning calling client already found object's pack entry.
In want_object_in_pack() we care to start the checks from already found
pack, if we have one, this way determining the answer right away
in case neither --local nor --honour-pack-keep are active. In
particular, as p5310-pack-bitmaps.sh shows (3 consecutive runs), we do
not do harm to served-with-bitmap clones performance-wise:
Test 56dfeb62 this tree
-----------------------------------------------------------------
5310.2: repack to disk 9.08(8.20+0.25) 9.09(8.14+0.32) +0.1%
5310.3: simulated clone 1.92(2.12+0.08) 1.93(2.12+0.09) +0.5%
5310.4: simulated fetch 0.82(1.07+0.04) 0.82(1.06+0.04) +0.0%
5310.6: partial bitmap 1.96(2.42+0.13) 1.95(2.40+0.15) -0.5%
Test 56dfeb62 this tree
-----------------------------------------------------------------
5310.2: repack to disk 9.11(8.16+0.32) 9.11(8.19+0.28) +0.0%
5310.3: simulated clone 1.93(2.14+0.07) 1.92(2.11+0.10) -0.5%
5310.4: simulated fetch 0.82(1.06+0.04) 0.82(1.04+0.05) +0.0%
5310.6: partial bitmap 1.95(2.38+0.16) 1.94(2.39+0.14) -0.5%
Test 56dfeb62 this tree
-----------------------------------------------------------------
5310.2: repack to disk 9.13(8.17+0.31) 9.07(8.13+0.28) -0.7%
5310.3: simulated clone 1.92(2.13+0.07) 1.91(2.12+0.06) -0.5%
5310.4: simulated fetch 0.82(1.08+0.03) 0.82(1.08+0.03) +0.0%
5310.6: partial bitmap 1.96(2.43+0.14) 1.96(2.42+0.14) +0.0%
with delta timings showing they are all within noise from run to run.
In the general case we do not want to call find_pack_entry_one() more than
once, because it is expensive. This patch splits the loop in
want_object_in_pack() into two parts: finding the object and seeing if it
impacts our choice to include it in the pack. We may call the inexpensive
want_found_object() twice, but we will never call find_pack_entry_one() if we
do not need to.
I appreciate help and discussing this change with Junio C Hamano and
Jeff King.
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We may remove elements from the list while we are iterating,
which requires using a second temporary pointer. Otherwise
stepping to the next element of the list might involve
looking at freed memory (which generally works in practice,
as we _just_ freed it, but of course is wrong to rely on;
valgrind notices it).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With this patch, --batch can be combined with --textconv or --filters.
For this to work, the input needs to have the form
<object name><single white space><path>
so that the filters can be chosen appropriately.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are circumstances when it is relatively easy to figure out the
object name for a given path, but not the name of the containing tree.
For example, when looking at a diff generated by Git, the object names
are recorded, but not the revision. As a matter of fact, the revisions
from which the diff was generated may not even exist locally.
In such a case, the user would have to generate a fake revision just to
be able to use --textconv or --filters.
Let's simplify this dramatically, because we do not really need that
revision at all: all we care about is that we know the path. In the
scenario described above, we do know the path, and we just want to
specify it separately from the object name.
Example usage:
git cat-file --textconv --path=main.c 0f1937fd
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --filters option applies the convert_to_working_tree() filter for
the path when showing the contents of a regular file blob object;
the contents are written out as-is for other types of objects.
This feature comes in handy when a 3rd-party tool wants to work with
the contents of files from past revisions as if they had been checked
out, but without detouring via temporary files.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Alternate refs backends might still use files to store per-worktree
refs. So provide a way to iterate over only the per-worktree references
in a ref_store. The other backend can set up a files ref_store and
iterate using the new DO_FOR_EACH_PER_WORKTREE_ONLY flag when iterating.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of including a files-backend-specific struct ref_lock, change
the generic ref_update struct to include a void pointer that backends
can use for their own arbitrary data.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Alternate refs backends might not need the refs/heads directory and so
on, so we make ref db initialization part of the backend.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the file-based backend, delete_refs has some special optimization
to deal with packed refs. In other backends, we might be able to make
ref deletion faster by putting all deletions into a single
transaction. So we need a special backend function for this.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the file-based backend, the reflog piggybacks on the ref lock.
Since other backends won't have the same sort of ref lock, ref backends
must also handle reflogs.
Signed-off-by: Ronnie Sahlberg <rsahlberg@google.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For now it only supports the main reference store.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reference backends will be able to customize this function to implement
reference reading.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we don't have to strip trailing '/' from the submodule path, then
don't allocate and copy the submodule name.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
resolve_ref_recursively() can handle references in arbitrary files
reference stores, so use it to resolve "gitlink" (i.e., submodule)
references. Aside from removing redundant code, this allows submodule
lookups to benefit from the much more robust code that we use for
reading non-submodule references. And, since the code is now agnostic
about reference backends, it will work for any future references
backend (so move its definition to refs.c).
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new function, resolve_ref_recursively(), which is basically like
the old resolve_ref_unsafe() except that it takes a (ref_store *)
argument and also works for submodules.
Re-implement resolve_ref_unsafe() as a thin wrapper around
resolve_ref_recursively().
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that resolve_packed_ref() can work with an arbitrary
files_ref_store, there is no need to have a separate
resolve_gitlink_packed_ref() function.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move resolve_gitlink_ref() and related functions lower in the file to
avoid the need for forward declarations in the next step.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions currently only work in the main repository, so add an
assert_main_repository() check to each function.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want ref_stores to be polymorphic, so invent a base class of which
files_ref_store is a derived class. For now there is exactly one
ref_store for the main repository and one for any submodules whose
references have been accessed.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a `struct ref_storage_be` to represent types of reference stores. In
OO notation, this is the class, and will soon hold some class
methods (e.g., a factory to create new ref_store instances) and will
also serve as the vtable for ref_store instances of that type.
As yet, the backends cannot do anything.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The greater goal of this patch series is to develop the concept of a
reference store, which is a place that references, their values, and
their reflogs are stored, and to virtualize the reference interface so
that different types of ref_stores can be implemented. We will then, for
example, use ref_store instances to access submodule references and
worktree references.
Currently, we keep a ref_cache for each submodule that has had its
references iterated over. It is a far cry from a ref_store, but they are
stored the way we will want to store ref_stores, and ref_stores will
eventually have to hold the reference caches. So let's treat ref_caches
as embryo ref_stores, and build them out from there.
As the first step, simply rename `ref_cache` to `files_ref_store`, and
rename some functions and attributes correspondingly.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning an empty repository served by standard git, "git clone" produces
the following reassuring message:
$ git clone git://localhost/tmp/empty
Cloning into 'empty'...
warning: You appear to have cloned an empty repository.
Checking connectivity... done.
Meanwhile when cloning an empty repository served by JGit, the output is more
haphazard:
$ git clone git://localhost/tmp/empty
Cloning into 'empty'...
Checking connectivity... done.
warning: remote HEAD refers to nonexistent ref, unable to checkout.
This is a common command to run immediately after creating a remote repository
as preparation for adding content to populate it and pushing. The warning is
confusing and needlessly worrying.
The cause is that, since v3.1.0.201309270735-rc1~22 (Advertise capabilities
with no refs in upload service., 2013-08-08), JGit's ref advertisement includes
a ref named capabilities^{} to advertise its capabilities on, while git's ref
advertisement is empty in this case. This allows the client to learn about the
server's capabilities and is needed, for example, for fetch-by-sha1 to work
when no refs are advertised.
This also affects "ls-remote". For example, against an empty repository served
by JGit:
$ git ls-remote git://localhost/tmp/empty
0000000000000000000000000000000000000000 capabilities^{}
Git advertises the same capabilities^{} ref in its ref advertisement for push
but since it never did so for fetch, the client didn't need to handle this
case. Handle it.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A server hanging up immediately to mark access being denied does not
send any .have refs, shallow lines, or anything else before hanging
up. If the server has sent anything, then the hangup is unexpected.
That is, if the server hangs up after a shallow line but before sending
any refs, then git should tell me so:
fatal: The remote end hung up upon initial contact
instead of suggesting an access control problem:
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Noticed while examining this code. This case isn't likely to come up
in practice but tightening the check makes the code easier to read and
manipulate.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This enables JGIT to be used as a prereq in invocations of
test_expect_success (and other functions) in other test scripts.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future caller of read_and_refresh_cache() may want to do more than just
print some helpful advice in case of failure.
Suggested by Junio Hamano.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain
notice the error and handle it (by dying, still).
The only callers of checkout_fast_forward(), cmd_merge(),
pull_into_void(), cmd_pull() and sequencer's fast_forward_to(),
already check the return value and handle it appropriately. With this
step, we make it notice an error return from this function.
So this is a safe conversion to make checkout_fast_forward()
callable from new callers that want it not to die, without changing
the external behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain
notice the error and handle it (by dying, still).
The only caller of fast_forward_to(), do_pick_commit() already checks
the return value and passes it on to its callers, so its caller must
be already prepared to handle error returns, and with this step, we
make it notice an error return from this function.
So this is a safe conversion to make fast_forward_to() callable from
new callers that want it not to die, without changing the external
behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain notice
the error and handle it (by dying, still).
The only caller of save_opts(), sequencer_pick_revisions() can already
return errors, so its caller must be already prepared to handle error
returns, and with this step, we make it notice an error return from
this function.
So this is a safe conversion to make save_opts() callable from new
callers that want it not to die, without changing the external
behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain notice
the error and handle it (by dying, still).
The only caller of save_todo(), pick_commits() can already return
errors, so its caller must be already prepared to handle error
returns, and with this step, we make it notice an error return from
this function.
So this is a safe conversion to make save_todo() callable
from new callers that want it not to die, without changing the
external behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain notice
the error and handle it (by dying, still).
The only caller of save_head(), sequencer_pick_revisions() can already
return errors, so its caller must be already prepared to handle error
returns, and with this step, we make it notice an error return from
this function.
So this is a safe conversion to make save_head() callable from new
callers that want it not to die, without changing the external
behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain notice
the error and handle it (by dying, still).
The only caller of create_seq_dir(), sequencer_pick_revisions() can
already return errors, so its caller must be already prepared to
handle error returns, and with this step, we make it notice an error
return from this function.
So this is a safe conversion to make create_seq_dir() callable from
new callers that want it not to die, without changing the external
behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain notice
the error and handle it (by dying, still).
The only caller of read_populate_opts(), sequencer_continue() can
already return errors, so its caller must be already prepared to
handle error returns, and with this step, we make it notice an error
return from this function.
So this is a safe conversion to make read_populate_opts() callable
from new callers that want it not to die, without changing the
external behaviour of anything existing.
Note that the function git_config_from_file(), called from
read_populate_opts(), can currently still die() (in git_parse_source(),
because the do_config_from_file() function sets die_on_error = 1). We do
not try to fix that here, as it would have larger ramifications on the
config code, and we also assume that we write the opts file
programmatically, hence any parse errors would be bugs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain
notice the error and handle it (by dying, still).
The only caller of read_populate_todo(), sequencer_continue() can
already return errors, so its caller must be already prepared to
handle error returns, and with this step, we make it notice an
error return from this function.
So this is a safe conversion to make read_populate_todo() callable
from new callers that want it not to die, without changing the
external behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain
notice the error and handle it (by dying, still).
There are two call sites of read_and_refresh_cache(), one of which is
pick_commits(), whose callers were already prepared to do the right
thing given an "error" return from it by an earlier patch, so the
conversion is safe.
The other one, sequencer_pick_revisions() was also prepared to relay
an error return back to its caller in all remaining cases in an
earlier patch.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain notice
the error and handle it (by dying, still).
The only caller of prepare_revs(), walk_revs_populate_todo() was just
taught to return errors, after verifying that its callers are prepared
to handle error returns, and with this step, we make it notice an
error return from this function.
So this is a safe conversion to make prepare_revs() callable from new
callers that want it not to die, without changing the external
behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain notice
the error and handle it (by dying, still).
The function sequencer_pick_revisions() is the only caller of
walk_revs_populate_todo(), and it already returns errors
appropriately, so its caller must be already prepared to handle error
returns, and with this step, we make it notice an error return from
this function.
So this is a safe conversion to make walk_revs_populate_todo()
callable from new callers that want it not to die, without changing
the external behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain notice
the error and handle it (by dying, still).
The only two callers of do_pick_commit(), pick_commits() and
single_pick() already check the return value and pass it on to their
callers, so their callers must be already prepared to handle error
returns, and with this step, we make it notice an error return from
this function.
So this is a safe conversion to make do_pick_commit() callable from
new callers that want it not to die, without changing the external
behaviour of anything existing.
While at it, remove the superfluous space.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain
notice the error and handle it (by dying, still).
The only caller of do_recursive_merge(), do_pick_commit() already
checks the return value and passes it on to its callers, so its caller
must be already prepared to handle error returns, and with this step,
we make it notice an error return from this function.
So this is a safe conversion to make do_recursive_merge() callable
from new callers that want it not to die, without changing the
external behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain
notice the error and handle it (by dying, still).
The only caller of write_message(), do_pick_commit() already checks
the return value and passes it on to its callers, so its caller must
be already prepared to handle error returns, and with this step, we
make it notice an error return from this function.
So this is a safe conversion to make write_message() callable
from new callers that want it not to die, without changing the
external behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"diff-highlight" script (in contrib/) learned to work better with
"git log -p --graph" output.
* bh/diff-highlight-graph:
diff-highlight: avoid highlighting combined diffs
diff-highlight: add multi-byte tests
diff-highlight: ignore test cruft
diff-highlight: add support for --graph output
diff-highlight: add failing test for handling --graph output
diff-highlight: add some tests
"git clone --resurse-submodules --reference $path $URL" is a way to
reduce network transfer cost by borrowing objects in an existing
$path repository when cloning the superproject from $URL; it
learned to also peek into $path for presense of corresponding
repositories of submodules and borrow objects from there when able.
* sb/submodule-clone-rr:
clone: recursive and reference option triggers submodule alternates
clone: implement optional references
clone: clarify option_reference as required
clone: factor out checking for an alternate path
submodule--helper update-clone: allow multiple references
submodule--helper module-clone: allow multiple references
t7408: merge short tests, factor out testing method
t7408: modernize style
Enhance "git status --porcelain" output by collecting more data on
the state of the index and the working tree files, which may
further be used to teach git-prompt (in contrib/) to make fewer
calls to git.
* jh/status-v2-porcelain:
status: unit tests for --porcelain=v2
test-lib-functions.sh: add lf_to_nul helper
git-status.txt: describe --porcelain=v2 format
status: print branch info with --porcelain=v2 --branch
status: print per-file porcelain v2 status data
status: collect per-file data for --porcelain=v2
status: support --porcelain[=<version>]
status: cleanup API to wt_status_print
status: rename long-format print routines
Clarify various ways to specify the "revision ranges" in the
documentation.
* po/range-doc:
doc: revisions: sort examples and fix alignment of the unchanged
doc: revisions: show revision expansion in examples
doc: revisions - clarify reachability examples
doc: revisions - define `reachable`
doc: gitrevisions - clarify 'latter case' is revision walk
doc: gitrevisions - use 'reachable' in page description
doc: revisions: single vs multi-parent notation comparison
doc: revisions: extra clarification of <rev>^! notation effects
doc: revisions: give headings for the two and three dot notations
doc: show the actual left, right, and boundary marks
doc: revisions - name the left and right sides
doc: use 'symmetric difference' consistently
"git nosuchcommand --help" said "No manual entry for gitnosuchcommand",
which was not intuitive, given that "git nosuchcommand" said "git:
'nosuchcommand' is not a git command".
* rt/help-unknown:
help: make option --help open man pages only for Git commands
help: introduce option --exclude-guides
An incoming "git push" that attempts to push too many bytes can now
be rejected by setting a new configuration variable at the receiving
end.
* cc/receive-pack-limit:
receive-pack: allow a maximum input size to be specified
unpack-objects: add --max-input-size=<size> option
index-pack: add --max-input-size=<size> option
"git format-patch --cover-letter HEAD^" to format a single patch
with a separate cover letter now numbers the output as [PATCH 0/1]
and [PATCH 1/1] by default.
* jk/format-patch-number-singleton-patch-with-cover:
format-patch: show 0/1 and 1/1 for singleton patch with cover letter
The delta-base-cache mechanism has been a key to the performance in
a repository with a tightly packed packfile, but it did not scale
well even with a larger value of core.deltaBaseCacheLimit.
* jk/delta-base-cache:
t/perf: add basic perf tests for delta base cache
delta_base_cache: use hashmap.h
delta_base_cache: drop special treatment of blobs
delta_base_cache: use list.h for LRU
release_delta_base_cache: reuse existing detach function
clear_delta_base_cache_entry: use a more descriptive name
cache_or_unpack_entry: drop keep_cache parameter
Convert uses of unsigned char [20] to struct object_id. Rename the
generically-named "ptr" to "old_oid" and make it const.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several places around the codebase want to pass update_ref data from
struct object_id, but update_ref may also be passed NULL pointers.
Instead of checking and dereferencing in every caller, create an
update_ref_oid which wraps update_ref and provides this functionality.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the callers of this function use struct object_id, so rename it
to get_oid_mb and make it take struct object_id instead of
unsigned char *.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert all functions to use struct object_id, and replace instances of
hardcoded 40, 41, and 42 with appropriate references to GIT_SHA1_HEXSZ.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert this file to use struct object_id, and additionally convert some
uses of the constant 40 to GIT_SHA1_HEXSZ.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since all of its callers have been updated, convert read_mmblob to take
a pointer to struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert each of this structure's members from an unsigned char array to
a struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert all the static functions that are not callbacks to struct
object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since all of its callers have been updated, modify stream_blob_to_fd to
take a struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since all of its callers have been updated, make textconv_object take a
struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert all of the static functions that are not callbacks to use struct
object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib,
plus the actual change to the struct:
@@
struct expand_data E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct expand_data *E1;
@@
- E1->sha1
+ E1->oid.hash
@@
struct expand_data E1;
@@
- E1.delta_base_sha1
+ E1.delta_base_oid.hash
@@
struct expand_data *E1;
@@
- E1->delta_base_sha1
+ E1->delta_base_oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct origin to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib,
plus the actual change to the struct:
@@
struct origin E1;
@@
- E1.blob_sha1
+ E1.blob_oid.hash
@@
struct origin *E1;
@@
- E1->blob_sha1
+ E1->blob_oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were several static functions using unsigned char arrays for SHA-1
values. Convert them to use struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:
@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This replaces run_apply() implementation with a new one that
uses the apply API that has been previously prepared in
apply.c and apply.h.
This shoud improve performance a lot in certain cases.
As the previous implementation was creating a new `git apply`
process to apply each patch, it could be slow on systems like
Windows where it is costly to create new processes.
Also the new `git apply` process had to read the index from
disk, and when the process was done the calling process
discarded its own index and read back from disk the new
index that had been created by the `git apply` process.
This could be very inefficient with big repositories that
have big index files, especially when the system decided
that it was a good idea to run the `git apply` processes on
a different processor core.
Also eliminating index reads enables further performance
improvements by using:
`git update-index --split-index`
For example here is a benchmark of a multi hundred commit
rebase on the Linux kernel on a Debian laptop with SSD:
command: git rebase --onto 1993b17 52bef0c 29dde7c
Vanilla "next" without split index: 1m54.953s
Vanilla "next" with split index: 1m22.476s
This series on top of "next" without split index: 1m12.034s
This series on top of "next" with split index: 0m15.678s
(using branch "next" from mid April 2016.)
Benchmarked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes we want to apply in a different index file.
Before the apply functionality was libified it was possible to
use the GIT_INDEX_FILE environment variable, for this purpose.
But now, as the apply functionality has been libified, it should
be possible to do that in a libified way.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify git apply functionality, we will need to read from a
different index file in get_current_sha1(). This index file will be
stored in "struct apply_state", so let's pass the state to
build_fake_ancestor() which will later pass it to get_current_sha1().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parsing `git apply` options can be useful to other commands that
want to call the libified apply functionality, because this way
they can easily pass some options from their own command line to
the libified apply functionality.
This will be used by `git am` in a following patch.
To make this possible, let's refactor the `git apply` option
parsing code into a new libified apply_parse_options() function.
Doing that makes it possible to remove some functions definitions
from "apply.h" and make them static in "apply.c".
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To avoid printing anything when applying with
`state->apply_verbosity == verbosity_silent`, let's save the
existing warn and error routines before applying, and let's
replace them with a routine that does nothing.
Then after applying, let's restore the saved routines.
Note that, as we need to restore the saved routines in all
cases, we cannot return early any more in apply_all_patches().
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's make it possible to get the current error_routine and warn_routine,
so that we can store them before using set_error_routine() or
set_warn_routine() to use new ones.
This way we will be able put back the original routines, when we are done
with using new ones.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are already set_die_routine() and set_error_routine(),
so let's add set_warn_routine() as this will be needed in a
following commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When apply_verbosity is set to verbosity_silent nothing should be
printed on both stderr and stdout.
To avoid printing on stdout, we can just skip calling the following
functions:
- stat_patch_list(),
- numstat_patch_list(),
- summary_patch_list().
It is safe to do that because the above functions have no side
effects other than printing:
- stat_patch_list() only computes some local values and then call
show_stats() and print_stat_summary(), those two functions only
compute local values and call printing functions,
- numstat_patch_list() also only computes local values and calls
printing functions,
- summary_patch_list() calls show_file_mode_name(), printf(),
show_rename_copy(), show_mode_change() that are only printing.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This changes 'int apply_verbosely' into 'enum apply_verbosity', and
changes the possible values of the variable from a bool to
a tristate.
The previous 'false' state is changed into 'verbosity_normal'.
The previous 'true' state is changed into 'verbosity_verbose'.
The new added state is 'verbosity_silent'. It should prevent
anything to be printed on both stderr and stdout.
This is needed because `git am` wants to first call apply
functionality silently, if it can then fall back on 3-way merge
in case of error.
Printing on stdout, and calls to warning() or error() are not
taken care of in this patch, as that will be done in following
patches.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To avoid possible mistakes and to uniformly show the errno
related messages, let's use error_errno() where possible.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some parsing functions that were used in both "apply.c" and
"builtin/apply.c" are now only used in the former, so they
can be made static to "apply.c".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As most of the apply code in builtin/apply.c has been libified by a number of
previous commits, it can now be moved to apply.{c,h}, so that more code can
use it.
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The constants for the "inaccurate-eof" and the "recount" options will
be used in both "apply.c" and "builtin/apply.c", so they need to go
into "apply.h", and therefore they need a name that is more specific
to the API they belong to.
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As these functions are going to be part of the libified
apply API, let's give them a name that is more specific
to the apply API.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", create_one_file() should return -1 instead of
calling exit().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", try_create_file() should return -1 in case of
error.
Unfortunately try_create_file() currently returns -1 to signal a
recoverable error. To fix that, let's make it return 1 in case of
a recoverable error and -1 in case of an unrecoverable error.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git-diff and friends a new format for displaying the difference
of a submodule. The new format is an inline diff of the contents of the
submodule between the commit range of the update. This allows the user
to see the actual code change caused by a submodule update.
Add tests for the new format and option.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future patch is going to add a new submodule diff format which
displays an inline diff of the submodule changes. To make this easier,
and to ensure that both submodule diff formats use the same initial
header, factor out show_submodule_header() function which will print the
current submodule header line, and then leave the show_submodule_summary
function to lookup and print the submodule log format.
This does create one format change in that "(revision walker failed)"
will now be displayed on its own line rather than as part of the message
because we no longer perform this step directly in the header display
flow. However, this is a rare case as most causes of the failure will be
due to a missing commit which we already check for and avoid previously.
flow. However, this is a rare case and shouldn't impact much.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we're going to be changing this function in a future patch, lets
go ahead and convert this to use object_id now.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, do_submodule_path will attempt locating the .git directory by
using read_gitfile on <path>/.git. If this fails it just assumes the
<path>/.git is actually a git directory.
This is good because it allows for handling submodules which were cloned
in a regular manner first before being added to the superproject.
Unfortunately this fails if the <path> is not actually checked out any
longer, such as by removing the directory.
Fix this by checking if the directory we found is actually a gitdir. In
the case it is not, attempt to lookup the submodule configuration and
find the name of where it is stored in the .git/modules/ directory of
the superproject.
If we can't locate the submodule configuration, this might occur because
for example a submodule gitlink was added but the corresponding
.gitmodules file was not properly updated. A die() here would not be
pleasant to the users of submodule diff formats, so instead, modify
do_submodule_path() to return an error code:
- git_pathdup_submodule() returns NULL when we fail to find a path.
- strbuf_git_path_submodule() propagates the error code to the caller.
Modify the callers of these functions to check the error code and fail
properly. This ensures we don't attempt to use a bad path that doesn't
match the corresponding submodule.
Because this change fixes add_submodule_odb() to work even if the
submodule is not checked out, update the wording of the submodule log
diff format to correctly display that the submodule is "not initialized"
instead of "not checked out"
Add tests to ensure this change works as expected.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future patch will add a new format for displaying the difference of
a submodule. Make it easier by changing how we store the current
selected format. Replace the DIFF_OPT flag with an enumeration, as each
format will be mutually exclusive.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an extension to git-diff and git-log (and any other graph-aware
displayable output) such that "--line-prefix=<string>" will print the
additional line-prefix on every line of output.
To make this work, we have to fix a few bugs in the graph API that force
graph_show_commit_msg to be used only when you have a valid graph.
Additionally, we extend the default_diff_output_prefix handler to work
even when no graph is enabled.
This is somewhat of a hack on top of the graph API, but I think it
should be acceptable here.
This will be used by a future extension of submodule display which
displays the submodule diff as the actual diff between the pre and post
commit in the submodule project.
Add some tests for both git-log and git-diff to ensure that the prefix
is honored correctly.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"diff/log --stat" has a logic that determines the display columns
available for the diffstat part of the output and apportions it for
pathnames and diffstat graph automatically.
5e71a84a (Add output_prefix_length to diff_options, 2012-04-16)
added the output_prefix_length field to diff_options structure to
allow this logic to subtract the display columns used for the
history graph part from the total "terminal width"; this matters
when the "git log --graph -p" option is in use.
The field must be set to the number of display columns needed to
show the output from the output_prefix() callback, which is error
prone. As there is only one user of the field, and the user has the
actual value of the prefix string, let's get rid of the field and
have the user count the display width itself.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to is_null_oid(), and is_empty_blob_sha1() add an
empty_tree_oid along with helper function is_empty_tree_oid(). For
completeness, also add an "is_empty_tree_sha1()",
"is_empty_blob_sha1()", "is_empty_tree_oid()" and "is_empty_blob_oid()"
helpers.
To ensure we only get one singleton, implement EMPTY_BLOB_SHA1_BIN as
simply getting the hash of empty_blob_oid structure.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If option --help is passed to a Git command, we try to open
the man page of that command. However, we do it for both commands
and concepts. Make sure it is an actual command.
This makes "git <concept> --help" not working anymore, while
"git help <concept>" still works.
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce option --exclude-guides to the help command. With this option
being passed, "git help" will open man pages only for actual commands.
Since we know it is a command, we can use function help_unknown_command
to give the user advice on typos.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain
notice the error and handle it (by dying, still).
The eventual caller of do_pick_commit() is sequencer_pick_revisions(),
which already relays a reported error from its helper functions
(including this one), and both of its two callers know how to react to
a negative return correctly.
So this makes do_pick_commit() callable from new callers that want it
not to die, without changing the external behaviour of anything
existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying there, let the caller high up in the callchain notice
the error and handle it (by dying, still).
The function sequencer_pick_revisions() has only two callers,
cmd_revert() and cmd_cherry_pick(), both of which check the return
value and react appropriately upon errors.
So this is a safe conversion to make sequencer_pick_revisions()
callable from new callers that want it not to die, without changing
the external behaviour of anything existing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Receive-pack feeds its input to either index-pack or
unpack-objects, which will happily accept as many bytes as
a sender is willing to provide. Let's allow an arbitrary
cutoff point where we will stop writing bytes to disk.
Cleaning up what has already been written to disk is a
related problem that is not addressed by this patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When receiving a pack-file, it can be useful to abort the
`git unpack-objects`, if the pack-file is too big.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When receiving a pack-file, it can be useful to abort the
`git index-pack`, if the pack-file is too big.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the default behavior of git-format-patch to generate numbered
sequence of 0/1 and 1/1 when generating both a cover-letter and a single
patch. This standardizes the cover letter to have 0/N which helps
distinguish the cover letter from the patch itself. Since the behavior
is easily changed via configuration as well as the use of -n and -N this
should be acceptable default behavior.
Add tests for the new default behavior.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This just shows off the improvements done by the last few
patches, and gives us a baseline for noticing regressions in
the future. Here are the results with linux.git as the perf
"large repo":
Test origin HEAD
-------------------------------------------------------------------
0003.1: log --raw 43.41(40.36+2.69) 33.86(30.96+2.41) -22.0%
0003.2: log -S 313.61(309.74+3.78) 298.75(295.58+3.00) -4.7%
(for a large repo, the "log -S" improvements are greater if
you bump the delta base cache limit, but I think it makes
sense to test the "stock" behavior, since that is what most
people will see).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fundamental data structure of the delta base cache is a
hash table mapping pairs of "(packfile, offset)" into
structs containing the actual object data. The hash table
implementation dates back to e5e0161 (Implement a simple
delta_base cache, 2007-03-17), and uses a fixed-size table.
The current size is a hard-coded 256 entries.
Because we need to be able to remove objects from the hash
table, entry lookup does not do any kind of probing to
handle collisions. Colliding items simply replace whatever
is in their slot. As a result, we have fewer usable slots
than even the 256 we allocate. At half full, each new item
has a 50% chance of displacing another one. Or another way
to think about it: every item has a 1/256 chance of being
ejected due to hash collision, without regard to our LRU
strategy.
So it would be interesting to see the effect of increasing
the cache size on the runtime for some common operations. As
with the previous patch, we'll measure "git log --raw" for
tree-only operations, and "git log -Sfoo --raw" for
operations that touch trees and blobs. All times are
wall-clock best-of-3, done against fully packed repos with
--depth=50, and the default core.deltaBaseCacheLimit of
96MB.
Here are timings for various values of MAX_DELTA_CACHE
against git.git (the asterisk marks the minimum time for
each operation):
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m02.227s 0m12.821s
512 0m02.143s 0m10.602s
1024 0m02.127s 0m08.642s
2048 0m02.148s 0m07.123s
4096 0m02.194s 0m06.448s*
8192 0m02.239s 0m06.504s
16384 0m02.144s* 0m06.502s
32768 0m02.202s 0m06.622s
65536 0m02.230s 0m06.677s
The log-raw case isn't changed much at all here (probably
because our trees just aren't that big in the first place,
or possibly because we have so _few_ trees in git.git that
the 256-entry cache is enough). But once we start putting
blobs in the cache, too, we see a big improvement (almost
50%). The curve levels off around 4096, which means that we
can hold about that many entries before hitting the 96MB
memory limit (or possibly that the workload is small enough
that there is simply no more work to be optimized out by
caching more).
(As a side note, I initially timed my existing git.git pack,
which was a base of --aggressive combined with some pulls on
top. So it had quite a few deeper delta chains. The
256-cache case was more like 15s, and it still dropped to
~6.5s in the same way).
Here are the timings for linux.git:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m41.661s 5m12.410s
512 0m39.547s 5m07.920s
1024 0m37.054s 4m54.666s
2048 0m35.871s 4m41.194s*
4096 0m34.646s 4m51.648s
8192 0m33.881s 4m55.342s
16384 0m35.190s 5m00.122s
32768 0m35.060s 4m58.851s
65536 0m33.311s* 4m51.420s
As we grow we see a nice 20% speedup in the tree traversal,
and more modest 10% in the log-S. This is probably an
indication that we are bound less by the number of entries,
and more by the memory limit (more on that below). What is
interesting is that the numbers bounce around a bit;
increasing the number of entries isn't always a strict
improvement.
Partially this is due to noise in the measurement. But it
may also be an indication that our LRU ejection scheme is
not optimal. The smaller cache sizes introduce some
randomness into the ejection (due to collisions), which may
sometimes work in our favor (and sometimes not!).
So what is the optimal setting of MAX_DELTA_CACHE? The
"bouncing" in the linux.git log-S numbers notwithstanding,
it mostly seems like bigger is better. And even if we were
to try to find a "sweet spot", these are just two
repositories, that are not necessarily representative. The
shape of history, the size of trees and blobs, the memory
limit configuration, etc, all will affect the outcome.
Rather than trying to find the "right" number, another
strategy is to just switch to a hash table that can actually
store collisions: namely our hashmap.h implementation.
Here are numbers for that compared to the "best" we saw from
adjusting MAX_DELTA_CACHE:
| log-raw | log-S
| best hashmap | best hashmap
| --------- --------- | --------- ---------
git | 0m02.144s 0m02.144s | 0m06.448s 0m06.688s
linux | 0m33.311s 0m33.092s | 4m41.194s 4m57.172s
We can see the results are similar in most cases, which is
what we'd expect. We're not ejecting due to collisions at
all, so this is purely representing the LRU. So really, we'd
expect this to model most closely the larger values of the
static MAX_DELTA_CACHE limit. And that does seem to be
what's happening, including the "bounce" in the linux log-S
case.
So while the value for that case _isn't_ as good as the
optimal one measured above (which was 2048 entries), given
the bouncing I'm hesitant to suggest that 2048 is any kind
of optimum (not even for linux.git, let alone as a general
rule). The generic hashmap has the appeal that it drops the
number of tweakable numbers by one, which means we can focus
on tuning other elements, like the LRU strategy or the
core.deltaBaseCacheLimit setting.
And indeed, if we bump the cache limit to 1G (which is
probably silly for general use, but maybe something people
with big workstations would want to do), the linux.git log-S
time drops to 3m32s. That's something you really _can't_ do
easily with the static hash table, because the number of
entries needs to grow in proportion to the memory limit (so
2048 is almost certainly not going to be the right value
there).
This patch takes that direction, and drops the static hash
table entirely in favor of using the hashmap.h API.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the delta base cache runs out of allowed memory, it has
to drop entries. It does so by walking an LRU list, dropping
objects until we are under the memory limit. But we actually
walk the list twice: once to drop blobs, and then again to
drop other objects (which are generally trees). This comes
from 18bdec1 (Limit the size of the new delta_base_cache,
2007-03-19).
This performs poorly as the number of entries grows, because
any time dropping blobs does not satisfy the limit, we have
to walk the _entire_ list, trees included, looking for blobs
to drop, before starting to drop any trees.
It's not generally a problem now, as the cache is limited to
only 256 entries. But as we could benefit from increasing
that in a future patch, it's worth looking at how it
performs as the cache size grows. And the answer is "not
well".
The table below shows times for various operations with
different values of MAX_DELTA_CACHE (which is not a run-time
knob; I recompiled with -DMAX_DELTA_CACHE=$n for each).
I chose "git log --raw" ("log-raw" in the table) because it
will access all of the trees, but no blobs at all (so in a
sense it is a worst case for this problem, because we will
always walk over the entire list of trees once before
realizing there are no blobs to drop). This is also
representative of other tree-only operations like "rev-list
--objects" and "git log -- <path>".
I also timed "git log -Sfoo --raw" ("log-S" in the table).
It similarly accesses all of the trees, but also the blobs
for each commit. It's representative of "git log -p", though
it emphasizes the cost of blob access more, as "-S" is
cheaper than computing an actual blob diff.
All timings are best-of-3 wall-clock times (though they all
were CPU bound, so the user CPU times are similar). The
repositories were fully packed with --depth=50, and the
default core.deltaBaseCacheLimit of 96M was in effect. The
current value of MAX_DELTA_CACHE is 256, so I started there
and worked up by factors of 2.
First, here are values for git.git (the asterisk signals the
fastest run for each operation):
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m02.212s 0m12.634s
512 0m02.136s* 0m10.614s
1024 0m02.156s 0m08.614s
2048 0m02.208s 0m07.062s
4096 0m02.190s 0m06.484s*
8192 0m02.176s 0m07.635s
16384 0m02.913s 0m19.845s
32768 0m03.617s 1m05.507s
65536 0m04.031s 1m18.488s
You can see that for the tree-only log-raw case, we don't
actually benefit that much as the cache grows (all the
differences up through 8192 are basically just noise; this
is probably because we don't actually have that many
distinct trees in git.git). But for log-S, we get a definite
speed improvement as the cache grows, but the improvements
are lost as cache size grows and the linear LRU management
starts to dominate.
Here's the same thing run against linux.git:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ----------
256 0m40.987s 5m13.216s
512 0m37.949s 5m03.243s
1024 0m35.977s 4m50.580s
2048 0m33.855s 4m39.818s
4096 0m32.913s 4m47.299s*
8192 0m32.176s* 5m14.650s
16384 0m32.185s 6m31.625s
32768 0m38.056s 9m31.136s
65536 1m30.518s 17m38.549s
The pattern is similar, though the effect in log-raw is more
pronounced here. The times dip down in the middle, and then
go back up as we keep growing.
So we know there's a problem. What's the solution?
The obvious one is to improve the data structure to avoid
walking over tree entries during the looking-for-blobs
traversal. We can do this by keeping _two_ LRU lists: one
for blobs, and one for other objects. We drop items from the
blob LRU first, and then from the tree LRU (if necessary).
Here's git.git using that strategy:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ----------
256 0m02.264s 0m12.830s
512 0m02.201s 0m10.771s
1024 0m02.181s 0m08.593s
2048 0m02.205s 0m07.116s
4096 0m02.158s 0m06.537s*
8192 0m02.213s 0m07.246s
16384 0m02.155s* 0m10.975s
32768 0m02.159s 0m16.047s
65536 0m02.181s 0m16.992s
The upswing on log-raw is gone completely. But log-S still
has it (albeit much better than without this strategy).
Let's see what linux.git shows:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m42.519s 5m14.654s
512 0m39.106s 5m04.708s
1024 0m36.802s 4m51.454s
2048 0m34.685s 4m39.378s*
4096 0m33.663s 4m44.047s
8192 0m33.157s 4m50.644s
16384 0m33.090s* 4m49.648s
32768 0m33.458s 4m53.371s
65536 0m33.563s 5m04.580s
The results are similar. The tree-only case again performs
well (not surprising; we're literally just dropping the one
useless walk, and not otherwise changing the cache eviction
strategy at all). But the log-S case again does a bit worse
as the cache grows (though possibly that's within the noise,
which is much larger for this case).
Perhaps this is an indication that the "remove blobs first"
strategy is not actually optimal. The intent of it is to
avoid blowing out the tree cache when we see large blobs,
but it also means we'll throw away useful, recent blobs in
favor of older trees.
Let's run the same numbers without caring about object type
at all (i.e., one LRU list, and always evicting whatever is
at the head, regardless of type).
Here's git.git:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m02.227s 0m12.821s
512 0m02.143s 0m10.602s
1024 0m02.127s 0m08.642s
2048 0m02.148s 0m07.123s
4096 0m02.194s 0m06.448s*
8192 0m02.239s 0m06.504s
16384 0m02.144s* 0m06.502s
32768 0m02.202s 0m06.622s
65536 0m02.230s 0m06.677s
Much smoother; there's no dramatic upswing as we increase
the cache size (some remains, though it's small enough that
it's mostly run-to-run noise. E.g., in the log-raw case,
note how 8192 is 50-100ms higher than its neighbors). Note
also that we stop getting any real benefit for log-S after
about 4096 entries; that number will depend on the size of
the repository, the size of the blob entries, and the memory
limit of the cache.
Let's see what linux.git shows for the same strategy:
MAX_DELTA_CACHE log-raw log-S
--------------- --------- ---------
256 0m41.661s 5m12.410s
512 0m39.547s 5m07.920s
1024 0m37.054s 4m54.666s
2048 0m35.871s 4m41.194s*
4096 0m34.646s 4m51.648s
8192 0m33.881s 4m55.342s
16384 0m35.190s 5m00.122s
32768 0m35.060s 4m58.851s
65536 0m33.311s* 4m51.420s
It's similarly good. As with the "separate blob LRU"
strategy, there's a lot of noise on the log-S run here. But
it's certainly not any worse, is possibly a bit better, and
the improvement over "separate blob LRU" on the git.git case
is dramatic.
So it seems like a clear winner, and that's what this patch
implements.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We keep an LRU list of entries for when we need to drop
something from an over-full cache. The list is implemented
as a circular doubly-linked list, which is exactly what
list.h provides. We can save a few lines by using the list.h
macros and functions. More importantly, this makes the code
easier to follow, as the reader sees explicit concepts like
"list_add_tail()" instead of pointer manipulation.
As a bonus, the list_entry() macro lets us place the lru
pointers anywhere inside the delta_base_cache_entry struct
(as opposed to just casting the pointer, which requires it
at the front of the struct). This will be useful in later
patches when we need to place other items at the front of
the struct (e.g., our hashmap implementation requires this).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function drops an entry entirely from the cache,
meaning that aside from the freeing of the buffer, it is
exactly equivalent to detach_delta_base_cache_entry(). Let's
build on top of the detach function, which shortens the code
and will make it simpler when we change out the underlying
storage in future patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The delta base cache entries are stored in a fixed-length
hash table. So the way to remove an entry is to "clear" the
slot in the table, and that is what this function does.
However, the name is a leaky abstraction. If we were to
change the hash table implementation, it would no longer be
about "clearing". We should name it after _what_ it does,
not _how_ it does it. I.e., something like "remove" instead
of "clear".
But that does not tell the whole story, either. The subtle
thing about this function is that it removes the entry, but
does not free the entry data. So a more descriptive name is
"detach"; we give ownership of the data buffer to the
caller, and remove any other resources.
This patch uses the name detach_delta_base_cache_entry().
We could further model this after functions like
strbuf_detach(), which pass back all of the detached
information. However, since there are so many bits of
information in the struct (the data, the size, the type),
and so few callers (only one), it's not worth that
awkwardness. The name change and a comment can make the
intent clear.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is only one caller of cache_or_unpack_entry() and it
always passes 1 for the keep_cache parameter. We can
simplify it by dropping the "!keep_cache" case.
Another call, which did pass 0, was dropped in abe601b
(sha1_file: remove recursion in unpack_entry, 2013-03-27),
as unpack_entry() now does more complicated things than a
simple unpack when there is a cache miss.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The idea of xdl_change_compact() is fairly simple:
* Proceed through groups of changed lines in the file to be compacted,
keeping track of the corresponding location in the "other" file.
* If possible, slide the group up and down to try to give the most
aesthetically pleasing diff. Whenever it is slid, the current location
in the other file needs to be adjusted.
But these simple concepts are obfuscated by a lot of index handling that
is written in terse, subtle, and varied patterns. I found it very hard
to convince myself that the function was correct.
So introduce a "struct group" that represents a group of changed lines
in a file. Add some functions that perform elementary operations on
groups:
* Initialize a group to the first group in a file
* Move to the next or previous group in a file
* Slide a group up or down
Even though the resulting code is longer, I think it is easier to
understand and review. Its performance is not changed
appreciably (though it would be if `group_next()` and `group_previous()`
were not inlined).
...and in fact, the rewriting helped me discover another bug in the
--compaction-heuristic code: The update of blank_lines was never done
for the highest possible position of the group. This means that it could
fail to slide the group to its highest possible position, even if that
position had a blank line as its last line. So for example, it yielded
the following diff:
$ git diff --no-index --compaction-heuristic a.txt b.txt
diff --git a/a.txt b/b.txt
index e53969f..0d60c5fe 100644
--- a/a.txt
+++ b/b.txt
@@ -1,3 +1,7 @@
1
A
+
+B
+
+A
2
when in fact the following diff is better (according to the rules of
--compaction-heuristic):
$ git diff --no-index --compaction-heuristic a.txt b.txt
diff --git a/a.txt b/b.txt
index e53969f..0d60c5fe 100644
--- a/a.txt
+++ b/b.txt
@@ -1,3 +1,7 @@
1
+A
+
+B
+
A
2
The new code gives the bottom answer.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no reason for it to take an array and two indexes as argument,
as it only accesses two elements of the array.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no reason for it to take an array and index as argument, as it
only accesses a single element of the array.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the changed group of lines can be matched to a group in the other
file, then that positioning should take precedence over the compaction
heuristic.
The old code tried the heuristic unconditionally, which cost redundant
effort and also was broken if the matching code had already shifted the
group higher than the blank line.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code branch used for the compaction heuristic forgot to keep ixo in
sync while the group was shifted. This is certainly wrong, as it causes
the two counters to get out of sync.
I *think* that this bug could also have caused the function to read past
the end of the rchgo array, though I haven't done the work to prove it
for sure. Here is my reasoning:
If ixo is not decremented correctly during one iteration of the outer
while loop, then it will loose sync with the ix counter. In particular,
ixo will be too large.
Suppose that the next iterations of the outer while loop (i.e.,
processing the next block of add/delete lines) don't have any sliders.
Then the ixo counter would be incremented by the number of non-changed
lines in xdf, which is the same as the number of non-changed lines in
xdfo that *should have* followed the group that experienced the
malfunction. But since ixo was too large at the end of that iteration,
it will be incremented past the end of the xdfo->rchg array, and will
try to read that memory illegally.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `--recursive` and `--reference` is given, it is reasonable to
expect that the submodules are created with references to the submodules
of the given alternate for the superproject.
An initial attempt to do this was presented to the mailing list, which
used flags that are passed around ("--super-reference") that instructed
the submodule clone to look for a reference in the submodules of the
referenced superproject. This is not well thought out, as any further
`submodule update` should also respect the initial setup.
When a new submodule is added to the superproject and the alternate
of the superproject does not know about that submodule yet, we rather
error out informing the user instead of being unclear if we did or did
not use a submodules alternate.
To solve this problem introduce new options that store the configuration
for what the user wanted originally.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch we want to try to create alternates for submodules,
but they might not exist in the referenced superproject. So add a way
to skip the non existing references and report them.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch we introduce optional references; To better distinguish
between optional and required references we rename the variable.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch we want to determine if a path is suitable as an
alternate from other commands than builtin/clone. Move the checking
functionality of `add_one_reference` to `compute_alternate_path` that is
defined in cache.h.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow the user to pass in multiple references to update_clone.
Currently this is only internal API, but once the shell script is
replaced by a C version, this is needed.
This fixes an API bug between the shell script and the helper.
Currently the helper accepts "--reference" "--reference=foo"
as a OPT_STRING whose value happens to be "--reference=foo", and
then uses
if (suc->reference)
argv_array_push(&child->args, suc->reference)
where suc->reference _is_ "--reference=foo" when invoking the
underlying "git clone", it cancels out.
With this change we omit one of the "--reference" arguments when
passing references from the shell script to the helper.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow users to pass in multiple references, just as clone accepts multiple
references as well.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests consisting of one line each can be consolidated to have fewer tests
to run as well as fewer lines of code.
When having just a few git commands, do not create a new shell but
use the -C flag in Git to execute in the correct directory.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No functional change intended. This commit only changes formatting
to the style we recently use, e.g. starting the body of a test with a
single quote on the same line as the header, and then having the test
indented in the following lines.
Whenever we change directories, we do that in subshells.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", write_out_results() should return -1 instead of
calling exit().
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", write_out_one_result() should just return what
remove_file() and create_file() are returning instead of calling
exit().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", create_file() should just return what
add_conflicted_stages_file() and add_index_file() are returning
instead of calling exit().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", add_index_file() should return -1 instead of
calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", add_conflicted_stages_file() should return -1
instead of calling die().
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", remove_file() should return -1 instead of
calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", build_fake_ancestor() should return -1 instead
of calling die().
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", die_on_unsafe_path() should return a negative
integer instead of calling die(), so while doing that let's change
its name to check_unsafe_path().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", gitdiff_*() functions should return -1 instead
of calling die().
A previous patch made it possible for gitdiff_*() functions to
return -1 in case of error. Let's take advantage of that to
make gitdiff_verify_name() return -1 on error, and to have
gitdiff_oldname() and gitdiff_newname() directly return
what gitdiff_verify_name() returns.
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The gitdiff_*() functions that are called as p->fn() in parse_git_header()
should return 1 instead of -1 in case of end of header or unrecognized
input, as these are not real errors. It just instructs the parser to break
out.
This makes it possible for gitdiff_*() functions to return -1 in case of a
real error. This will be done in a following patch.
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", parse_traditional_patch() should return -1
instead of calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To finish libifying the apply functionality, apply_all_patches() should not
die() or exit() in case of error, but return either 128 or 1, so that it
gives the same exit code as when die() or exit(1) is called. This way
scripts relying on the exit code don't need to be changed.
While doing that we must take care that file descriptors are properly closed
and, if needed, reset to a sensible value.
Also, according to the lockfile API, when finished with a lockfile, one
should either commit it or roll it back.
This is even more important now that the same lockfile can be passed
to init_apply_state() many times to be reused by series of calls to
the apply lib functions.
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we must make check_apply_state()
usable outside "builtin/apply.c".
Let's do that by moving it into "apply.c".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", check_apply_state() should return -1 instead of
calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", init_apply_state() should return -1 instead of
calling exit().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we must make init_apply_state()
usable outside "builtin/apply.c".
Let's do that by moving it into a new "apply.c".
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in "builtin/apply.c", parse_ignorewhitespace_option() should return
-1 instead of calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in builtin/apply.c, parse_whitespace_option() should return -1 instead
of calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in builtin/apply.c, parse_single_patch() should return a negative
integer instead of calling die().
Let's do that by using error() and let's adjust the related test
cases accordingly.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing or exit()ing.
To do that in a compatible manner with the rest of the error handling
in builtin/apply.c, parse_chunk() should return a negative integer
instead of calling die() or exit().
As parse_chunk() is called only by apply_patch() which already
returns either -1 or -128 when an error happened, let's make it also
return -1 or -128.
This makes it compatible with what find_header() and parse_binary()
already return.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a compatible manner with the rest of the error handling
in builtin/apply.c, let's make find_header() return -128 instead of
calling die().
We could make it return -1, unfortunately find_header() already
returns -1 when no header is found.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing. Let's do that by returning -1 instead of
die()ing in read_patch_file().
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors
to the caller instead of die()ing.
As a first step in this direction, let's make apply_patch() return
-1 or -128 in case of errors instead of dying. For now its only
caller apply_all_patches() will exit(128) when apply_patch()
return -128 and it will exit(1) when it returns -1.
We exit() with code 128 because that was what die() was doing
and we want to keep the distinction between exiting with code 1
and exiting with code 128.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we must make 'struct apply_state'
usable outside "builtin/apply.c".
Let's do that by creating a new "apply.h" and moving
'struct apply_state' there.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To prepare for some structs and constants being moved from
builtin/apply.c to apply.h, we should give them some more
specific names to avoid possible name collisions in the global
namespace.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add lf_to_nul helper function to test-lib-functions.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update status manpage to include information about
porcelain v2 format.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expand porcelain v2 output to include branch and tracking
branch information. This includes the commit id, the branch,
the upstream branch, and the ahead and behind counts.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Collect extra per-file data for porcelain V2 format.
The output of `git status --porcelain` leaves out many
details about the current status that clients might like
to have. This can force them to be less efficient as they
may need to launch secondary commands (and try to match
the logic within git) to accumulate this extra information.
For example, a GUI IDE might want the file mode to display
the correct icon for a changed item (without having to stat
it afterwards).
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the original implementation of want_object_in_pack(), we
always looked for the object in every pack, so the order did
not matter for performance.
As of the last few patches, however, we can now often break
out of the loop early after finding the first instance, and
avoid looking in the other packs at all. In this case, pack
order can make a big difference, because we'd like to find
the objects by looking at as few packs as possible.
This patch switches us to the same packed_git_mru list that
is now used by normal object lookups.
Here are timings for p5303 on linux.git:
Test HEAD^ HEAD
------------------------------------------------------------------------
5303.3: rev-list (1) 31.31(31.07+0.23) 31.28(31.00+0.27) -0.1%
5303.4: repack (1) 40.35(38.84+2.60) 40.53(39.31+2.32) +0.4%
5303.6: rev-list (50) 31.37(31.15+0.21) 31.41(31.16+0.24) +0.1%
5303.7: repack (50) 58.25(68.54+2.03) 47.28(57.66+1.89) -18.8%
5303.9: rev-list (1000) 31.91(31.57+0.33) 31.93(31.64+0.28) +0.1%
5303.10: repack (1000) 304.80(376.00+3.92) 87.21(159.54+2.84) -71.4%
The rev-list numbers are unchanged, which makes sense (they
are not exercising this code at all). The 50- and 1000-pack
repack cases show considerable improvement.
The single-pack repack case doesn't, of course; there's
nothing to improve. In fact, it gives us a baseline for how
fast we could possibly go. You can see that though rev-list
can approach the single-pack case even with 1000 packs,
repack doesn't. The reason is simple: the loop we are
optimizing is only part of what the repack is doing. After
the "counting" phase, we do delta compression, which is much
more expensive when there are multiple packs, because we
have fewer deltas we can reuse (you can also see that these
numbers come from a multicore machine; the CPU times are
much higher than the wall-clock times due to the delta
phase).
So the good news is that in cases with many packs, we used
to be dominated by the "counting" phase, and now we are
dominated by the delta compression (which is faster, and
which we have already parallelized).
Here are similar numbers for git.git:
Test HEAD^ HEAD
---------------------------------------------------------------------
5303.3: rev-list (1) 1.55(1.51+0.02) 1.54(1.53+0.00) -0.6%
5303.4: repack (1) 1.82(1.80+0.08) 1.82(1.78+0.09) +0.0%
5303.6: rev-list (50) 1.58(1.57+0.00) 1.58(1.56+0.01) +0.0%
5303.7: repack (50) 2.50(3.12+0.07) 2.31(2.95+0.06) -7.6%
5303.9: rev-list (1000) 2.22(2.20+0.02) 2.23(2.19+0.03) +0.5%
5303.10: repack (1000) 10.47(16.78+0.22) 7.50(13.76+0.22) -28.4%
Not as impressive in terms of percentage, but still
measurable wins. If you look at the wall-clock time
improvements in the 1000-pack case, you can see that linux
improved by roughly 10x as many seconds as git. That's
because it has roughly 10x as many objects, and we'd expect
this improvement to scale linearly with the number of
objects (since the number of packs is kept constant). It's
just that the "counting" phase is a smaller percentage of
the total time spent for a git.git repack, and hence the
percentage win is smaller.
The implementation itself is a straightforward use of the
MRU code. We only bother marking a pack as used when we know
that we are able to break early out of the loop, for two
reasons:
1. If we can't break out early, it does no good; we have
to visit each pack anyway, so we might as well avoid
even the minor overhead of managing the cache order.
2. The mru_mark() function reorders the list, which would
screw up our traversal. So it is only safe to mark when
we are about to break out of the loop. We could record
the found pack and mark it after the loop finishes, of
course, but that's more complicated and it doesn't buy
us anything due to (1).
Note that this reordering does have a potential impact on
the final pack, as we store only a single "found" pack for
each object, even if it is present in multiple packs. In
principle, any copy is acceptable, as they all refer to the
same content. But in practice, they may differ in whether
they are stored as deltas, against which base, etc. This may
have an impact on delta reuse, and even the delta search
(since we skip pairs that were already in the same pack).
It's not clear whether this change of order would hurt or
even help average cases, though. The most likely reason to
have duplicate objects is from the completion of thin packs
(e.g., you have some objects in a "base" pack, then receive
several pushes; the packs you receive may be thin on the
wire, with deltas that refer to bases outside the pack, but
we complete them with duplicate base objects when indexing
them).
In such a case the current code would always find the thin
duplicates (because we currently walk the packs in reverse
chronological order). Whereas with this patch, some of those
duplicates would be found in the base pack instead.
In my tests repacking a real-world case of linux.git with
3600 thin-pack pushes (on top of a large "base" pack), the
resulting pack was about 0.04% larger with this patch. On
the other hand, because we were more likely to hit the base
pack, there were more opportunities for delta reuse, and we
had 50,000 fewer objects to examine in the delta search.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not allow cycles in the delta graph of a pack (i.e., A
is a delta of B which is a delta of A) for the obvious
reason that you cannot actually access any of the objects in
such a case.
There's a last-ditch attempt to notice cycles during the
write phase, during which we issue a warning to the user and
write one of the objects out in full. However, this is
"last-ditch" for two reasons:
1. By this time, it's too late to find another delta for
the object, so the resulting pack is larger than it
otherwise could be.
2. The warning is there because this is something that
_shouldn't_ ever happen. If it does, then either:
a. a pack we are reusing deltas from had its own
cycle
b. we are reusing deltas from multiple packs, and
we found a cycle among them (i.e., A is a delta of
B in one pack, but B is a delta of A in another,
and we choose to use both deltas).
c. there is a bug in the delta-search code
So this code serves as a final check that none of these
things has happened, warns the user, and prevents us
from writing a bogus pack.
Right now, (2b) should never happen because of the static
ordering of packs in want_object_in_pack(). If two objects
have a delta relationship, then they must be in the same
pack, and therefore we will find them from that same pack.
However, a future patch would like to change that static
ordering, which will make (2b) a common occurrence. In
preparation, we should be able to handle those kinds of
cycles better. This patch does by introducing a
cycle-breaking step during the get_object_details() phase,
when we are deciding which deltas can be reused. That gives
us the chance to feed the objects into the delta search as
if the cycle did not exist.
We'll leave the detection and warning in the write_object()
phase in place, as it still serves as a check for case (2c).
This does mean we will stop warning for (2a). That case is
caused by bogus input packs, and we ideally would warn the
user about it. However, since those cycles show up after
picking reusable deltas, they look the same as (2b) to us;
our new code will break the cycles early and the last-ditch
check will never see them.
We could do analysis on any cycles that we find to
distinguish the two cases (i.e., it is a bogus pack if and
only if every delta in the cycle is in the same pack), but
we don't need to. If there is a cycle inside a pack, we'll
run into problems not only reusing the delta, but accessing
the object data at all. So when we try to dig up the actual
size of the object, we'll hit that same cycle and kick in
our usual complain-and-try-another-source code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some code may have a pack/offset pair for an object, but
would like to look up more information. Using
sha1_object_info() is too heavy-weight; it starts from the
sha1 and has to find the pack again (so not only does it waste
time, it might not even find the same instance).
In some cases, this problem is solved by helpers like
get_size_from_delta(), which is used by pack-objects to take
a shortcut for objects whose packed representation has
already been found. But there's no similar function for
getting the object type, for instance. Rather than introduce
one, let's just make the whole packed_object_info() available.
It is smart enough to spend effort only on the items the
caller wants.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An all-zero initializer is fine for this struct, but because
the first element is a pointer, call sites need to know to
use "NULL" instead of "0". Otherwise some static checkers
like "sparse" will complain; see d099b71 (Fix some sparse
warnings, 2013-07-18) for example. So let's provide an
initializer to make this easier to get right.
But let's also comment that memset() to zero is explicitly
OK[1]. One of the callers embeds object_info in another
struct which is initialized via memset (expand_data in
builtin/cat-file.c). Since our subset of C doesn't allow
assignment from a compound literal, handling this in any
other way is awkward, so we'd like to keep the ability to
initialize by memset(). By documenting this property, it
should make anybody who wants to change the initializer
think twice before doing so.
There's one other caller of interest. In parse_sha1_header(),
we did not initialize the struct fully in the first place.
This turned out not to be a bug because the sub-function it
calls does not look at any other fields except the ones we
did initialize. But that assumption might not hold in the
future, so it's a dangerous construct. This patch switches
it to initializing the whole struct, which protects us
against unexpected reads of the other fields.
[1] Obviously using memset() to initialize a pointer
violates the C standard, but we long ago decided that it
was an acceptable tradeoff in the real world.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update --porcelain argument to take optional version parameter
to allow multiple porcelain formats to be supported in the future.
The token "v1" is the default value and indicates the traditional
porcelain format. (The token "1" is an alias for that.)
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the API between builtin/commit.c and wt-status.[ch].
Hide the details of the various wt_*status_print() routines inside
wt-status.c behind a single (new) wt_status_print() routine.
Eliminate the switch statements from builtin/commit.c.
Allow details of new status formats to be isolated within wt-status.c
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the various wt_status_print*() routines to be
wt_longstatus_print*() to make it clear that these
routines are only concerned with the normal/long
status output and reduce developer confusion as other
status formats are added in the future.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An empty string as a pathspec element matches all paths. A buggy
script, however, could accidentally assign an empty string to a
variable that then gets passed to a Git command invocation, e.g.:
path=... compute a path to be removed in $path ...
git rm -r "$paht"
which would unintentionally remove all paths in the current
directory.
The fix for this issue requires a two-step approach. As there may be
existing scripts that knowingly use empty strings in this manner,
the first step simply gives a warning that (1) tells that an empty
string will become an invalid pathspec element and (2) asks the user
to use "." if they mean to match all.
For step two, a follow-up patch several release cycles later will
remove the warning and throw an error instead.
This patch is the first step.
Signed-off-by: Emily Xie <emilyxxie@gmail.com>
Reported-by: David Turner <novalis@novalis.org>
Mentored-by: Michail Denchev <mdenchev@gmail.com>
Thanks-to: Sarah Sharp <sarah@thesharps.us> and James Sharp <jamey@minilop.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of always requiring both ends of a range, we could DWIM
"OLD", which could be a misspelt "OLD..", to be a range that ends at
the current commit.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git blame --reverse OLD..NEW -- PATH" tells us to start from the
contents in PATH at OLD and observe how each line is changed while
the history develops up to NEW, and report for each line the latest
commit up to which the line survives in the original form.
If you say "git blame --reverse NEW -- PATH" by mistake, we complain
about the missing OLD, but we phrased it as "No commit to dig down
to?" In this case, however, we are digging up from OLD, so say so.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git-fetch, --depth argument is always relative with the latest
remote refs. This makes it a bit difficult to cover this use case,
where the user wants to make the shallow history, say 3 levels
deeper. It would work if remote refs have not moved yet, but nobody
can guarantee that, especially when that use case is performed a
couple months after the last clone or "git fetch --depth". Also,
modifying shallow boundary using --depth does not work well with
clones created by --since or --not.
This patch fixes that. A new argument --deepen=<N> will add <N> more (*)
parent commits to the current history regardless of where remote refs
are.
Have/Want negotiation is still respected. So if remote refs move, the
server will send two chunks: one between "have" and "want" and another
to extend shallow history. In theory, the client could send no "want"s
in order to get the second chunk only. But the protocol does not allow
that. Either you send no want lines, which means ls-remote; or you
have to send at least one want line that carries deep-relative to the
server..
The main work was done by Dongcan Jiang. I fixed it up here and there.
And of course all the bugs belong to me.
(*) We could even support --deepen=<N> where <N> is negative. In that
case we can cut some history from the shallow clone. This operation
(and --depth=<shorter depth>) does not require interaction with remote
side (and more complicated to implement as a result).
Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should allow the user to say "create a shallow clone of this branch
after version <some-tag>".
Short refs are accepted and expanded at the server side with expand_ref()
because we cannot expand (unknown) refs from the client side.
Like deepen-since, deepen-not cannot be used with deepen. But deepen-not
can be mixed with deepen-since. The result is exactly how you do the
command "git rev-list --since=... --not ref".
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is basically dwim_ref() without @{} support. To be used on the
server side where we want to expand abbreviated to full ref names and
nothing else. The first user is "git clone/fetch --shallow-exclude".
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should allow the user to say "create a shallow clone containing the
work from last year" (once the client side is fixed up, of course).
In theory deepen-since and deepen (aka --depth) can be used together to
draw the shallow boundary (whether it's intersection or union is up to
discussion, but if rev-list is used, it's likely intersection). However,
because deepen goes with a custom commit walker, we can't mix the two
yet.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of a custom commit walker like get_shallow_commits(), this new
function uses rev-list to mark NOT_SHALLOW to all reachable commits,
except borders. The definition of reachable is to be defined by the
protocol later. This makes it more flexible to define shallow boundary.
The way we find border is paint all reachable commits NOT_SHALLOW. Any
of them that "touches" commits without NOT_SHALLOW flag are considered
shallow (e.g. zero parents via grafting mechanism). Shallow commits and
their true parents are all marked SHALLOW. Then NOT_SHALLOW is removed
from shallow commits at the end.
There is an interesting observation. With a generic walker, we can
produce all kinds of shallow cutting. In the following graph, every
commit but "x" is reachable. "b" is a parent of "a".
x -- a -- o
/ /
x -- c -- b -- o
After this function is run, "a" and "c" are both considered shallow
commits. After grafting occurs at the client side, what we see is
a -- o
/
c -- b -- o
Notice that because of grafting, "a" has zero parents, so "b" is no
longer a parent of "a".
This is unfortunate and may be solved in two ways. The first is change
the way shallow grafting works and keep "a -- b" connection if "b"
exists and always ends at shallow commits (iow, no loose ends). This is
hard to detect, or at least not cheap to do.
The second way is mark one "x" as shallow commit instead of "a" and
produce this graph at client side:
x -- a -- o
/ /
c -- b -- o
More commits, but simpler grafting rules.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The shallow repo could be deepened or shortened when then user gives
--depth. But in future that won't be the only way to deepen/shorten a
repo. Stop relying on args->depth in this mode. Future deepening
methods can simply set this flag on instead of updating all these if
expressions.
The new name "deepen" was chosen after the command to define shallow
boundary in pack protocol. New commands also follow this tradition.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reduces the number of "if (verbose)" which makes it a bit easier
to read imo. It also makes it easier to redirect all these printouts,
to a file for example.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On error check_non_tip() will die and not closing file descriptors is no
big deal. The next patch will split the majority of this function out
for reuse in other cases, where die() may not be the only outcome. Same
story for popping SIGPIPE out of the signal chain. So let's make sure we
clean things up properly first.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also add some more comments in this code because it takes too long to
understand what it does (to me, who should be familiar enough to
understand this code well!)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the last patch, "result" and "backup" are the same. "result" used
to move, but the movement is now contained in send_shallow(). Delete
this redundant variable.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a prep step for further refactoring. Besides reindentation and
s/shallows\./shallows->/g, no other changes are expected.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For now we can handle two types, string and boolean, in
set_helper_option(). Later on we'll add string_list support, which does
not fit well. The new function strbuf_set_helper_option() can be reused
for a separate function that handles string-list.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git commit --amend preserves the author details unless --reset-author is
given.
git-gui discards the author details on amend.
Fix by reading the author details along with the commit message, and
setting the appropriate environment variables required for preserving
them.
Reported long ago in the mailing list[1].
[1] http://article.gmane.org/gmane.comp.version-control.git/243921
Signed-off-by: Orgad Shaneh <orgad.shaneh@audiocodes.com>
ALL_LIBFILES uses wildcard, which provides the result in directory
order. This order depends on the underlying filesystem on the
buildhost. To get reproducible builds it is required to sort such list
before using them.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
2015-05-01 15:53:06 +01:00
434 changed files with 58556 additions and 32593 deletions
* The following two options are parsed by parse_revision_opt()
* and are only included here to get included in the "-h"
* output:
*/
{OPTION_LOWLEVEL_CALLBACK,0,"indent-heuristic",NULL,NULL,N_("Use an experimental indent-based heuristic to improve diffs"),PARSE_OPT_NOARG,parse_opt_unknown_cb},
{OPTION_LOWLEVEL_CALLBACK,0,"compaction-heuristic",NULL,NULL,N_("Use an experimental blank-line-based heuristic to improve diffs"),PARSE_OPT_NOARG,parse_opt_unknown_cb},
OPT_BIT(0,"minimal",&xdl_opts,N_("Spend extra cycles to find better match"),XDF_NEED_MINIMAL),
OPT_STRING('S',NULL,&revs_file,N_("file"),N_("Use revisions from <file> instead of calling git-rev-list")),
OPT_STRING(0,"contents",&contents_from,N_("file"),N_("Use <file>'s contents as the final image")),
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.