* maint-2.22:
Git 2.22.5
Git 2.21.4
Git 2.20.5
Git 2.19.6
Git 2.18.5
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
* maint-2.21:
Git 2.21.4
Git 2.20.5
Git 2.19.6
Git 2.18.5
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
* maint-2.20:
Git 2.20.5
Git 2.19.6
Git 2.18.5
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
* maint-2.19:
Git 2.19.6
Git 2.18.5
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
* maint-2.18:
Git 2.18.5
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
* maint-2.17:
Git 2.17.6
unpack_trees(): start with a fresh lstat cache
run-command: invalidate lstat cache after a command finished
checkout: fix bug that makes checkout follow symlinks in leading path
In the previous commit, we intercepted calls to `rmdir()` to invalidate
the lstat cache in the successful case, so that the lstat cache could
not have the idea that a directory exists where there is none.
The same situation can arise, of course, when a separate process is
spawned (most notably, this is the case in `submodule_move_head()`).
Obviously, we cannot know whether a directory was removed in that
process, therefore we must invalidate the lstat cache afterwards.
Note: in contrast to `lstat_cache_aware_rmdir()`, we invalidate the
lstat cache even in case of an error: the process might have removed a
directory and still have failed afterwards.
Co-authored-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Before checking out a file, we have to confirm that all of its leading
components are real existing directories. And to reduce the number of
lstat() calls in this process, we cache the last leading path known to
contain only directories. However, when a path collision occurs (e.g.
when checking out case-sensitive files in case-insensitive file
systems), a cached path might have its file type changed on disk,
leaving the cache on an invalid state. Normally, this doesn't bring
any bad consequences as we usually check out files in index order, and
therefore, by the time the cached path becomes outdated, we no longer
need it anyway (because all files in that directory would have already
been written).
But, there are some users of the checkout machinery that do not always
follow the index order. In particular: checkout-index writes the paths
in the same order that they appear on the CLI (or stdin); and the
delayed checkout feature -- used when a long-running filter process
replies with "status=delayed" -- postpones the checkout of some entries,
thus modifying the checkout order.
When we have to check out an out-of-order entry and the lstat() cache is
invalid (due to a previous path collision), checkout_entry() may end up
using the invalid data and thrusting that the leading components are
real directories when, in reality, they are not. In the best case
scenario, where the directory was replaced by a regular file, the user
will get an error: "fatal: unable to create file 'foo/bar': Not a
directory". But if the directory was replaced by a symlink, checkout
could actually end up following the symlink and writing the file at a
wrong place, even outside the repository. Since delayed checkout is
affected by this bug, it could be used by an attacker to write
arbitrary files during the clone of a maliciously crafted repository.
Some candidate solutions considered were to disable the lstat() cache
during unordered checkouts or sort the entries before passing them to
the checkout machinery. But both ideas include some performance penalty
and they don't future-proof the code against new unordered use cases.
Instead, we now manually reset the lstat cache whenever we successfully
remove a directory. Note: We are not even checking whether the directory
was the same as the lstat cache points to because we might face a
scenario where the paths refer to the same location but differ due to
case folding, precomposed UTF-8 issues, or the presence of `..`
components in the path. Two regression tests, with case-collisions and
utf8-collisions, are also added for both checkout-index and delayed
checkout.
Note: to make the previously mentioned clone attack unfeasible, it would
be sufficient to reset the lstat cache only after the remove_subtree()
call inside checkout_entry(). This is the place where we would remove a
directory whose path collides with the path of another entry that we are
currently trying to check out (possibly a symlink). However, in the
interest of a thorough fix that does not leave Git open to
similar-but-not-identical attack vectors, we decided to intercept
all `rmdir()` calls in one fell swoop.
This addresses CVE-2021-21300.
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
* 'master' of github.com:ruester/git-po-de:
l10n: de.po: Fix typo in the German translation of octopus
l10n: de.po: Update German translation for Git 2.27.0
f1e3df3169 (t: increase test coverage of signature verification output,
2020-03-04) adds GPG dependent tests to t4202 and t6200 that were found
problematic with at least OpenBSD 6.7.
Using an escaped '|' for alternations works only in some implementations
of grep (e.g. GNU and busybox).
It is not part of POSIX[1] and not supported by some BSD, macOS, and
possibly other POSIX compatible implementations.
Use `grep -E`, and write it using extended regular expression.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03
Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To set the default hash algorithm you can set the `GIT_DEFAULT_HASH`
environment variable. In the documentation this variable is named
`GIT_DEFAULT_HASH_ALGORITHM`, which is incorrect.
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The explanation of the `--show-pulls` option added in commit 8d049e182e
("revision: --show-pulls adds helpful merges", 2020-04-10) consists of
several paragraphs and we use "+" throughout to tie them together in one
long chain of list continuations. Only thing is, we're not in any kind
of list, so these pluses end up being rendered literally.
The preceding few paragraphs describe `--ancestry-path` and there we
*do* have a list, since we've started one with `--ancestry-path::`. In
fact, we have several such lists for all the various history-simplifying
options we're discussing earlier in this file.
Thus, we're missing a list both from a consistency point of view and
from a practical rendering standpoint.
Let's start a list for `--show-pulls` where we start actually discussing
the option, and keep the paragraphs preceding it out of that list. That
is, drop all those pluses before the new list we're adding here.
Helped-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When building with asciidoc-8.4.5 (as found on CentOS/Red Hat 6), the
period in the "[[files-in-.gitignore-are-tracked]]" anchor is not
properly parsed as a section:
WARNING: gitfaq.txt: line 245: missing [[files-in-.gitignore-are-tracked]] section
The resulting XML file fails to validate with xmlto:
xmlto: /git/Documentation/gitfaq.xml does not validate (status 3)
xmlto: Fix document syntax or use --skip-validation option
/git/Documentation/gitfaq.xml:3: element refentry: validity error :
Element refentry content does not follow the DTD, expecting
(beginpage? , indexterm* , refentryinfo? , refmeta? , (remark | link
| olink | ulink)* , refnamediv+ , refsynopsisdiv? , (refsect1+ |
refsection+)), got (refmeta refnamediv refsynopsisdiv refsect1
refsect1 refsect1 refsect1 variablelist refsect1 refsect1 )
Document /git/Documentation/gitfaq.xml does not validate
Let's avoid breaking users of platforms which ship an old version of
asciidoc, since the cost to do so is quite low.
Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various doc fixes.
* ma/doc-fixes:
git-sparse-checkout.txt: add missing '
git-credential.txt: use list continuation
git-commit-graph.txt: fix list rendering
git-commit-graph.txt: fix grammo
date-formats.txt: fix list continuation
* https://github.com/prati0100/git-gui:
git-gui: Handle Ctrl + BS/Del in the commit msg
Subject: git-gui: fix syntax error because of missing semicolon
Allow deleting words backwards and forwards using Ctrl + Backspace and
Delete in the commit message buffer.
* il/ctrl-bs-del:
git-gui: Handle Ctrl + BS/Del in the commit msg
6c722cbe5a (bisect: allow CRLF line endings in "git bisect replay"
input, 2020-05-07) includes CR as a field separator, but relies on
it not being included in the last field, which breaks at least when
running under OpenBSD 6.7's sh.
Instead of just assume the CR will get swallowed, read the rest of
the line into an otherwise unused variable and ignore it everywhere
except on the call for git bisect start, where it matters.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'gitfaq.txt' was added in 2149b6748f (docs: add a FAQ, 2020-03-30),
it was added to the Makefile but not to command-list.txt.
Add it there also, so that the new FAQ is listed in the output of
`git help --guides`.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a BRE, that broke tests 30-32, 37-39, 42 at least with
OpenBSD 6.7; use a simpler ERE.
Fixes: d9f15d37f1 (pull: pass --autostash to merge, 2020-04-07)
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Seems to trigger a bug in at least OpenBSD's 6.7 sh where it is
interpreted as a history lookup and therefore fails 125-126, 128,
130.
Remove the subshell and get a space between ! and grep, so tests
pass successfully.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent attempt to make the test output nicer to view on CI
systems broke TAP output under bash. The effort has been reverted
to be re-attempted in the next cycle.
* jc/fix-tap-output-under-bash:
Revert "tests: when run in Bash, annotate test failures with file name/line number"
Revert "ci: add a problem matcher for GitHub Actions"
Revert "t/test_lib: avoid naked bash arrays in file_lineno"
Last-minute fix for our recent change to allow use of progress API
as a traceable region.
* ds/trace-log-progress-fix:
progress: call trace2_region_leave() only after calling _enter()
Instead of downloading Windows SDK for CI jobs for windows builds
from an external site (wingit.blob.core.windows.net), use the one
created in the windows-build job, to work around quota issues at
the external site.
* js/ci-sdk-download-fix:
ci: avoid pounding on the poor ci-artifacts container
When a binary file gets modified and renamed on both sides of history
to different locations, both files would be written to the working
tree but both would have the contents from "ours". This has been
corrected so that the path from each side gets their original content.
* en/merge-rename-rename-worktree-fix:
merge-recursive: fix rename/rename(1to2) for working tree with a binary
The multi-pack-index was added to the data verified by git-fsck in
ea5ae6c3 "fsck: verify multi-pack-index". This implementation was
based on the implementation for verifying the commit-graph, and a
copy-paste error kept the ERROR_COMMIT_GRAPH flag as the bit set
when an error appears in the multi-pack-index.
Add a new flag, ERROR_MULTI_PACK_INDEX, and use that instead.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
95acf11a3d ("diff: restrict when prefetching occurs", 2020-04-07) taught
diff to prefetch blobs in a more limited set of situations. These
limited situations include when the output format requires blob data,
and when inexact rename detection is needed.
There is an existing test case that tests inexact rename detection, but
it also uses an output format that requires blob data, resulting in the
inexact-rename-detection-only code not being tested. Update this test to
use the raw output format, which does not require blob data.
Thanks to Derrick Stolee for noticing this lapse in code coverage and
for doing the preliminary analysis [1].
[1] https://lore.kernel.org/git/853759d3-97c3-241f-98e1-990883cd204e@gmail.com/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On some platforms likes HP-UX, grep(1) doesn't understand "-a".
Let's switch to perl.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Where we explain the 'reapply' command, we don't properly wrap it in
single quote marks like we do with the other commands: We omit the
closing mark ("'reapply") and this ends up being rendered literally as
"'reapply". Add the missing "'".
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use list continuation to avoid the second and third paragraphs
rendering with a different indentation from the first one where we
describe the "url" attribute.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first list item follows immediately on the paragraph where we
introduce the list. This makes the "*" render literally as part of one
huge paragraph. (With AsciiDoc, everything is fine after that, but with
Asciidoctor, we get some minor follow-on errors.) Add an empty line --
with a list continuation ("+") -- to make the first list item render ok.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's easy to mix up the possessive "its" and "it's" ("it is"). Correct
an instance of this.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The blank line before the lone "+" means it isn't detected as a list
continuation, but instead renders literally, at least with AsciiDoc.
Drop the empty line and, while at it, add a closing period to the
preceding paragraph.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strftime(3) man page is outside of the Git suite. Refererence it as
we do other external man pages and avoid creating a broken link when
generating the HTML documentation.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a9f1f1f9f8 ("commit-slab.h: code split", 2018-05-19) split
commit-slab.h into commit-slab-decl.h and commit-slab-impl.h header
files, commit-slab-decl.h were left to use "COMMIT_SLAB_HDR_H",
while commit-slab-impl.h gained its own macro, "COMMIT_SLAB_IMPL_H".
As these two files use different include guards, there is nothing
broken, but let's update commit-slab-decl.h to match the convention
to name the include guard after the filename.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From e76eec3554 (ci: allow per-branch config for GitHub Actions,
2020-05-07), we started to allow contributors decide which branch
they want to build with GitHub Actions
by checking for a file named "ci/config/allow-ref".
In order to assist those contributors,
we provided a sample in "ci/config/allow-refs.sample",
and instructed them to drop the ".sample",
then commit that file to their repository.
We've misspelt the filename in that change.
Let's fix the spelling.
While we're at it, also instruct our contributors introduce that new
file to Git before commit, in case of they've never told Git before.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On hppa these tests crash because the allocated stack space is too
small, even after it was doubled in b9a190789 (and the data size
doubled to match) to make it work on powerpc. For this arch just
skip these tests, which is enough to make the whole suite pass.
Fixes: https://bugs.debian.org/757402
Based-on-patch-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Greg Price <gnprice@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 676eb0c1ce0d380478eb16bdc5a3f2a7bc01c1d2;
as we will be reverting the change to show these extra output
tokens under bash, the pattern would not match anything.
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 303775a25f0b4ac5d6ad2e96eb4404c24209cad8;
instead of trying to salvage the tap-breaking change, let's
revert the whole thing for now.
A user of progress API calls start_progress() conditionally and
depends on the display_progress() and stop_progress() functions to
become no-op when start_progress() hasn't been called.
As we added a call to trace2_region_enter() to start_progress(), the
calls to other trace2 API calls from the progress API functions must
make sure that these trace2 calls are skipped when start_progress()
hasn't been called on the progress struct. Specifically, do not
call trace2_region_leave() from stop_progress() when we haven't
called start_progress(), which would have called the matching
trace2_region_enter().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When this developer tested how the git-sdk-64-minimal artifact could be
served to all the GitHub workflow runs that need it, Azure Blobs looked
like a pretty good choice: it is reliable, fast and we already use it in
Git for Windows to serve components like OpenSSL, cURL, etc
It came as an unpleasant surprise just _how many_ times this artifact
was downloaded. It exploded the bandwidth to a point where the free tier
would no longer be enough, threatening to block other, essential Git for
Windows services.
Let's switch back to using the Build Artifacts of our trusty Azure
Pipeline for the time being.
To avoid unnecessary hammering of the Azure Pipeline artifacts, we use
the GitHub Action `actions/upload-artifact` in the `windows-build` job
and the GitHub Action `actions/download-artifact` in the `windows-test`
and `vs-test` jobs (the latter now depends on `windows-build` for that
reason, too).
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit b0a5a12a60 ("unpack-trees: allow check_updates() to work on a
different index", 2020-03-27) allowed check_updates() to work on a
different index, but it called get_progress() which was hardcoded to
work on o->result much like check_updates() had been. Update it to also
accept an index parameter and have check_updates() pass that parameter
along so that both are working on the same index.
Noticed-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach codepaths that show progress meter to also use the
start_progress() and the stop_progress() calls as a "region" to be
traced.
* es/trace-log-progress:
trace2: log progress time and throughput
Code cleanup and typofixes
* ds/bloom-cleanup:
completion: offer '--(no-)patch' among 'git log' options
bloom: use num_changes not nr for limit detection
bloom: de-duplicate directory entries
Documentation: changed-path Bloom filters use byte words
bloom: parse commit before computing filters
test-bloom: fix usage typo
bloom: fix whitespace around tab length
"git fsck" ensures that the paths recorded in tree objects are
sorted and without duplicates, but it failed to notice a case where
a blob is followed by entries that sort before a tree with the same
name. This has been corrected.
* rs/fsck-duplicate-names-in-trees:
fsck: report non-consecutive duplicate names in trees
"git p4" learned to recover from a (broken) state where a directory
and a file are recorded at the same path in the Perforce repository
the same way as their clients do.
* ao/p4-d-f-conflict-recover:
git-p4: recover from inconsistent perforce history
"rebase -i" segfaulted when rearranging a sequence that has a
fix-up that applies another fix-up (which may or may not be a
fix-up of yet another step).
* js/rebase-autosquash-double-fixup-fix:
rebase --autosquash: fix a potential segfault
"git bisect replay" had trouble with input files when they used
CRLF line ending, which has been corrected.
* cw/bisect-replay-with-dos:
bisect: allow CRLF line endings in "git bisect replay" input
ccd469450a (date.c: switch to reentrant {gm,local}time_r, 2019-11-28)
removes the only gmtime() call we had and moves to gmtime_r() which
doesn't have the same portability problems.
Remove the compat gmtime code since it is no longer needed, and confirm
by successfull running t4212 in FreeBSD 9.3 amd64 (the oldest I could
get a hold off).
Further work might be needed to ensure 32bit time_t systems (like FreeBSD
i386) will handle correctly the overflows tested in t4212, but that is
orthogonal to this change, and it doesn't change the current behaviour
as neither gmtime() or gmtime_r() will ever return NULL on those systems
because time_t is unsigned.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With a rename/rename(1to2) conflict, we attempt to do a three-way merge
of the file contents, so that the correct contents can be placed in the
working tree at both paths. If the file is a binary, however, no
content merging is possible and we should just use the original version
of the file at each of the paths.
Reported-by: Chunlin Zhang <zhangchunlin@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Serving a "git fetch" client over "git://" and "ssh://" protocols
using the on-wire protocol version 2 was buggy on the server end
when the client needs to make a follow-up request to
e.g. auto-follow tags.
* cc/upload-pack-v2-fetch-fix:
upload-pack: clear filter_options for each v2 fetch command
The code to skip unmerged paths in the index when sparse checkout
is in use would have made out-of-bound access of the in-core index
when the last path was unmerged, which has been corrected.
* ds/sparse-updates-oob-access-fix:
unpack-trees: avoid array out-of-bounds error
Instead of always building all branches at GitHub via Actions,
users can specify which branches to build.
* jk/ci-only-on-selected-branches:
ci: allow per-branch config for GitHub Actions
Teach "am", "commit", "merge" and "rebase", when they are run with
the "--quiet" option, to pass "--quiet" down to "gc --auto".
* jc/auto-gc-quiet:
auto-gc: pass --quiet down from am, commit, merge and rebase
auto-gc: extract a reusable helper from "git fetch"
Minor in-code comments and documentation updates around credential
API.
* cb/credential-doc-fixes:
credential: document protocol updates
credential: update gitcredentials documentation
credential: correct order of parameters for credential_match
credential: update description for credential_from_url_gently
The object walk with object filter "--filter=tree:0" can now take
advantage of the pack bitmap when available.
* tb/bitmap-walk-with-tree-zero-filter:
pack-bitmap: pass object filter to fill-in traversal
pack-bitmap.c: support 'tree:0' filtering
pack-bitmap.c: make object filtering functions generic
list-objects-filter: treat NULL filter_options as "disabled"
git-init(1)'s messages is subjected to i18n.
They should be tested by test_i18n* family.
Fix them.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than teaching only one operation, like 'git fetch', how to write
down throughput to traces, we can learn about a wide range of user
operations that may seem slow by adding tooling to the progress library
itself. Operations which display progress are likely to be slow-running
and the kind of thing we want to monitor for performance anyways. By
showing object counts and data transfer size, we should be able to
make some derived measurements to ensure operations are scaling the way
we expect.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Control+BackSpace: Delete word to the left of the cursor.
- Control+Delete : Delete word to the right of the cursor.
Originally introduced by BRIEF and Turbo Vision between 1985 and 1992,
they were adopted by most CUA-Compliant UIs, including those of: OS/2,
Windows, Mac OS, Qt, GTK, Open/Libre Office, Gecko, and GNU Emacs.
In both cases Tk already implements the functionality bound to other key
combination, so we use that.
Graphical examples:
Deleting to the left:
v------ pointer
X_WORD____X
^-----^------ selection
Deleting to the right:
v--------- pointer
X_WORD_X
^--^------ selection
Signed-off-by: Ismael Luceno <ismael.luceno@tttech-auto.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
As diff_tree_oid() computes a diff, it will terminate early if the
total number of changed paths is strictly larger than max_changes.
This includes the directories that changed, not just the file paths.
However, only the file paths are reflected in the resulting diff
queue's "nr" value.
Use the "num_changes" from diffopt to check if the diff terminated
early. This is incredibly important, as it can result in incorrect
filters! For example, the first commit in the Linux kernel repo
reports only 471 changes, but since these are nested inside several
directories they expand to 513 "real" changes, and in fact the
total list of changes is not reported. Thus, the computed filter
for this commit is incorrect.
Demonstrate the subtle difference by using one fewer file change
in the 'get bloom filter for commit with 513 changes' test. Before,
this edited 513 files inside "bigDir" which hit this inequality.
However, dropping the file count by one demonstrates how the
previous inequality was incorrect but the new one is correct.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When computing a changed-path Bloom filter, we need to take the
files that changed from the diff computation and extract the parent
directories. That way, a directory pathspec such as "Documentation"
could match commits that change "Documentation/git.txt".
However, the current code does a poor job of this process. The paths
are added to a hashmap, but we do not check if an entry already
exists with that path. This can create many duplicate entries and
cause the filter to have a much larger length than it should. This
means that the filter is more sparse than intended, which helps the
false positive rate, but wastes a lot of space.
Properly use hashmap_get() before hashmap_add(). Also be sure to
include a comparison function so these can be matched correctly.
This has an effect on a test in t0095-bloom.sh. This makes sense,
there are ten changes inside "smallDir" so the total number of
paths in the filter should be 11. This would result in 11 * 10 bits
required, and with 8 bits per byte, this results in 14 bytes.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In Documentation/technical/commit-graph-format.txt, the definition
of the BIDX chunk specifies the length is a number of 8-byte words.
During development we discovered that using 8-byte words in the
Murmur3 hash algorithm causes issues with big-endian versus little-
endian machines. Thus, the hash algorithm was adapted to work on a
byte-by-byte basis. However, this caused a change in the definition
of a "word" in bloom.h. Now, a "word" is a single byte, which allows
filters to be as small as two bytes. These length-two filters are
demonstrated in t0095-bloom.sh, and a larger filter of length 25 is
demonstrated as well.
The original point of using 8-byte words was for alignment reasons.
It also presented opportunities for extremely sparse Bloom filters
when there were a small number of changes at a commit, creating a
very low false-positive rate. However, modifying the format at this
point is unlikely to be a valuable exercise. Also, this use of
single-byte granularity does present opportunities to save space.
It is unclear if 8-byte alignment of the filters would present any
meaningful performance benefits.
Modify the format document to reflect reality.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When computing changed-path Bloom filters for a commit, we need to
know if the commit has a parent or not. If the commit is not parsed,
then its parent pointer will be NULL.
As far as I can tell, the only opportunity to reach this code
without parsing the commit is inside "test-tool bloom
get_filter_for_commit" but it is best to be safe.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tree entries are sorted in path order, meaning that directory names get
a slash ('/') appended implicitly. Git fsck checks if trees contains
consecutive duplicates, but due to that ordering there can be
non-consecutive duplicates as well if one of them is a directory and the
other one isn't. Such a tree cannot be fully checked out.
Find these duplicates by recording candidate file names on a stack and
check candidate directory names against that stack to find matches.
Suggested-by: Brandon Williams <bwilliamseng@gmail.com>
Original-test-by: Brandon Williams <bwilliamseng@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Reviewed-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perforce allows you commit files and directories with the same name,
so you could have files //depot/foo and //depot/foo/bar both checked
in. A p4 sync of a repository in this state fails. Deleting one of
the files recovers the repository.
When this happens we want git-p4 to recover in the same way as
perforce.
Note that Perforce has this change in their 2017.1 version:
Bugs fixed in 2017.1
#1489051 (Job #2170) **
Submitting a file with the same name as an existing depot
directory path (or vice versa) will now be rejected.
so people hopefully will not creating damaged Perforce repos
anymore, but "git p4" needs to be able to interact with already
corrupt ones.
Signed-off-by: Andrew Oakley <andrew@adoakley.name>
Reviewed-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When selecting a batch of pack-files to repack in the "git
multi-pack-index repack" command, Git should respect the
repack.packKeptObjects config option. When false, this option says that
the pack-files with an associated ".keep" file should not be repacked.
This config value is "false" by default.
There are two cases for selecting a batch of objects. The first is the
case where the input batch-size is zero, which specifies "repack
everything". The second is with a non-zero batch size, which selects
pack-files using a greedy selection criteria. Both of these cases are
updated and tested.
Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the "repack" subcommand of "git multi-pack-index" command
creates new packfile(s), it does not call the "git repack"
command but instead directly calls the "git pack-objects"
command, and the configuration variables meant for the "git
repack" command, like "repack.usedaeltabaseoffset", are ignored.
Check the configuration variables used by "git repack" ourselves
in "git multi-index-pack" and pass the corresponding options to
underlying "git pack-objects".
Note that `repack.writeBitmaps` configuration is ignored, as the
pack bitmap facility is useful only with a single packfile.
Signed-off-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rearranging the todo list so that the fixups/squashes are reordered
just after the commits they intend to fix up, we use two arrays to
maintain that list: `next` and `tail`.
The idea is that `next[i]`, if set to a non-negative value, contains the
index of the item that should be rearranged just after the `i`th item.
To avoid having to walk the entire `next` chain when appending another
fixup/squash, we also store the end of the `next` chain in `tail[i]`.
The logic we currently use to update these array items is based on the
assumption that given a fixup/squash item at index `i`, we just found
the index `i2` indicating the first item in that fixup chain.
However, as reported by Paul Ganssle, that need not be true: the special
form `fixup! <commit-hash>` is allowed to point to _another_ fixup
commit in the middle of the fixup chain.
Example:
* 0192a To fixup
* 02f12 fixup! To fixup
* 03763 fixup! To fixup
* 04ecb fixup! 02f12
Note how the fourth commit targets the second commit, which is already a
fixup that targets the first commit.
Previously, we would update `next` and `tail` under our assumption that
every `fixup!` commit would find the start of the `fixup!`/`squash!`
chain. This would lead to a segmentation fault because we would actually
end up with a `next[i]` pointing to a `fixup!` but the corresponding
`tail[i]` pointing nowhere, which would the lead to a segmentation
fault.
Let's fix this by _inserting_, rather than _appending_, the item. In
other words, if we make a given line successor of another line, we do
not simply forget any previously set successor of the latter, but make
it a successor of the former.
In the above example, at the point when we insert 04ecb just after
02f12, 03763 would already be recorded as a successor of 04ecb, and we
now "squeeze in" 04ecb.
To complete the idea, we now no longer assume that `next[i]` pointing to
a line means that `last[i]` points to a line, too. Instead, we extend
the concept of `last` to cover also partial `fixup!`/`squash!` chains,
i.e. chains starting in the middle of a larger such chain.
In the above example, after processing all lines, `last[0]`
(corresponding to 0192a) would point to 03763, which indeed is the end
of the overall `fixup!` chain, and `last[1]` (corresponding to 02f12)
would point to 04ecb (which is the last `fixup!` targeting 02f12, but it
has 03763 as successor, i.e. it is not the end of overall `fixup!`
chain).
Reported-by: Paul Ganssle <paul@ganssle.io>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent change to show files and line numbers of a breakage during
test (only available when running the tests with bash) were hurting
other shells with syntax errors, which has been corrected.
* cb/test-bash-lineno-fix:
t/test_lib: avoid naked bash arrays in file_lineno
The basic test did not honor $TEST_SHELL_PATH setting, which has
been corrected.
* cb/t0000-use-the-configured-shell:
t/t0000-basic: make sure subtests also use TEST_SHELL_PATH
The <stdlib.h> header on NetBSD brings in its own definition of
hmac() function (eek), which conflicts with our own and unrelated
function with the same name. Our function has been renamed to work
around the issue.
* cb/avoid-colliding-with-netbsd-hmac:
builtin/receive-pack: avoid generic function name hmac()
"git restore --staged --worktree" now defaults to take the contents
out of "HEAD", instead of erring out.
* es/restore-staged-from-head-by-default:
restore: default to HEAD when combining --staged and --worktree
The coding guideline for shell scripts instructed to refer to a
variable with dollar-sign inside arithmetic expansion to work
around a bug in old versions of dash, which is a thing of the past.
Now we are not forbidden from writing $((var+1)).
* jk/arith-expansion-coding-guidelines:
CodingGuidelines: drop arithmetic expansion advice to use "$x"
The sparse-checkout patterns have been forbidden from excluding all
paths, leaving an empty working tree, for a long time. This
limitation has been lifted.
* ds/sparse-allow-empty-working-tree:
sparse-checkout: stop blocking empty workdirs
"git branch" and other "for-each-ref" variants accepted multiple
--sort=<key> options in the increasing order of precedence, but it
had a few breakages around "--ignore-case" handling, and tie-breaking
with the refname, which have been fixed.
* jk/for-each-ref-multi-key-sort-fix:
ref-filter: apply fallback refname sort only after all user sorts
ref-filter: apply --ignore-case to all sorting keys
The samples in the credential documentation has been updated to
make it clear that we depict what would appear in the .git/config
file, by adding appropriate quotes as needed..
* jk/credential-sample-update:
gitcredentials(7): make shell-snippet example more realistic
gitcredentials(7): clarify quoting of helper examples
With the recent tightening of the code that is used to parse
various parts of a URL for use in the credential subsystem, a
hand-edited credential-store file causes the credential helper to
die, which is a bit too harsh to the users. Demote the error
behaviour to just ignore and keep using well-formed lines instead.
* cb/credential-store-ignore-bogus-lines:
credential-store: ignore bogus lines from store file
credential-store: document the file format a bit more
In error messages that "git switch" mentions its option to create a
new branch, "-b/-B" options were shown, where "-c/-C" options
should be, which has been corrected.
* dl/switch-c-option-in-error-message:
switch: fix errors and comments related to -c and -C
Because of the request/response model of protocol v2, the
upload_pack_v2() function is sometimes called twice in the same
process, while 'struct list_objects_filter_options filter_options'
was declared as static at the beginning of 'upload-pack.c'.
This made the check in list_objects_filter_die_if_populated(), which
is called by process_args(), fail the second time upload_pack_v2() is
called, as filter_options had already been populated the first time.
To fix that, filter_options is not static any more. It's now owned
directly by upload_pack(). It's now also part of 'struct
upload_pack_data', so that it's owned indirectly by upload_pack_v2().
In the long term, the goal is to also have upload_pack() use
'struct upload_pack_data', so adding filter_options to this struct
makes more sense than to have it owned directly by upload_pack_v2().
This fixes the first of the 2 bugs documented by d0badf8797
(partial-clone: demonstrate bugs in partial fetch, 2020-02-21).
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The loop in warn_conflicted_path() that checks for the count of
entries with the same path uses "i+count" for the array
entry. However, the loop only verifies that the value of count is
below the array size. Fix this by adding i to the condition.
I hit this condition during a test of the in-tree sparse-checkout
feature, so it is exercised by the end of the series.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
[jc: readability fix]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We advertise that the bisect log can be corrected in your editor
before being fed to "git bisect replay", but some editors may
turn the line endings to CRLF.
Update the parser of the input lines so that the CR at the end of
the line gets ignored.
Were anyone to intentionally be using terms/revs with embedded CRs,
replaying such bisects will no longer work with this change. I suspect
that this is incredibly rare.
Signed-off-by: Christopher Warrington <chwarr@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert submodule subcommand 'set-url' to a builtin. Port 'set-url' to
'submodule--helper.c' and call the latter via 'git-submodule.sh'.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Occasionally a failure a user is seeing may be related to a specific
hook which is being run, perhaps without the user realizing. While the
contents of hooks can be sensitive - containing user data or process
information specific to the user's organization - simply knowing that a
hook is being run at a certain stage can help us to understand whether
something is going wrong.
Without a definitive list of hook names within the code, we compile our
own list from the documentation. This is likely prone to bitrot, but
designing a single source of truth for acceptable hooks is too much
overhead for this small change to the bugreport tool.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* We need a `final_new_line` to make our source code as text file, per
POSIX and C specification.
* `bloom_filters` should be limited to interal linkage only
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document protocol changes after CVE-2020-11008, including the removal of
references to the override of attributes which is no longer recommended
after CVE-2020-5260 and that might be removed in the future.
While at it do some improvements for clarity and consistency.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify the expected effect of all attributes and how the helpers
are expected to handle them and the context where they operate.
While at it, space the descriptions for clarity, and add a paragraph
mentioning the early termination in the list processing of helpers,
to complement the one about the special "quit" attribute.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
662f9cf154 (tests: when run in Bash, annotate test failures with file
name/line number, 2020-04-11), introduces a way to report the location
(file:lineno) of a failed test case by traversing the bash callstack.
The implementation requires bash and uses shell arrays and is therefore
protected by a guard but NetBSD sh will still have to parse the function
and therefore will result in:
** t0000-basic.sh ***
./test-lib.sh: 681: Syntax error: Bad substitution
Enclose the bash specific code inside an eval to avoid parsing errors in
the same way than 5826b7b595 (test-lib: check Bash version for '-x'
without using shell arrays, 2019-01-03)
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
3f824e91c8 (t/Makefile: introduce TEST_SHELL_PATH, 2017-12-08) allows for
setting a shell for running the tests, but the generated subtests weren't
updated.
Correct that and while at it update it to use write_script.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Depending on the workflows of individual developers, it can either be
convenient or annoying that our GitHub Actions CI jobs are run on every
branch. As an example of annoying: if you carry many half-finished
work-in-progress branches and rebase them frequently against master,
you'd get tons of failure reports that aren't interesting (not to
mention the wasted CPU).
This commit adds a new job which checks a special branch within the
repository for CI config, and then runs a shell script it finds there to
decide whether to skip the rest of the tests. The default will continue
to run tests for all refs if that branch or script is missing.
There have been a few alternatives discussed:
One option is to carry information in the commit itself about whether it
should be tested, either in the tree itself (changing the workflow YAML
file) or in the commit message (a "[skip ci]" flag or similar). But
these are frustrating and error-prone to use:
- you have to manually apply them to each branch that you want to mark
- it's easy for them to leak into other workflows, like emailing patches
We could likewise try to get some information from the branch name. But
that leads to debates about whether the default should be "off" or "on",
and overriding still ends up somewhat awkward. If we default to "on",
you have to remember to name your branches appropriately to skip CI. And
if "off", you end up having to contort your branch names or duplicate
your pushes with an extra refspec.
By comparison, this commit's solution lets you specify your config once
and forget about it, and all of the data is off in its own ref, where it
can be changed by individual forks without touching the main tree.
There were a few design decisions that came out of on-list discussion.
I'll summarize here:
- we could use GitHub's API to retrieve the config ref, rather than a
real checkout (and then just operate on it via some javascript). We
still have to spin up a VM and contact GitHub over the network from
it either way, so it ends up not being much faster. I opted to go
with shell to keep things similar to our other tools (and really
could implement allow-refs in any language you want). This also makes
it easy to test your script locally, and to modify it within the
context of a normal git.git tree.
- we could keep the well-known refname out of refs/heads/ to avoid
cluttering the branch namespace. But that makes it awkward to
manipulate. By contrast, you can just "git checkout ci-config" to
make changes.
- we could assume the ci-config ref has nothing in it except config
(i.e., a branch unrelated to the rest of git.git). But dealing with
orphan branches is awkward. Instead, we'll do our best to efficiently
check out only the ci/config directory using a shallow partial clone,
which allows your ci-config branch to be just a normal branch, with
your config changes on top.
- we could provide a simpler interface, like a static list of ref
patterns. But we can't get out of spinning up a whole VM anyway, so
we might as well use that feature to make the config as flexible as
possible. If we add more config, we should be able to reuse our
partial-clone to set more outputs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These commands take the --quiet option for their own operation, but
they forget to pass the option down when they invoke "git gc --auto"
internally.
Teach them to do so using the run_auto_gc() helper we added in the
previous step.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back in 1991006c (fetch: convert argv_gc_auto to struct argv_array,
2014-08-16), we taught "git fetch --quiet" to pass the "--quiet"
option down to "gc --auto". This issue, however, is not limited to
"fetch":
$ git grep -e 'gc.*--auto' \*.c
finds hits in "am", "commit", "merge", and "rebase" and these
commands do not pass "--quiet" down to "gc --auto" when they
themselves are told to be quiet.
As a preparatory step, let's introduce a helper function
run_auto_gc(), that the caller can pass a boolean "quiet",
and redo the fix to "git fetch" using the helper.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In two tests introduced by 4fa3f00abb ("fetch-pack: in protocol v2,
in_vain only after ACK", 2020-04-28) and 2f0a093dd6 ("fetch-pack: in
protocol v2, reset in_vain upon ACK", 2020-04-28), the count of objects
downloaded is checked by grepping for a specific message in the packet
trace. However, this is flaky as that specific message may be delivered
over 2 or more packet lines.
Instead, grep over stderr, just like the "fetch creating new shallow
root" test in the same file.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an issue in 'Common Issues' section which addresses the confusion
between performing a 'fetch' and a 'pull'.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitcredentials(7) already mentions several possible invocations that one
can use as the value for credential.helper. However, many people are
not aware that there are other options than a simple credential helper
name, so let's place some explanatory text in the documentation for
credential.helper as well.
We still refer the user to gitcredential(7) for additional explanations
and helpful examples.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add issue in 'Common Issues' section which addresses the problem of
Git tracking files/paths mentioned in '.gitignore'.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In documentation pertaining to autostash behavior, we refer to the
"stash reflog". This description is too low-level as the reflog refers
to an implementation detail of how the stash works and, for end-users,
they do not need to be aware of this at all.
Change references of "stash reflog" to "stash list", which should
provide more accessible terminology for end-users.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The same as js/partial-urlmatch-2.17, built on more recent codebase
to avoid unnecessary merge conflicts.
* js/partial-urlmatch:
credential: handle `credential.<partial-URL>.<key>` again
credential: optionally allow partial URLs in credential_from_url_gently()
Recent updates broke parsing of "credential.<url>.<key>" where
<url> is not a full URL (e.g. [credential "https://"] helper = ...)
stopped working, which has been corrected.
* js/partial-urlmatch-2.17:
credential: handle `credential.<partial-URL>.<key>` again
credential: optionally allow partial URLs in credential_from_url_gently()
credential: fix grammar
Some of the files commit-graph subsystem keeps on disk did not
correctly honor the core.sharedRepository settings and some were
left read-write.
* tb/commit-graph-perm-bits:
commit-graph.c: make 'commit-graph-chain's read-only
commit-graph.c: ensure graph layers respect core.sharedRepository
commit-graph.c: write non-split graphs as read-only
lockfile.c: introduce 'hold_lock_file_for_update_mode'
tempfile.c: introduce 'create_tempfile_mode'
The approxidate parser learns to parse seconds with fraction.
* dd/iso-8601-updates:
date.c: allow compact version of ISO-8601 datetime
date.c: skip fractional second part of ISO-8601
date.c: validate and set time in a helper function
date.c: s/is_date/set_date/
Update the parser used for credential.<URL>.<variable>
configuration, to handle <URL>s with '/' in them correctly.
* bc/wildcard-credential:
credential: fix matching URLs with multiple levels in path
By default, files are restored from the index for --worktree, and from
HEAD for --staged. When --worktree and --staged are combined, --source
must be specified to disambiguate the restore source[1], thus making it
cumbersome to restore a file in both the worktree and the index.
However, HEAD is also a reasonable default for --worktree when combined
with --staged, so make it the default anytime --staged is used (whether
combined with --worktree or not).
[1]: Due to an oversight, the --source requirement, though documented,
is not actually enforced.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fabec2c5c3 (builtin/receive-pack: switch to use the_hash_algo, 2019-08-18)
renames hmac_sha1 to hmac, as it was updated to use the hash function used
by git (which won't be sha1 in the future).
hmac() is provided by NetBSD >= 8 libc and therefore conflicts as shown by :
builtin/receive-pack.c:421:13: error: conflicting types for 'hmac'
static void hmac(unsigned char *out,
^~~~
In file included from ./git-compat-util.h:172:0,
from ./builtin.h:4,
from builtin/receive-pack.c:1:
/usr/include/stdlib.h:305:10: note: previous declaration of 'hmac' was here
ssize_t hmac(const char *, const void *, size_t, const void *, size_t, void *,
^~~~
Rename it again to hmac_hash to reflect it will use the git's defined hash
function and avoid the conflict, while at it update a comment to better
describe the HMAC function that was used.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the beginning in 118250728e (credential: apply helper config,
2011-12-10), the declaration for that function used a different order
than the implementation.
All callers use the same order than the implementation, so update
the declaration in credential.h to match.
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
c44088ecc4 (credential: treat URL without scheme as invalid, 2020-04-18)
changes the implementation for this function to return -1 if protocol is
missing.
Update blurb to match implementation.
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes a bitmap traversal still has to walk some commits manually,
because those commits aren't included in the bitmap packfile (e.g., due
to a push or commit since the last full repack). If we're given an
object filter, we don't pass it down to this traversal. It's not
necessary for correctness because the bitmap code has its own filters to
post-process the bitmap result (which it must, to filter out the objects
that _are_ mentioned in the bitmapped packfile).
And with blob filters, there was no performance reason to pass along
those filters, either. The fill-in traversal could omit them from the
result, but it wouldn't save us any time to do so, since we'd still have
to walk each tree entry to see if it's a blob or not.
But now that we support tree filters, there's opportunity for savings. A
tree:depth=0 filter means we can avoid accessing trees entirely, since
we know we won't them (or any of the subtrees or blobs they point to).
The new test in p5310 shows this off (the "partial bitmap" state is one
where HEAD~100 and its ancestors are all in a bitmapped pack, but
HEAD~100..HEAD are not). Here are the results (run against linux.git):
Test HEAD^ HEAD
-------------------------------------------------------------------------------------------------
[...]
5310.16: rev-list with tree filter (partial bitmap) 0.19(0.17+0.02) 0.03(0.02+0.01) -84.2%
The absolute number of savings isn't _huge_, but keep in mind that we
only omitted 100 first-parent links (in the version of linux.git here,
that's 894 actual commits). In a more pathological case, we might have a
much larger proportion of non-bitmapped commits. I didn't bother
creating such a case in the perf script because the setup is expensive,
and this is plenty to show the savings as a percentage.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous patch, we made it easy to define other filters that
exclude all objects of a certain type. Use that in order to implement
bitmap-level filtering for the '--filter=tree:<n>' filter when 'n' is
equal to 0.
The general case is not helped by bitmaps, since for values of 'n > 0',
the object filtering machinery requires a full-blown tree traversal in
order to determine the depth of a given tree. Caching this is
non-obvious, too, since the same tree object can have a different depth
depending on the context (e.g., a tree was moved up in the directory
hierarchy between two commits).
But, the 'n = 0' case can be helped, and this patch does so. Running
p5310.11 in this tree and on master with the kernel, we can see that
this case is helped substantially:
Test master this tree
--------------------------------------------------------------------------------
5310.11: rev-list count with tree:0 10.68(10.39+0.27) 0.06(0.04+0.01) -99.4%
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 4f3bd5606a (pack-bitmap: implement BLOB_NONE filtering, 2020-02-14),
filtering support for bitmaps was added for the 'LOFC_BLOB_NONE' filter.
In the future, we would like to add support for filters that behave as
if they exclude a certain type of object, for e.g., the tree depth
filter with depth 0.
To prepare for this, make some of the functions used for filtering more
generic, such as 'find_tip_blobs' and 'filter_bitmap_blob_none' so that
they can work over arbitrary object types.
To that end, create 'find_tip_objects' and
'filter_bitmap_exclude_type', and redefine the aforementioned functions
in terms of those.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most callers, we have an actual list_objects_filter_options struct,
and if no filtering is desired its "choice" element will be
LOFC_DISABLED. However, some code may have only a pointer to such a
struct which may be NULL (because _their_ callers didn't care about
filtering, either). Rather than forcing them to handle this explicitly
like:
if (filter_options)
traverse_commit_list_filtered(filter_options, revs,
show_commit, show_object,
show_data, NULL);
else
traverse_commit_list(revs, show_commit, show_object,
show_data);
let's just treat a NULL filter_options the same as LOFC_DISABLED. We
only need a small change, since that option struct is converted into a
real filter only in the "init" function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A fuzzer running on the entry point provided by fuzz-commit-graph.c
revealed a memory leak when parse_commit_graph() creates a struct
bloom_filter_settings and then returns early due to error. Fix that
error by always freeing that struct first (if it exists) before
returning early due to error.
While making that change, I also noticed another possible memory leak -
when the BLOOMDATA chunk is provided but not BLOOMINDEXES. Also fix that
error.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 9e468334b4 (ref-filter: fallback on alphabetical comparison,
2015-10-30) taught ref-filter's sort to fallback to comparing refnames.
But it did it at the wrong level, overriding the comparison result for a
single "--sort" key from the user, rather than after all sort keys have
been exhausted.
This worked correctly for a single "--sort" option, but not for multiple
ones. We'd break any ties in the first key with the refname and never
evaluate the second key at all.
To make matters even more interesting, we only applied this fallback
sometimes! For a field like "taggeremail" which requires a string
comparison, we'd truly return the result of strcmp(), even if it was 0.
But for numerical "value" fields like "taggerdate", we did apply the
fallback. And that's why our multiple-sort test missed this: it uses
taggeremail as the main comparison.
So let's start by adding a much more rigorous test. We'll have a set of
commits expressing every combination of two tagger emails, dates, and
refnames. Then we can confirm that our sort is applied with the correct
precedence, and we'll be hitting both the string and value comparators.
That does show the bug, and the fix is simple: moving the fallback to
the outer compare_refs() function, after all ref_sorting keys have been
exhausted.
Note that in the outer function we don't have an "ignore_case" flag, as
it's part of each individual ref_sorting element. It's debatable what
such a fallback should do, since we didn't use the user's keys to match.
But until now we have been trying to respect that flag, so the
least-invasive thing is to try to continue to do so. Since all callers
in the current code either set the flag for all keys or for none, we can
just pull the flag from the first key. In a hypothetical world where the
user really can flip the case-insensitivity of keys separately, we may
want to extend the code to distinguish that case from a blanket
"--ignore-case".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the ref-filter users (for-each-ref, branch, and tag) take an
--ignore-case option which makes filtering and sorting case-insensitive.
However, this option was applied only to the first element of the
ref_sorting list. So:
git for-each-ref --ignore-case --sort=refname
would do what you expect, but:
git for-each-ref --ignore-case --sort=refname --sort=taggername
would sort the primary key (taggername) case-insensitively, but sort the
refname case-sensitively. We have two options here:
- teach callers to set ignore_case on the whole list
- replace the ref_sorting list with a struct that contains both the
list of sorting keys, as well as options that apply to _all_
keys
I went with the first one here, as it gives more flexibility if we later
want to let the users set the flag per-key (presumably through some
special syntax when defining the key; for now it's all or nothing
through --ignore-case).
The new test covers this by sorting on both tagger and subject
case-insensitively, which should compare "a" and "A" identically, but
still sort them before "b" and "B". We'll break ties by sorting on the
refname to give ourselves a stable output (this is actually supposed to
be done automatically, but there's another bug which will be fixed in
the next commit).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the error condition when updating the sparse-checkout leaves
an empty working directory.
This behavior was added in 9e1afb167 (sparse checkout: inhibit empty
worktree, 2009-08-20). The comment was added in a7bc906f2 (Add
explanation why we do not allow to sparse checkout to empty working
tree, 2011-09-22) in response to a "dubious" comment in 84563a624
(unpack-trees.c: cosmetic fix, 2010-12-22).
With the recent "cone mode" and "git sparse-checkout init [--cone]"
command, it is common to set a reasonable sparse-checkout pattern
set of
/*
!/*/
which matches only files at root. If the repository has no such files,
then their "git sparse-checkout init" command will fail.
Now that we expect this to be a common pattern, we should not have the
commands fail on an empty working directory. If it is a confusing
result, then the user can recover with "git sparse-checkout disable"
or "git sparse-checkout set". This is especially simple when using cone
mode.
Reported-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The advice to use "$x" rather than "x" in arithmetric expansion was
working around a dash bug fixed in 0.5.4. Even Debian oldstable has
0.5.8 these days. And in the meantime, we've added almost two dozen
instances of the "x" form which you can find with:
git grep '$(([a-z]'
and nobody seems to have complained. Let's declare this workaround
obsolete and simplify our style guide.
Helped-by: Danh Doan <congdanhqx@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the added checks for invalid URLs in credentials, any locally
modified store files which might have empty lines or even comments
were reported[1] failing to parse as valid credentials.
Instead of doing a hard check for credentials, do a soft one and
therefore avoid the reported fatal error.
While at it add tests for all known corruptions that are currently
ignored to keep track of them and avoid the risk of regressions.
[1] https://stackoverflow.com/a/61420852/5005936
Reported-by: Dirk <dirk@ed4u.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's typical to find Markdown documentation alongside source code, and
having better context for documentation changes is useful; see also
commit 69f9c87d4 (userdiff: add support for Fountain documents,
2015-07-21).
The pattern is based on the CommonMark specification 0.29, section 4.2
<https://spec.commonmark.org/> but doesn't match empty headings, as
seeing them in a hunk header is unlikely to be useful.
Only ATX headings are supported, as detecting setext headings would
require printing the line before a pattern matches, or matching a
multiline pattern. The word-diff pattern is the same as the pattern for
HTML, because many Markdown parsers accept inline HTML.
Signed-off-by: Ash Holland <ash@sorrel.sh>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The upload-pack protocol v2 gave up too early before finding a
common ancestor, resulting in a wasteful fetch from a fork of a
project. This has been corrected to match the behaviour of v0
protocol.
* jt/v2-fetch-nego-fix:
fetch-pack: in protocol v2, reset in_vain upon ACK
fetch-pack: in protocol v2, in_vain only after ACK
fetch-pack: return enum from process_acks()
Error and verbose trace messages from "git push" did not redact
credential material embedded in URLs.
* js/anonymise-push-url-in-errors:
push: anonymize URLs in error messages and warnings
The "bugreport" tool.
* es/bugreport:
bugreport: drop extraneous includes
bugreport: add compiler info
bugreport: add uname info
bugreport: gather git version and build info
bugreport: add tool to generate debugging info
help: move list_config_help to builtin/help
Incompatible options "--root" and "--fork-point" of "git rebase"
have been marked and documented as being incompatible.
* en/rebase-root-and-fork-point-are-incompatible:
rebase: display an error if --root and --fork-point are both provided
Recent update to Homebrew used by macOS folks breaks build by
moving gettext library and necessary headers.
* ds/build-homebrew-gettext-fix:
macOS/brew: let the build find gettext headers/libraries/msgfmt
Compilation fix.
* dd/sparse-fixes:
progress.c: silence cgcc suggestion about internal linkage
graph.c: limit linkage of internal variable
compat/regex: move stdlib.h up in inclusion chain
test-parse-pathspec-file.c: s/0/NULL/ for pointer type
The multi-pack-index left mmapped file descriptors open when it
does not have to.
* ds/multi-pack-index:
multi-pack-index: close file descriptor after mmap
"git blame" learns to take advantage of the "changed-paths" Bloom
filter stored in the commit-graph file.
* ds/blame-on-bloom:
test-bloom: check that we have expected arguments
test-bloom: fix some whitespace issues
blame: drop unused parameter from maybe_changed_path
blame: use changed-path Bloom filters
tests: write commit-graph with Bloom filters
revision: complicated pathspecs disable filters
Introduce an extension to the commit-graph to make it efficient to
check for the paths that were modified at each commit using Bloom
filters.
* gs/commit-graph-path-filter:
bloom: ignore renames when computing changed paths
commit-graph: add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag
t4216: add end to end tests for git log with Bloom filters
revision.c: add trace2 stats around Bloom filter usage
revision.c: use Bloom filters to speed up path based revision walks
commit-graph: add --changed-paths option to write subcommand
commit-graph: reuse existing Bloom filters during write
commit-graph: write Bloom filters to commit graph file
commit-graph: examine commits by generation number
commit-graph: examine changed-path objects in pack order
commit-graph: compute Bloom filters for changed paths
diff: halt tree-diff early after max_changes
bloom.c: core Bloom filter implementation for changed paths.
bloom.c: introduce core Bloom filter constructs
bloom.c: add the murmur3 hash implementation
commit-graph: define and use MAX_NUM_CHUNKS
The commit-graph code exhausted file descriptors easily when it
does not have to.
* tb/commit-graph-fd-exhaustion-fix:
commit-graph: close descriptors after mmap
commit-graph.c: gracefully handle file descriptor exhaustion
t/test-lib.sh: make ULIMIT_FILE_DESCRIPTORS available to tests
commit-graph.c: don't use discarded graph_name in error
Fix in-core inconsistency after fetching into a shallow repository
that broke the code to write out commit-graph.
* tb/reset-shallow:
shallow.c: use '{commit,rollback}_shallow_file'
t5537: use test_write_lines and indented heredocs for readability
Tighten "git mailinfo" to notice and error out when decoded result
contains NUL in it.
* dd/mailinfo-with-nul:
mailinfo: disallow NUL character in mail's header
mailinfo.c: avoid strlen on strings that can contains NUL
t4254: merge 2 steps of a single test
Test clean-up.
* dl/test-must-fail-fixes-4:
t9819: don't use test_must_fail with p4
t9164: use test_must_fail only on git commands
t9160: use test_path_is_missing()
t9141: use test_path_is_missing()
t7508: don't use `test_must_fail test_cmp`
t7408: replace incorrect uses of test_must_fail
t6030: use test_path_is_missing()
The build procedure did not use the libcurl library and its include
files correctly for a custom-built installation.
* jk/build-with-right-curl:
Makefile: avoid running curl-config unnecessarily
Makefile: use curl-config --cflags
Makefile: avoid running curl-config multiple times
Fix alignment issues that were likely introduced due to an editor
using tab lengths of 4 instead of 8.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's an example of using your own bit of shell to act as a credential
helper, but it's not very realistic:
- It's stupid to hand out your secret password to _every_ host. In the
real world you'd use the config-matcher to limit it to a particular
host.
- We never provided a username. We can easily do that in another config
option (you can do it in the helper, too, but this is much more
readable).
- We were sending the secret even for store/erase operations. This
is OK because Git would just ignore it, but a real system would
probably be unlocking a password store, which you wouldn't want to do
more than necessary.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We give several helper config examples, but don't make clear that these
are raw values. It's up to the user to add the appropriate quoting to
put them into a config file (either by running with "git config" and
quoting against the shell, or by adding double-quotes as appropriate
within the git-config file).
Let's flesh them out as full config blocks, which makes the syntax more
clear (and makes it possible for people to just cut-and-paste them as a
starting point). I added double-quotes to any values larger than a
single word. That isn't strictly necessary in all cases, but it
sidesteps explaining the rules about exactly when you need to quote a
value.
The existing quotes can be converted to single-quotes in one instance,
and backslash-esccaped in the other. I also swapped out backticks for
our preferred $().
Reported-by: douglas.fuller@gmail.com
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In previous patches, the functions 'commit_shallow_file' and
'rollback_shallow_file' were introduced to reset the shallowness
validity checks on a repository after potentially modifying
'.git/shallow'.
These functions can be made safer by wrapping the 'struct lockfile *' in
a new type, 'shallow_lock', so that they cannot be called with a raw
lock (and potentially misused by other code that happens to possess a
lockfile, but has nothing to do with shallowness).
This patch introduces that type as a thin wrapper around 'struct
lockfile', and updates the two aforementioned functions and their
callers to use it.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'commit_shallow_file()' and 'rollback_shallow_file()' were
introduced, they did not have a documenting comment, when they could
have benefited from one.
Add a brief note about what these functions do, and make a special note
that they reset stat-validity checks.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many functions in commit.h that are more related to shallow
repositories than they are to any sort of generic commit machinery.
Likely this began when there were only a few shallow-related functions,
and commit.h seemed a reasonable enough place to put them.
But, now there are a good number of shallow-related functions, and
placing them all in 'commit.h' doesn't make sense.
This patch extracts a 'shallow.h', which takes all of the declarations
from 'commit.h' for functions which already exist in 'shallow.c'. We
will bring the remaining shallow-related functions defined in 'commit.c'
in a subsequent patch.
For now, move only the ones that already are implemented in 'shallow.c',
and update the necessary includes.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch, some functions will be moved from 'commit.c' to have
prototypes in a new 'shallow.h' and their implementations in
'shallow.c'.
Three functions in 'commit.c' use 'commit_graft_pos()' (they are
'register_commit_graft()', 'lookup_commit_graft()', and
'unregister_shallow()'). The first two of these will stay in 'commit.c',
but the latter will move to 'shallow.c', and thus needs
'commit_graft_pos' to be non-static.
Prepare for that by making 'commit_graft_pos' non-static so that it can
be called from both 'commit.c' and 'shallow.c'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d787d311db (checkout: split part of it to new command 'switch',
2019-03-29), the `git switch` command was created by extracting the
common functionality of cmd_checkout() in checkout_main(). However, in
b7b5fce270 (switch: better names for -b and -B, 2019-03-29), the branch
creation and force creation options for 'switch' were changed to -c and
-C, respectively. As a result of this, error messages and comments that
previously referred to `-b` and `-B` became invalid for `git switch`.
For error messages that refer to `-b` and `-B`, use a format string
instead so that `-c` and `-C` can be printed when `git switch` is
invoked.
Reported-by: Robert Simpson
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git update-ref --stdin" learned a handful of new verbs to let the
user control ref update transactions more explicitly, which helps
as an ingredient to implement two-phase commit-style atomic
ref-updates across multiple repositories.
* ps/transactional-update-ref-stdin:
update-ref: implement interactive transaction handling
update-ref: read commands in a line-wise fashion
update-ref: move transaction handling into `update_refs_stdin()`
update-ref: pass end pointer instead of strbuf
update-ref: drop unused argument for `parse_refname`
update-ref: organize commands in an array
strbuf: provide function to append whole lines
git-update-ref.txt: add missing word
refs: fix segfault when aborting empty transaction
The directory traversal code had redundant recursive calls which
made its performance characteristics exponential with respect to
the depth of the tree, which was corrected.
* en/fill-directory-exponential:
completion: fix 'git add' on paths under an untracked directory
Fix error-prone fill_directory() API; make it only return matches
dir: replace double pathspec matching with single in treat_directory()
dir: include DIR_KEEP_UNTRACKED_CONTENTS handling in treat_directory()
dir: replace exponential algorithm with a linear one
dir: refactor treat_directory to clarify control flow
dir: fix confusion based on variable tense
dir: fix broken comment
dir: consolidate treat_path() and treat_one_path()
dir: fix simple typo in comment
t3000: add more testcases testing a variety of ls-files issues
t7063: more thorough status checking
"sparse-checkout" UI improvements.
* en/sparse-checkout:
sparse-checkout: provide a new reapply subcommand
unpack-trees: failure to set SKIP_WORKTREE bits always just a warning
unpack-trees: provide warnings on sparse updates for unmerged paths too
unpack-trees: make sparse path messages sound like warnings
unpack-trees: split display_error_msgs() into two
unpack-trees: rename ERROR_* fields meant for warnings to WARNING_*
unpack-trees: move ERROR_WOULD_LOSE_SUBMODULE earlier
sparse-checkout: use improved unpack_trees porcelain messages
sparse-checkout: use new update_sparsity() function
unpack-trees: add a new update_sparsity() function
unpack-trees: pull sparse-checkout pattern reading into a new function
unpack-trees: do not mark a dirty path with SKIP_WORKTREE
unpack-trees: allow check_updates() to work on a different index
t1091: make some tests a little more defensive against failures
unpack-trees: simplify pattern_list freeing
unpack-trees: simplify verify_absent_sparse()
unpack-trees: remove unused error type
unpack-trees: fix minor typo in comment
Update the CI configuration to use GitHub Actions, retiring the one
based on Azure Pipelines.
* dd/ci-swap-azure-pipelines-with-github-actions:
ci: let GitHub Actions upload failed tests' directories
ci: add a problem matcher for GitHub Actions
tests: when run in Bash, annotate test failures with file name/line number
ci: retire the Azure Pipelines definition
README: add a build badge for the GitHub Actions runs
ci: configure GitHub Actions for CI/PR
ci: run gem with sudo to install asciidoctor
ci: explicit install all required packages
ci: fix the `jobname` of the `GETTEXT_POISON` job
ci/lib: set TERM environment variable if not exist
ci/lib: allow running in GitHub Actions
ci/lib: if CI type is unknown, show the environment variables
A new CI job to build and run test suite on linux with musl libc
has been added.
* dd/ci-musl-libc:
travis: build and test on Linux with musl libc and busybox
ci/linux32: libify install-dependencies step
ci: refactor docker runner script
ci/linux32: parameterise command to switch arch
ci/lib-docker: preserve required environment variables
ci: make MAKEFLAGS available inside the Docker container in the Linux32 job
The stash entry created by "git rebase --autosquash" to keep the
initial dirty state were discarded by mistake upon "git rebase
--quit", which has been corrected.
* dl/merge-autostash-rebase-quit-fix:
rebase: save autostash entry into stash reflog on --quit
In a previous commit, we made incremental graph layers read-only by
using 'git_mkstemp_mode' with permissions '0444'.
There is no reason that 'commit-graph-chain's should be modifiable by
the user, since they are generated at a temporary location and then
atomically renamed into place.
To ensure that these files are read-only, too, use
'hold_lock_file_for_update_mode' with the same read-only permission
bits, and let the umask and 'adjust_shared_perm' take care of the rest.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Non-layered commit-graphs use 'adjust_shared_perm' to make the
commit-graph file readable (or not) to a combination of the user, group,
and others.
Call 'adjust_shared_perm' for split-graph layers to make sure that these
also respect 'core.sharedRepository'. The 'commit-graph-chain' file
already respects this configuration since it uses
'hold_lock_file_for_update' (which calls 'adjust_shared_perm' eventually
in 'create_tempfile_mode').
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, Git learned 'hold_lock_file_for_update_mode' to
allow the caller to specify the permission bits (prior to further
adjustment by the umask and shared repository permissions) used when
acquiring a temporary file.
Use this in the commit-graph machinery for writing a non-split graph to
acquire an opened temporary file with permissions read-only permissions
to match the split behavior. (In the split case, Git uses
git_mkstemp_mode' for each of the commit-graph layers with permission
bits '0444').
One can notice this discrepancy when moving a non-split graph to be part
of a new chain. This causes a commit-graph chain where all layers have
read-only permission bits, except for the base layer, which is writable
for the current user.
Resolve this discrepancy by using the new
'hold_lock_file_for_update_mode' and passing the desired permission
bits.
Doing so causes some test fallout in t5318 and t6600. In t5318, this
occurs in tests that corrupt a commit-graph file by writing into it. For
these, 'chmod u+w'-ing the file beforehand resolves the issue. The
additional spot in 'corrupt_graph_verify' is necessary because of the
extra 'git commit-graph write' beforehand (which *does* rewrite the
commit-graph file). In t6600, this is caused by copying a read-only
commit-graph file into place and then trying to replace it. For these,
make these files writable.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the patches for CVE-2020-11008, the ability to specify credential
settings in the config for partial URLs got lost. For example, it used
to be possible to specify a credential helper for a specific protocol:
[credential "https://"]
helper = my-https-helper
Likewise, it used to be possible to configure settings for a specific
host, e.g.:
[credential "dev.azure.com"]
useHTTPPath = true
Let's reinstate this behavior.
While at it, increase the test coverage to document and verify the
behavior with a couple other categories of partial URLs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reading a malformed credential URL line and silently ignoring it
does not mean that we support empty lines and/or "# commented" lines
forever. We should document it to avoid confusion.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Those fetching over protocol v2 from linux-next and other kernel
repositories are reporting that v2 often fetches way too much than
needed.
* jn/demote-proto2-from-default:
Revert "fetch: default to protocol version 2"
GNU/Hurd is also among the ones that need the fopen() wrapper.
* jc/gnu-hurd-lets-fread-read-dirs:
config.mak.uname: Define FREAD_READS_DIRECTORIES for GNU/Hurd
"git grep" did not quote a path with unusual character like other
commands (like "git diff", "git status") do, but did quote when run
from a subdirectory, both of which has been corrected.
* mt/grep-cquote-path:
grep: follow conventions for printing paths w/ unusual chars
The "--decorate-refs" and "--decorate-refs-exclude" options "git
log" takes have learned a companion configuration variable
log.excludeDecoration that sits at the lowest priority in the
family.
* ds/log-exclude-decoration-config:
log: add log.excludeDecoration config option
log-tree: make ref_filter_match() a helper method
"git diff-tree --pretty --notes" used to hit an assertion failure,
as it forgot to initialize the notes subsystem.
* tb/diff-tree-with-notes:
diff-tree.c: load notes machinery when required
Allowing the user to split a patch hunk while "git stash -p" does
not work well; a band-aid has been added to make this (partially)
work better.
* js/stash-p-fix:
stash -p: (partially) fix bug concerning split hunks
t3904: fix incorrect demonstration of a bug
Code in builtin/*, i.e. those can only be called from within
built-in subcommands, that implements bulk of a couple of
subcommands have been moved to libgit.a so that they could be used
by others.
* dl/libify-a-few:
Lib-ify prune-packed
Lib-ify fmt-merge-msg
"git push --atomic" used to show failures for refs that weren't
even pushed, which has been corrected.
* jx/atomic-push:
transport-helper: new method reject_atomic_push()
transport-helper: mark failure for atomic push
send-pack: mark failure of atomic push properly
t5543: never report what we do not push
send-pack: fix inconsistent porcelain output
"git diff" in a partial clone learned to avoid lazy loading blob
objects in more casese when they are not needed.
* jt/avoid-prefetch-when-able-in-diff:
diff: restrict when prefetching occurs
diff: refactor object read
diff: make diff_populate_filespec_options struct
promisor-remote: accept 0 as oid_nr in function
"git commit-graph write --expire-time=<timestamp>" did not use the
given timestamp correctly, which has been corrected.
* ds/commit-graph-expiry-fix:
commit-graph: fix buggy --expire-time option
Documentation updates around the "--recurse-submodules" option.
* dr/doc-recurse-submodules:
doc: --recurse-submodules mostly applies to active submodules
doc: be more precise on (fetch|push).recurseSubmodules
doc: explain how to deactivate submodule.recurse completely
doc: document --recurse-submodules for reset and restore
doc: list all commands affected by submodule.recurse
"git log" learns "--[no-]mailmap" as a synonym to "--[no-]use-mailmap"
* jc/log-no-mailmap:
log: give --[no-]use-mailmap a more sensible synonym --[no-]mailmap
clone: reorder --recursive/--recurse-submodules
parse-options: teach "git cmd -h" to show alias as alias
Raise the minimum required version of docbook-xsl package to 1.74,
as 1.74.0 was from late 2008, which is more than 10 years old, and
drop compatibility cruft from our documentation suite.
* ma/doc-discard-docbook-xsl-1.73:
user-manual.conf: don't specify [listingblock]
INSTALL: drop support for docbook-xsl before 1.74
manpage-normal.xsl: fold in manpage-base.xsl
manpage-bold-literal.xsl: stop using git.docbook.backslash
Doc: drop support for docbook-xsl before 1.73.0
Doc: drop support for docbook-xsl before 1.72.0
Doc: drop support for docbook-xsl before 1.71.1
The "git submodule" command did not initialize a few variables it
internally uses and was affected by variable settings leaked from
the environment.
* lx/submodule-clear-variables:
git-submodule.sh: setup uninitialized variables
The custom hash function used by "git fast-import" has been
replaced with the one from hashmap.c, which gave us a nice
performance boost.
* jk/fast-import-use-hashmap:
fast-import: replace custom hash with hashmap.c
The config API made mixed uses of int and size_t types to represent
length of various pieces of text it parsed, which has been updated
to use the correct type (i.e. size_t) throughout.
* jk/config-use-size-t:
config: reject parsing of files over INT_MAX
config: use size_t to store parsed variable baselen
git_config_parse_key(): return baselen as size_t
config: drop useless length variable in write_pair()
parse_config_key(): return subsection len as size_t
remote: drop auto-strlen behavior of make_branch() and make_rewrite()
Validation of push certificate has been made more robust against
timing attacks.
* bc/constant-memequal:
receive-pack: compilation fix
builtin/receive-pack: use constant-time comparison for HMAC value
The code that refreshes the last access and modified time of
on-disk packfiles and loose object files have been updated.
* lr/freshen-file-fix:
freshen_file(): use NULL `times' for implicit current-time
"git rebase" happens to call some hooks meant for "checkout" and
"commit" by this was not a designed behaviour than historical
accident. This has been documented.
* en/rebase-doc-hooks-called-by-accident:
git-rebase.txt: add another hook to the hooks section, and explain more
Document the recommended way to abort a failing test early (e.g. by
exiting a loop), which is to say "return 1".
* jc/doc-test-leaving-early:
t/README: suggest how to leave test early with failure
Various tests have been updated to work around issues found with
shell utilities that come with busybox etc.
* dd/test-with-busybox:
t5703: feed raw data into test-tool unpack-sideband
t4124: tweak test so that non-compliant diff(1) can also be used
t7063: drop non-POSIX argument "-ls" from find(1)
t5616: use rev-parse instead to get HEAD's object_id
t5003: skip conversion test if unzip -a is unavailable
t5003: drop the subshell in test_lazy_prereq
test-lib-functions: test_cmp: eval $GIT_TEST_CMP
t4061: use POSIX compliant regex(7)
Just like 47abd85ba0 (fetch: Strip usernames from url's before storing
them, 2009-04-17) and later 882d49ca5c (push: anonymize URL in status
output, 2016-07-13), and even later c1284b21f2 (curl: anonymize URLs
in error messages and warnings, 2019-03-04) this change anonymizes URLs
(read: strips them of user names and especially passwords) in
user-facing error messages and warnings.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a03b55530a (merge: teach --autostash option, 2020-04-07), the
--autostash option was introduced for `git merge`. Notably, when
`git merge --quit` is run with an autostash entry present, it is saved
into the stash reflog. This is contrasted with the current behaviour of
`git rebase --quit` where the autostash entry is simply just dropped out
of existence.
Adopt the behaviour of `git merge --quit` in `git rebase --quit` and
save the autostash entry into the stash reflog instead of just deleting
it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the usage for `git push` is shown, it includes the following
lines
--recurse-submodules[=(check|on-demand|no)]
control recursive pushing of submodules
which seem to indicate that the argument for --recurse-submodules is
optional. However, we cannot actually run that optiion without an
argument:
$ git push --recurse-submodules
fatal: recurse-submodules missing parameter
Unset PARSE_OPT_OPTARG so that it is clear that this option requires an
argument. Since the parse-options machinery guarantees that an argument
is present now, assume that `arg` is set in the else of
option_parse_recurse_submodules().
Reported-by: Andrew White <andrew.white@audinate.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the codebase, there are many options which use OPTION_CALLBACK in a
plain ol' struct definition. However, we have the OPT_CALLBACK and
OPT_CALLBACK_F macros which are meant to abstract these plain struct
definitions away. These macros are useful as they semantically signal to
developers that these are just normal callback option with nothing fancy
happening.
Replace plain struct definitions of OPTION_CALLBACK with OPT_CALLBACK or
OPT_CALLBACK_F where applicable. The heavy lifting was done using the
following (disgusting) shell script:
#!/bin/sh
do_replacement () {
tr '\n' '\r' |
sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\s*0,\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK(\1,\2,\3,\4,\5,\6)/g' |
sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK_F(\1,\2,\3,\4,\5,\6,\7)/g' |
tr '\r' '\n'
}
for f in $(git ls-files \*.c)
do
do_replacement <"$f" >"$f.tmp"
mv "$f.tmp" "$f"
done
The result was manually inspected and then reformatted to match the
style of the surrounding code. Finally, using
`git grep OPTION_CALLBACK \*.c`, leftover results which were not handled
by the script were manually transformed.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test added by 477dcaddb6 (tests: do not let lazy prereqs inside
`test_expect_*` turn off tracing, 2020-03-26) runs a sub-test script
that traces a test with a lazy prereq, like:
test_have_prereq LAZY && echo trace
That won't work if GIT_TEST_FAIL_PREREQS is set in the environment,
because our have_prereq will report failure, and we won't run the echo
at all.
We could work around this by avoiding the &&-chain, but we can
fix this and any future tests at once by unsetting that variable for our
sub-tests. These are meant to be controlled environments where we test
the test-suite itself; the outer test snippet should be in charge of the
sub-test environment, not whatever mode the user happens to be running
in.
Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the function process_acks() in fetch-pack.c, the variable
received_ack is meant to track that an ACK was received, but it was
never set. This results in negotiation terminating prematurely through
the in_vain counter, when the counter should have been reset upon every
ACK.
Therefore, reset the in_vain counter upon every ACK.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching, Git stops negotiation when it has sent at least
MAX_IN_VAIN (which is 256) "have" lines without having any of them
ACK-ed. But this is supposed to trigger only after the first ACK, as
pack-protocol.txt says:
However, the 256 limit *only* turns on in the canonical client
implementation if we have received at least one "ACK %s continue"
during a prior round. This helps to ensure that at least one common
ancestor is found before we give up entirely.
The code path for protocol v0 observes this, but not protocol v2,
resulting in shorter negotiation rounds but significantly larger
packfiles. Teach the code path for protocol v2 to check this criterion
only after at least one ACK was received.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
process_acks() returns 0, 1, or 2, depending on whether "ready" was
received and if not, whether at least one commit was found to be common.
Replace these magic numbers with a documented enum.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the generic parts of the source files, system headers like
<time.h> and <stdio.h> are supposed to be included indirectly
by including "git-compat-util.h", which manages portability issues.
Drop our explicit inclusions and rely on "cache.h", which includes
"git-compat-util.h".
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--root implies we want to rebase all commits since the beginning of
history. --fork-point means we want to use the reflog of the specified
upstream to find the best common ancestor between <upstream> and
<branch> and only rebase commits since that common ancestor. These
options are clearly contradictory, so throw an error (instead of
segfaulting on a NULL pointer) if both are specified.
Reported-by: Alexander Berg <alexander.berg@atos.net>
Documentation-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
46fd7b3900 ("credential: allow wildcard patterns when matching config",
2020-02-20) introduced support for matching credential helpers using
urlmatch. In doing so, it introduced code to percent-encode the paths
we get from the credential helper so that they could be effectively
matched by the urlmatch code.
Unfortunately, that code had a bug: it percent-encoded the slashes in
the path, resulting in any URL path that contained multiple levels
(i.e., a directory component) not matching.
We are currently the only caller of the percent-encoding code and could
simply change it not to encode slashes. However, we still want to
encode slashes in the username component, so we need to have both
behaviors available.
So instead, let's add a flag to control encoding slashes, which is the
behavior we want here, and use it when calling the code in this case.
Add a test for credential helper URLs using multiple slashes in the
path, which our test suite previously lacked, as well as one ensuring
that we handle usernames with slashes gracefully. Since we're testing
other percent-encoding handling, let's add one for non-ASCII UTF-8
characters as well.
Reported-by: Ilya Tretyakov <it@it3xl.ru>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apparently a recent Homebrew update now installs `gettext` into the
subdirectory /usr/local/opt/gettext/[lib/include].
Sometimes the ci job succeeds:
brew link --force gettext
Linking /usr/local/Cellar/gettext/0.20.1... 179 symlinks created
And sometimes installing the package "gettext" with force-link fails:
brew link --force gettext
Warning: Refusing to link macOS provided/shadowed software: gettext
If you need to have gettext first in your PATH run:
echo 'export PATH="/usr/local/opt/gettext/bin:$PATH"' >> ~/.bash_profile
(And the is not the final word either, since macOS itself says:
The default interactive shell is now zsh.)
Anyway, The latter requires CFLAGS to include /usr/local/opt/gettext/include
and LDFLAGS to include /usr/local/opt/gettext/lib.
Likewise, the `msgfmt` tool is no longer in the `PATH`.
While it is unclear which change is responsible for this breakage (that
most notably only occurs on CI build agents that updated very recently),
https://github.com/Homebrew/homebrew-core/pull/53489 has fixed it.
Nevertheless, let's work around this issue, as there are still quite a
few build agents out there that need some help in this regard: we
explicitly do not call `brew update` in our CI/PR builds anymore.
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use 'hold_lock_file_for_update' (and the '_timeout') variant to
acquire a lock when updating references, the commit-graph file, and so
on.
In particular, the commit-graph machinery uses this to acquire a
temporary file that is used to write a non-split commit-graph. In a
subsequent commit, an issue in the commit-graph machinery produces
graph files that have a different permission based on whether or not
they are part of a multi-layer graph will be addressed.
To do so, the commit-graph machinery will need a version of
'hold_lock_file_for_update' that takes the permission bits from the
caller.
Introduce such a function in this patch for both the
'hold_lock_file_for_update' and 'hold_lock_file_for_update_timeout'
functions, and leave the existing functions alone by inlining their
definitions in terms of the new mode variants.
Note that, like in the previous commit, 'hold_lock_file_for_update_mode'
is not guarenteed to set the given mode, since it may be modified by
both the umask and 'core.sharedRepository'.
Note also that even though the commit-graph machinery only calls
'hold_lock_file_for_update', that this is defined in terms of
'hold_lock_file_for_update_timeout', and so both need an additional mode
parameter here.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch, 'hold_lock_file_for_update' will gain an additional
'mode' parameter to specify permissions for the associated temporary
file.
Since the lockfile.c machinery uses 'create_tempfile' which always
creates a temporary file with global read-write permissions, introduce a
variant here that allows specifying the mode.
Note that the mode given to 'create_tempfile_mode' is not guaranteed to
be written to disk, since it is subject to both the umask and
'core.sharedRepository'.
Arguably, all temporary files should have permission 0444, since they
are likely to be renamed into place and then not written to again. This
is a much larger change than we may want to take on in this otherwise
small patch, so for the time being, make 'create_tempfile' behave as it
has always done by inlining it to 'create_tempfile_mode' with mode set
to '0666'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In Linux with musl libc, we have this inclusion chain:
compat/regex/regex.c:69
`-> compat/regex/regex_internal.h
`-> /usr/include/stdlib.h
`-> /usr/include/features.h
`-> /usr/include/alloca.h
In that inclusion chain, `<features.h>` claims it's _BSD_SOURCE
compatible when it's NOT asked to be either
{_POSIX,_GNU,_XOPEN,_BSD}_SOURCE, or __STRICT_ANSI__.
And, `<stdlib.h>` will include `<alloca.h>` to be compatible with
software written for GNU and BSD. Thus, redefine `alloca` macro,
which was defined before at compat/regex/regex.c:66.
Considering this is only compat code, we've taken from other project,
it's not our business to decide which source should we adhere to.
Include `<stdlib.h>` early to prevent the redefinition of alloca.
This also remove a potential warning about alloca not defined on:
#undef alloca
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't ever refer to the descriptor after mmap-ing it. And keeping it
open means we can run out of descriptors in degenerate cases (e.g.,
thousands of split chain files). Let's close it as soon as possible.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit trailers like "Thanks-to:", "Fixes:", and "Closes:" are fairly
common, but gitweb didn't highlight them like other trailers.
Signed-off-by: Emma Brooks <me@pluvano.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
356aea6 ("doc: move extensions.worktreeConfig to the right place",
2018-11-14) moved the explanation of extension.worktreeConfig from
config.txt to technical/repository-version.txt. However, the former
still contains a reference to the removed paragraph. We could fix it
referencing the gitrepository-layout man page, which contains the moved
explanation. But the git-worktree man page has additional information
and recommendations for the worktree config file, so let's reference it
instead.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the patches for CVE-2020-11008, the ability to specify credential
settings in the config for partial URLs got lost. For example, it used
to be possible to specify a credential helper for a specific protocol:
[credential "https://"]
helper = my-https-helper
Likewise, it used to be possible to configure settings for a specific
host, e.g.:
[credential "dev.azure.com"]
useHTTPPath = true
Let's reinstate this behavior.
While at it, increase the test coverage to document and verify the
behavior with a couple other categories of partial URLs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to the fixes for CVE-2020-11008, we were _very_ lenient in what we
required from a URL in order to parse it into a `struct credential`.
That led to serious vulnerabilities.
There was one call site, though, that really needed that leniency: when
parsing config settings a la `credential.dev.azure.com.useHTTPPath`.
Settings like this might be desired when users want to use, say, a given
user name on a given host, regardless of the protocol to be used.
In preparation for fixing that bug, let's refactor the code to
optionally allow for partial URLs. For the moment, this functionality is
only exposed via the now-renamed function `credential_from_url_1()`, but
it is not used. The intention is to make it easier to verify that this
commit does not change the existing behavior unless explicitly allowing
for partial URLs.
Please note that this patch does more than just reinstating a way to
imitate the behavior before those CVE-2020-11008 fixes: Before that, we
would simply ignore URLs without a protocol. In other words,
misleadingly, the following setting would be applied to _all_ URLs:
[credential "example.com"]
username = that-me
The obvious intention is to match the host name only. With this patch,
we allow precisely that: when parsing the URL with non-zero
`allow_partial_url`, we do not simply return success if there was no
protocol, but we simply leave the protocol unset and continue parsing
the URL.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to the fixes for CVE-2020-11008, we were _very_ lenient in what we
required from a URL in order to parse it into a `struct credential`.
That led to serious vulnerabilities.
There was one call site, though, that really needed that leniency: when
parsing config settings a la `credential.dev.azure.com.useHTTPPath`.
Settings like this might be desired when users want to use, say, a given
user name on a given host, regardless of the protocol to be used.
In preparation for fixing that bug, let's refactor the code to
optionally allow for partial URLs. For the moment, this functionality is
only exposed via the now-renamed function `credential_from_url_1()`, but
it is not used. The intention is to make it easier to verify that this
commit does not change the existing behavior unless explicitly allowing
for partial URLs.
Please note that this patch does more than just reinstating a way to
imitate the behavior before those CVE-2020-11008 fixes: Before that, we
would simply ignore URLs without a protocol. In other words,
misleadingly, the following setting would be applied to _all_ URLs:
[credential "example.com"]
username = that-me
The obvious intention is to match the host name only. With this patch,
we allow precisely that: when parsing the URL with non-zero
`allow_partial_url`, we do not simply return success if there was no
protocol, but we simply leave the protocol unset and continue parsing
the URL.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There was a lot going on behind the scenes when the vulnerability and
possible solutions were discussed. Grammar was not a primary focus,
that's why this slipped in.
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-commit(1) says ISO-8601 is one of our supported date format.
ISO-8601 allows timestamps to have a fractional number of seconds.
We represent time only in terms of whole seconds, so we never bothered
parsing fractional seconds. However, it's better for us to parse and
throw away the fractional part than to refuse to parse the timestamp
at all.
And refusing parsing fractional second part may confuse the parse to
think fractional and timezone as day and month in this example:
2008-02-14 20:30:45.019-04:00
While doing this, make sure that we only interpret the number after the
second and the dot as fractional when and only when the date is known,
since only ISO-8601 allows the fractional part, and we've taught our
users to interpret "12:34:56.7.days.ago" as a way to specify a time
relative to current time.
Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch, we will reuse this logic, move it to a helper, now.
While we're at it, explicit states that we intentionally ignore
old-and-defective 2nd leap second.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In bd0b42aed3 (fetch-pack: do not take shallow lock unnecessarily,
2019-01-10), the author noted that 'is_repository_shallow' produces
visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'.
This is a problem for e.g., fetching with '--update-shallow' in a
shallow repository with 'fetch.writeCommitGraph' enabled, since the
update to '.git/shallow' will cause Git to think that the repository
isn't shallow when it is, thereby circumventing the commit-graph
compatibility check.
This causes problems in shallow repositories with at least shallow refs
that have at least one ancestor (since the client won't have those
objects, and therefore can't take the reachability closure over commits
when writing a commit-graph).
Address this by introducing thin wrappers over 'commit_lock_file' and
'rollback_lock_file' for use specifically when the lock is held over
'.git/shallow'. These wrappers (appropriately called
'commit_shallow_file' and 'rollback_shallow_file') call into their
respective functions in 'lockfile.h', but additionally reset validity
checks used by the shallow machinery.
Replace each instance of 'commit_lock_file' and 'rollback_lock_file'
with 'commit_shallow_file' and 'rollback_shallow_file' when the lock
being held is over the '.git/shallow' file.
As a result, 'prune_shallow' can now only be called once (since
'check_shallow_file_for_update' will die after calling
'reset_repository_shallow'). But, this is OK since we only call
'prune_shallow' at most once per process.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A number of spots in t5537 use the non-indented heredoc '<<EOF' when
they would benefit from instead using '<<-EOF' or simply
test_write_lines.
In preparation for adding new tests in a good style and being consistent
with the surrounding code, update the existing tests to improve their
readability.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack-index subsystem was not closing its file descriptor
after memory-mapping the file contents. After this mmap() succeeds,
there is no need to keep the file descriptor open. In fact, there
is signficant reason to close it so we do not run out of
descriptors.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If "test-tool bloom" is not fed a command, or if arguments are missing
for some commands, it will just segfault. Let's check argc and write a
friendlier usage message.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a layered commit-graph, the commit-graph machinery uses
'commit_graph_filenames_after' and 'commit_graph_hash_after' to keep
track of the layers in the chain that we are in the process of writing.
When the number of commit-graph layers shrinks, we initialize all
entries in the aforementioned arrays, because we know the structure of
the new commit-graph chain immediately (since there are no new layers,
there are no unknown hash values).
But when the number of commit-graph layers grows (i.e., that
'num_commit_graphs_after > num_commit_graphs_before'), then we leave
some entries in the filenames and hashes arrays as uninitialized,
because we will fill them in later as those values become available.
For instance, we rely on 'write_commit_graph_file's to store the
filename and hash of the last layer in the new chain, which is the one
that it is responsible for writing. But, it's possible that
'write_commit_graph_file' may fail, e.g., from file descriptor
exhaustion. In this case it is possible that 'git_mkstemp_mode' will
fail, and that function will return early *before* setting the values
for the last commit-graph layer's filename and hash.
This causes a number of upleasant side-effects. For instance, trying to
'free()' each entry in 'ctx->commit_graph_filenames_after' (and
similarly for the hashes array) causes us to 'free()' uninitialized
memory, since the area is allocated with 'malloc()' and is therefore
subject to contain garbage (which is left alone when
'write_commit_graph_file' returns early).
This can manifest in other issues, like a general protection fault,
and/or leaving a stray 'commit-graph-chain.lock' around after the
process dies. (The reasoning for this is still a mystery to me, since
we'd otherwise usually expect the kernel to run tempfile.c's 'atexit()'
handlers in the case of a normal death...)
To resolve this, initialize the memory with 'CALLOC_ARRAY' so that
uninitialized entries are filled with zeros, and can thus be 'free()'d
as a noop instead of causing a fault.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t1400 the prerequisite 'ULIMIT_FILE_DESCRIPTORS' is defined and used
to effectively guard the helper function 'run_with_limited_open_files'
from being used on systems that do not satisfy this prerequisite.
In the subsequent patch, we will introduce another test outside of t1400
that would benefit from using this prerequisite. So, move it to
'test-lib.sh' instead so that it can be used by multiple tests.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a commit-graph layer, we do so in a temporary file which is
renamed into place. If we fail to create a temporary file, for e.g.,
because we have too many open files, then 'git_mkstemp_mode' sets the
pattern to the empty string, in which case we get an error something
along the lines of:
error: unable to create ''
It's not useful to show the pattern here at all, since we (1) know the
pattern is well-formed, and (2) would have already shown the dirname
when trying to create the leading directories. So, replace this error
with something friendlier.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't use the "parent" parameter at all (probably because the bloom
filter for a commit is always defined against a single parent anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function is_date, confusingly also set tm_year. tm_mon, and tm_mday
after validating input.
Rename to set_date to reflect its real usage.
Also, change return value is 0 on success and -1 on failure following
our convention on function do some real work.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're passing buffer from strbuf to reencode_string,
which will call strlen(3) on that buffer,
and discard the length of newly created buffer.
Then, we compute the length of the return buffer to attach to strbuf.
During this process, we introduce a discrimination between mail
originally written in utf-8 and other encoding.
* if the email was written in utf-8, we leave it as is. If there is
a NUL character in that line, we complains loudly:
error: a NUL byte in commit log message not allowed.
* if the email was written in other encoding, we truncate the data as
the NUL character in that line, then we used the truncated line for
the metadata.
We can do better by reusing all the available information,
and call the underlying lower level function that will be called
indirectly by reencode_string. By doing this, we will also postpone
the NUL character processing to the commit step, which will
complains about the faulty metadata.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we are at it, make sure we run a clean up after testing.
In a later patch, we will test for more corrupted patch.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parsing of URL for the credential helper has been corrected.
* jk/credential-parsing-end-of-host-in-URL:
credential: treat "?" and "#" in URLs as end of host
Allow "git rebase" to reapply all local commits, even if the may be
already in the upstream, without checking first.
* jt/rebase-allow-duplicate:
rebase --merge: optionally skip upstreamed commits
"git rebase" (again) learns to honor "--no-keep-empty", which lets
the user to discard commits that are empty from the beginning (as
opposed to the ones that become empty because of rebasing). The
interactive rebase also marks commits that are empty in the todo.
* en/rebase-no-keep-empty:
rebase: fix an incompatible-options error message
rebase: reinstate --no-keep-empty
rebase -i: mark commits that begin empty in todo editor
A Windows-specific test element has been made more robust against
misuse from both user's environment and programmer's errors.
* js/mingw-is-hidden-test-fix:
t: restrict `is_hidden` to be called only on Windows
mingw: make test_path_is_hidden more robust
t: consolidate the `is_hidden` functions
The interactive input from various codepaths are consolidated and
any prompt possibly issued earlier are fflush()ed before we read.
* js/flush-prompt-before-interative-input:
interactive: explicitly `fflush` stdout before expecting input
interactive: refactor code asking the user for interactive input
"git log" learned "--show-pulls" that helps pathspec limited
history views; a merge commit that takes the whole change from a
side branch, which is normally omitted from the output, is shown
in addition to the commits that introduce real changes.
* ds/revision-show-pulls:
revision: --show-pulls adds helpful merges
Misc fixes for Windows.
* js/mingw-fixes:
mingw: help debugging by optionally executing bash with strace
mingw: do not treat `COM0` as a reserved file name
mingw: use modern strftime implementation if possible
We've left the command line parsing of "git log :/a/b/" broken for
about a full year without anybody noticing, which has been
corrected.
* jc/missing-ref-store-fix:
repository: mark the "refs" pointer as private
sha1-name: do not assume that the ref store is initialized
The output from "git format-patch" uses RFC 2047 encoding for
non-ASCII letters on From: and Subject: headers, so that it can
directly be fed to e-mail programs. A new option has been added
to produce these headers in raw.
* eb/format-patch-no-encode-headers:
format-patch: teach --no-encode-email-headers
The more aggressive updates to remote-tracking branches we had for
the past 7 years or so were not reflected in the documentation,
which has been corrected.
* pb/pull-fetch-doc:
pull doc: correct outdated description of an example
pull doc: refer to a specific section in 'fetch' doc
"git rebase" learned the "--no-gpg-sign" option to countermand
commit.gpgSign the user may have.
* dd/no-gpg-sign:
Documentation: document merge option --no-gpg-sign
Documentation: merge commit-tree --[no-]gpg-sign
Documentation: reword commit --no-gpg-sign
Documentation: document am --no-gpg-sign
cherry-pick/revert: honour --no-gpg-sign in all case
rebase.c: honour --no-gpg-sign
The logic to auto-follow tags by "git clone --single-branch" was
not careful to avoid lazy-fetching unnecessary tags, which has been
corrected.
* jk/use-quick-lookup-in-clone-for-tag-following:
clone: use "quick" lookup while following tags
"git rebase" with the merge backend did not work well when the
rebase.abbreviateCommands configuration was set.
* ag/rebase-merge-allow-ff-under-abbrev-command:
t3432: test `--merge' with `rebase.abbreviateCommands = true', too
sequencer: don't abbreviate a command if it doesn't have a short form
Code cleanup.
* jk/oid-array-cleanups:
oidset: stop referring to sha1-array
ref-filter: stop referring to "sha1 array"
bisect: stop referring to sha1_array
test-tool: rename sha1-array to oid-array
oid_array: rename source file from sha1-array
oid_array: use size_t for iteration
oid_array: use size_t for count and allocation
"git pull --rebase" tried to run a rebase even after noticing that
the pull results in a fast-forward and no rebase is needed nor
sensible, for the past few years due to a mistake nobody noticed.
* en/pull-do-not-rebase-after-fast-forwarding:
pull: avoid running both merge and rebase
"git pull" shares many options with underlying "git fetch", but
some of them were not documented and some of those that would make
sense to pass down were not passed down.
* rs/pull-options-sync-code-and-doc:
pull: pass documented fetch options on
pull: remove --update-head-ok from documentation
The server-end of the v2 protocol to serve "git clone" and "git
fetch" was not prepared to see a delim packets at unexpected
places, which led to a crash.
* jk/harden-protocol-v2-delim-handling:
test-lib-functions: simplify packetize() stdin code
upload-pack: handle unexpected delim packets
test-lib-functions: make packetize() more efficient
Utitiles run via the run_command() API were not spawned correctly
on Cygwin, when the paths to them are given as a full path with
backslashes.
* ak/run-command-on-cygwin-fix:
run-command: trigger PATH lookup properly on Cygwin
When fed a midx that records no objects, some codepaths tried to
loop from 0 through (num_objects-1), which, due to integer
arithmetic wrapping around, made it nonsense operation with out of
bounds array accesses. The code has been corrected to reject such
an midx file.
* dr/midx-avoid-int-underflow:
midx.c: fix an integer underflow
Simplify the commit ancestry connectedness check in a partial clone
repository in which "promised" objects are assumed to be obtainable
lazily on-demand from promisor remote repositories.
* jt/connectivity-check-optim-in-partial-clone:
connected: always use partial clone optimization
"git p4" learned four new hooks and also "--no-verify" option to
bypass them (and the existing "p4-pre-submit" hook).
* bk/p4-pre-edit-changelist:
git-p4: add RCS keyword status message
git-p4: add p4 submit hooks
git-p4: restructure code in submit
git-p4: add --no-verify option
git-p4: add p4-pre-submit exit text
git-p4: create new function run_git_hook
git-p4: rewrite prompt to be Windows compatible
The import-tars importer (in contrib/fast-import/) used to create
phony files at the top-level of the repository when the archive
contains global PAX headers, which made its own logic to detect and
omit the common leading directory ineffective, which has been
corrected.
* js/import-tars-do-not-make-phony-files-from-pax-headers:
import-tars: ignore the global PAX header
Enable tests that require GnuPG on Windows.
* js/tests-gpg-integration-on-windows:
tests: increase the verbosity of the GPG-related prereqs
tests: turn GPG, GPGSM and RFC1991 into lazy prereqs
tests: do not let lazy prereqs inside `test_expect_*` turn off tracing
t/lib-gpg.sh: stop pretending to be a stand-alone script
tests(gpg): allow the gpg-agent to start on Windows
This reverts commit 684ceae32d.
Users fetching from linux-next and other kernel remotes are reporting
that the limited ref advertisement causes negotiation to reach
MAX_IN_VAIN, resulting in too-large fetches.
Reported-by: Lubomir Rintel <lkundrak@v3.sk>
Reported-by: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GNU/Hurd is another platform that behaves like this. Set it to
UnfortunatelyYes so that config directory files are correctly processed.
This fixes the corresponding 'proper error on directory "files"' test in
t1308-config-set.sh.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not use C99 "for loop initial declaration" in our codebase
(yet), but one snuck in.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For some asynchronous operations, we build a chain of callbacks to
execute when the operation is done. These callbacks are held in $after,
and a new callback can be added by appending to $after. Once the
operation is done, $after is executed as a script.
But if we don't append a semi-colon after the procedure calls, they will
appear to Tcl as arguments to the previous procedure's arguments. So,
for example, if $after is "foo", and we just append "bar", then $after
becomes "foo bar", and bar will be treated as an argument to foo. If foo
does not accept any optional arguments, it would result in Tcl throwing
an error. If instead we do append a semi-colon, $after will look like
"foo;bar;", and these will be treated as two separate procedure calls.
Before d9c6469 (git-gui: update status bar to track operations,
2019-12-01), this problem was masked because ui_ready/ui_status did
accept an optional argument. In d9c6469, ui_ready stopped accepting an
optional argument, and this error started showing up.
Another instance of this problem is when a call to ui_status without a
trailing semicolon. ui_status never accepted an optional argument to
begin with, but the issue never managed to surface.
So, fix these errors by making sure we always append a semi-colon after
procedure calls when multiple callbacks are involved in $after.
Helped-by: Pratyush Yadav <me@yadavpratyush.com>
Signed-off-by: Ansgar Röber <ansgar.roeber@rwth-aachen.de>
In the example by Jon Loeliger the selector 'A^2' was duplicated. This
might confuse readers.
Signed-off-by: Michael F. Schönitzer <michael@schoenitzer.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since its introduction in 7249e91 (revision.c: support --notes
command-line option, 2011-03-29), combining '--notes' with any option
that causes us to format notes (e.g., '--pretty', '--format="%N"', etc)
results in a failed assertion at runtime.
$ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes
commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed.
Aborted
This failure is due to diff-tree not calling 'load_display_notes' to
initialize the notes machinery.
Ordinarily, this failure isn't triggered, because it requires passing
both '--notes' and another of the above mentioned options. In the case
of '--pretty', for example, we set 'opt->verbose_header', causing
'show_log()' to eventually call 'format_display_notes()', which expects
a non-NULL 'display_note_trees'.
Without initializing the notes machinery, 'display_note_trees' remains
NULL, and thus triggers an assertion failure.
Fix this by initializing the notes machinery after parsing our options,
and harden this behavior against regression with a test in t4013. (Note
that the added ref in this test requires updating two unrelated tests
which use 'log --all', and thus need to learn about the new refs).
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We were using `test_must_fail p4` to test that the p4 command failed as
expected. However, test_must_fail() is used to ensure that commands fail
in an expected way, not due to something like a segv. Since we are not
in the business of verifying the sanity of the external world, replace
`test_must_fail p4` with `! p4` and assume that the `p4` command does
not die unexpectedly.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `test_must_fail` function should only be used for git commands;
we are not in the business of catching segmentation fault by external
commands. Shell helper functions test_cmp and svn_cmd used in this
script are wrappers around external commands, so just use `! cmd`
instead of `test_must_fail cmd`
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() function should only be used for git commands since
we assume that external commands work sanely. Since, not only should
this file not exist, but there shouldn't exit _any_ filesystem entity in
these paths, replace `test_must_fail test -f` with
`test_path_is_missing`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() function should only be used for git commands since
we assume that external commands work sanely. Since, not only should
these directories not exist, but there shouldn't exist _any_ filesystem
entity in these paths, replace `test_must_fail test -d` with
`test_path_is_missing`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail function should only be used for git commands since
we assume that external commands work sanely. Since test_cmp() just
wraps an external command, replace `test_must_fail test_cmp` with
`! test_cmp`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to t/README, test_must_fail() should only be used to test for
failure in Git commands.
Replace the invocation of `test_must_fail test_path_is_file` with
`test_path_is_missing` since, in this test case, the path should not
exist at all.
In all the cases where `test_must_fail test_alternate_is_used` appears,
test_alternate_is_used() fails because test_line_count() cannot open the
non-existent $alternates_file. Replace
`test_must_fail test_alternate_is_used` with `test_path_is_missing` to
test for this directly.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() function should only be used for git commands since
we should assume that external commands work sanely. Replace
`test_must_fail test -e` with `test_path_is_missing`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
grep does not follow the conventions used by other Git commands when
printing paths that contain unusual characters (as double-quotes or
newlines). Commands such as ls-files, commit, status and diff will:
- Quote and escape unusual pathnames, by default.
- Print names verbatim and unquoted when "-z" is used.
But grep *never* quotes/escapes absolute paths with unusual chars and
*always* quotes/escapes relative ones, even with "-z". Besides being
inconsistent in its own output, the deviation from other Git commands
can be confusing. So let's make it follow the two rules above and add
some tests for this new behavior. Note that, making grep quote/escape
all unusual paths by default, also make it fully compliant with the
core.quotePath configuration, which is currently ignored for absolute
paths.
Reported-by: Greg Hurrell <greg@hurrell.net>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's URL parser interprets
https:///example.com/repo.git
to have no host and a path of "example.com/repo.git". Curl, on the
other hand, internally redirects it to https://example.com/repo.git. As
a result, until "credential: parse URL without host as empty host, not
unset", tricking a user into fetching from such a URL would cause Git to
send credentials for another host to example.com.
Teach fsck to block and detect .gitmodules files using such a URL to
prevent sharing them with Git versions that are not yet protected.
A relative URL in a .gitmodules file could also be used to trigger this.
The relative URL resolver used for .gitmodules does not normalize
sequences of slashes and can follow ".." components out of the path part
and to the host part of a URL, meaning that such a relative URL can be
used to traverse from a https://foo.example.com/innocent superproject to
a https:///attacker.example.com/exploit submodule. Fortunately,
redundant extra slashes in .gitmodules are rare, so we can catch this by
detecting one after a leading sequence of "./" and "../" components.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Until "credential: refuse to operate when missing host or protocol",
Git's credential handling code interpreted URLs with empty scheme to
mean "give me credentials matching this host for any protocol".
Luckily libcurl does not recognize such URLs (it tries to look for a
protocol named "" and fails). Just in case that changes, let's reject
them within Git as well. This way, credential_from_url is guaranteed to
always produce a "struct credential" with protocol and host set.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
libcurl permits making requests without a URL scheme specified. In
this case, it guesses the URL from the hostname, so I can run
git ls-remote http::ftp.example.com/path/to/repo
and it would make an FTP request.
Any user intentionally using such a URL is likely to have made a typo.
Unfortunately, credential_from_url is not able to determine the host and
protocol in order to determine appropriate credentials to send, and
until "credential: refuse to operate when missing host or protocol",
this resulted in another host's credentials being leaked to the named
host.
Teach credential_from_url_gently to consider such a URL to be invalid
so that fsck can detect and block gitmodules files with such URLs,
allowing server operators to avoid serving them to downstream users
running older versions of Git.
This also means that when such URLs are passed on the command line, Git
will print a clearer error so affected users can switch to the simpler
URL that explicitly specifies the host and protocol they intend.
One subtlety: .gitmodules files can contain relative URLs, representing
a URL relative to the URL they were cloned from. The relative URL
resolver used for .gitmodules can follow ".." components out of the path
part and past the host part of a URL, meaning that such a relative URL
can be used to traverse from a https://foo.example.com/innocent
superproject to a https::attacker.example.com/exploit submodule.
Fortunately a leading ':' in the first path component after a series of
leading './' and '../' components is unlikely to show up in other
contexts, so we can catch this by detecting that pattern.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
When we try to initialize credential loading by URL and find that the
URL is invalid, we set all fields to NULL in order to avoid acting on
malicious input. Later when we request credentials, we diagonse the
erroneous input:
fatal: refusing to work with credential missing host field
This is problematic in two ways:
- The message doesn't tell the user *why* we are missing the host
field, so they can't tell from this message alone how to recover.
There can be intervening messages after the original warning of
bad input, so the user may not have the context to put two and two
together.
- The error only occurs when we actually need to get a credential. If
the URL permits anonymous access, the only encouragement the user gets
to correct their bogus URL is a quiet warning.
This is inconsistent with the check we perform in fsck, where any use
of such a URL as a submodule is an error.
When we see such a bogus URL, let's not try to be nice and continue
without helpers. Instead, die() immediately. This is simpler and
obviously safe. And there's very little chance of disrupting a normal
workflow.
It's _possible_ that somebody has a legitimate URL with a raw newline in
it. It already wouldn't work with credential helpers, so this patch
steps that up from an inconvenience to "we will refuse to work with it
at all". If such a case does exist, we should figure out a way to work
with it (especially if the newline is only in the path component, which
we normally don't even pass to helpers). But until we see a real report,
we're better off being defensive.
Reported-by: Carlo Arenas <carenas@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
In 07259e74ec (fsck: detect gitmodules URLs with embedded newlines,
2020-03-11), git fsck learned to check whether URLs in .gitmodules could
be understood by the credential machinery when they are handled by
git-remote-curl.
However, the check is overbroad: it checks all URLs instead of only
URLs that would be passed to git-remote-curl. In principle a git:// or
file:/// URL does not need to follow the same conventions as an http://
URL; in particular, git:// and file:// protocols are not succeptible to
issues in the credential API because they do not support attaching
credentials.
In the HTTP case, the URL in .gitmodules does not always match the URL
that would be passed to git-remote-curl and the credential machinery:
Git's URL syntax allows specifying a remote helper followed by a "::"
delimiter and a URL to be passed to it, so that
git ls-remote http::https://example.com/repo.git
invokes git-remote-http with https://example.com/repo.git as its URL
argument. With today's checks, that distinction does not make a
difference, but for a check we are about to introduce (for empty URL
schemes) it will matter.
.gitmodules files also support relative URLs. To ensure coverage for the
https based embedded-newline attack, urldecode and check them directly
for embedded newlines.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
The credential helper protocol was designed to be very flexible: the
fields it takes as input are treated as a pattern, and any missing
fields are taken as wildcards. This allows unusual things like:
echo protocol=https | git credential reject
to delete all stored https credentials (assuming the helpers themselves
treat the input that way). But when helpers are invoked automatically by
Git, this flexibility works against us. If for whatever reason we don't
have a "host" field, then we'd match _any_ host. When you're filling a
credential to send to a remote server, this is almost certainly not what
you want.
Prevent this at the layer that writes to the credential helper. Add a
check to the credential API that the host and protocol are always passed
in, and add an assertion to the credential_write function that speaks
credential helper protocol to be doubly sure.
There are a few ways this can be triggered in practice:
- the "git credential" command passes along arbitrary credential
parameters it reads from stdin.
- until the previous patch, when the host field of a URL is empty, we
would leave it unset (rather than setting it to the empty string)
- a URL like "example.com/foo.git" is treated by curl as if "http://"
was present, but our parser sees it as a non-URL and leaves all
fields unset
- the recent fix for URLs with embedded newlines blanks the URL but
otherwise continues. Rather than having the desired effect of
looking up no credential at all, many helpers will return _any_
credential
Our earlier test for an embedded newline didn't catch this because it
only checked that the credential was cleared, but didn't configure an
actual helper. Configuring the "verbatim" helper in the test would show
that it is invoked (it's obviously a silly helper which doesn't look at
its input, but the point is that it shouldn't be run at all). Since
we're switching this case to die(), we don't need to bother with a
helper. We can see the new behavior just by checking that the operation
fails.
We'll add new tests covering partial input as well (these can be
triggered through various means with url-parsing, but it's simpler to
just check them directly, as we know we are covered even if the url
parser changes behavior in the future).
[jn: changed to die() instead of logging and showing a manual
username/password prompt]
Reported-by: Carlo Arenas <carenas@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
We may feed a URL like "cert:///path/to/cert.pem" into the credential
machinery to get the key for a client-side certificate. That
credential has no hostname field, which is about to be disallowed (to
avoid confusion with protocols where a helper _would_ expect a
hostname).
This means as of the next patch, credential helpers won't work for
unlocking certs. Let's fix that by doing two things:
- when we parse a url with an empty host, set the host field to the
empty string (asking only to match stored entries with an empty
host) rather than NULL (asking to match _any_ host).
- when we build a cert:// credential by hand, similarly assign an
empty string
It's the latter that is more likely to impact real users in practice,
since it's what's used for http connections. But we don't have good
infrastructure to test it.
The url-parsing version will help anybody using git-credential in a
script, and is easy to test.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Many of the tests in t0300 give partial inputs to git-credential,
omitting a protocol or hostname. We're checking only high-level things
like whether and how helpers are invoked at all, and we don't care about
specific hosts. However, in preparation for tightening up the rules
about when we're willing to run a helper, let's start using input that's
a bit more realistic: pretend as if http://example.com is being
examined.
This shouldn't change the point of any of the tests, but do note we have
to adjust the expected output to accommodate this (filling a credential
will repeat back the protocol/host fields to stdout, and the helper
debug messages and askpass prompt will change on stderr).
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
We test a toy credential helper that writes "quit=1" and confirms that
we stop running other helpers. However, that helper is unrealistic in
that it does not bother to read its stdin at all.
For now we don't send any input to it, because we feed git-credential a
blank credential. But that will change in the next patch, which will
cause this test to racily fail, as git-credential will get SIGPIPE
writing to the helper rather than exiting because it was asked to.
Let's make this one-off helper more like our other sample helpers, and
have it source the "dump" script. That will read stdin, fixing the
SIGPIPE problem. But it will also write what it sees to stderr. We can
make the test more robust by checking that output, which confirms that
we do run the quit helper, don't run any other helpers, and exit for the
reason we expected.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Add new method in transport-helper to reject all references if any
reference is failed for atomic push.
This method is reused in "send-pack.c" and "transport-helper.c", one for
SSH, git and file protocols, and the other for HTTP protocol.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit v2.22.0-1-g3bca1e7f9f (transport-helper: enforce atomic in
push_refs_with_push, 2019-07-11) noticed the incomplete report of
failure of an atomic push for HTTP protocol. But the implementation
has a flaw that mark all remote references as failure.
Only mark necessary references as failure in `push_refs_with_push()` of
transport-helper.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing with SSH or other smart protocol, references are validated
by function `check_to_send_update()` before they are sent in commands
to `send_pack()` of "receve-pack". For atomic push, if a reference is
rejected after the validation, only references pushed by user should be
marked as failure, instead of report failure on all remote references.
Commit v2.22.0-1-g3bca1e7f9f (transport-helper: enforce atomic in
push_refs_with_push, 2019-07-11) wanted to fix report issue of HTTP
protocol, but marked all remote references failure for atomic push.
In order to fix the issue of status report for SSH or other built-in
smart protocol, revert part of that commit and add additional status
for function `atomic_push_failure()`. The additional status for it
except the "REF_STATUS_EXPECTING_REPORT" status are:
- REF_STATUS_NONE : Not marked as "REF_STATUS_EXPECTING_REPORT" yet.
- REF_STATUS_OK : Assume OK for dryrun or status_report is disabled.
This fix won't resolve the issue of status report in transport-helper
for HTTP or other protocols, and breaks test case in t5541. Will fix
it in additional commit.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we push some references to the git server, we expect git to report
the status of the references we are pushing; no more, no less. But when
pusing with atomic mode, if some references cannot be pushed, Git reports
the reject message on all references in the remote repository.
Add new test cases in t5543, and fix them in latter commit.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The porcelain output of a failed `git-push` command is inconsistent for
different protocols. For example, the following `git-push` command
may fail due to the failure of the `pre-receive` hook.
git push --porcelain origin HEAD:refs/heads/master
For SSH protocol, the porcelain output does not end with a "Done"
message:
To <URL/of/upstream.git>
! HEAD:refs/heads/master [remote rejected] (pre-receive hook declined)
While for HTTP protocol, the porcelain output does end with a "Done"
message:
To <URL/of/upstream.git>
! HEAD:refs/heads/master [remote rejected] (pre-receive hook declined)
Done
The following code at the end of function `send_pack()` indicates that
`send_pack()` should not return an error if some references are rejected
in porcelain mode.
int send_pack(...)
... ...
if (args->porcelain)
return 0;
for (ref = remote_refs; ref; ref = ref->next) {
switch (ref->status) {
case REF_STATUS_NONE:
case REF_STATUS_UPTODATE:
case REF_STATUS_OK:
break;
default:
return -1;
}
}
return 0;
}
So if atomic push failed, must check the porcelain mode before return
an error. And `receive_status()` should not return an error for a
failed updated reference, because `send_pack()` will check them instead.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add missing 'restore' and 'switch' sub commands to zsh completion
candidate output. E.g.
$ git re<tab>
rebase -- forward-port local commits to the updated upstream head
reset -- reset current HEAD to the specified state
restore -- restore working tree files
$ git s<tab>
show -- show various types of objects
status -- show the working tree status
switch -- switch branches
Signed-off-by: Terry Moschou <tmoschou@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changed-path Bloom filters help reduce the amount of tree
parsing required during history queries. Before calculating a
diff, we can ask the filter if a path changed between a commit
and its first parent. If the filter says "no" then we can move
on without parsing trees. If the filter says "maybe" then we
parse trees to discover if the answer is actually "yes" or "no".
When computing a blame, there is a section in find_origin() that
computes a diff between a commit and one of its parents. When this
is the first parent, we can check the Bloom filters before calling
diff_tree_oid().
In order to make this work with the blame machinery, we need to
initialize a struct bloom_key with the initial path. But also, we
need to add more keys to a list if a rename is detected. We then
check to see if _any_ of these keys answer "maybe" in the diff.
During development, I purposefully left out this "add a new key
when a rename is detected" to see if the test suite would catch
my error. That is how I discovered the issues with
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS from the previous change.
With that change, we can feel some confidence in the coverage of
this change.
If a user requests copy detection using "git blame -C", then there
are more places where the set of "important" files can expand. I
do not know enough about how this happens in the blame machinery.
Thus, the Bloom filter integration is explicitly disabled in this
mode. A later change could expand the bloom_key data with an
appropriate call (or calls) to add_bloom_key().
If we did not disable this mode, then the following tests would
fail:
t8003-blame-corner-cases.sh
t8011-blame-split-file.sh
Generally, this is a performance enhancement and should not
change the behavior of 'git blame' in any way. If a repo has a
commit-graph file with computed changed-path Bloom filters, then
they should notice improved performance for their 'git blame'
commands.
Here are some example timings that I found by blaming some paths
in the Linux kernel repository:
git blame arch/x86/kernel/topology.c >/dev/null
Before: 0.83s
After: 0.24s
git blame kernel/time/time.c >/dev/null
Before: 0.72s
After: 0.24s
git blame tools/perf/ui/stdio/hist.c >/dev/null
Before: 0.27s
After: 0.11s
I specifically looked for "deep" paths that were also edited many
times. As a counterpoint, the MAINTAINERS file was edited many
times but is located in the root tree. This means that the cost of
computing a diff relative to the pathspec is very small. Here are
the timings for that command:
git blame MAINTAINERS >/dev/null
Before: 20.1s
After: 18.0s
These timings are the best of five. The worst-case runs were on the
order of 2.5 minutes for both cases. Note that the MAINTAINERS file
has 18,740 lines across 17,000+ commits. This happens to be one of
the cases where this change provides the least improvement.
The lack of improvement for the MAINTAINERS file and the relatively
modest improvement for the other examples can be easily explained.
The blame machinery needs to compute line-level diffs to determine
which lines were changed by each commit. That makes up a large
proportion of the computation time, and this change does not
attempt to improve on that section of the algorithm. The
MAINTAINERS file is large and changed often, so it takes time to
determine which lines were updated by which commit. In contrast,
the code files are much smaller, and it takes longer to comute
the line-by-line diff for a single patch on the Linux mailing
lists.
Outside of the "-C" integration, I believe there is little more to
gain from the changed-path Bloom filters for 'git blame' after this
patch.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_TEST_COMMIT_GRAPH environment variable updates the commit-
graph file whenever "git commit" is run, ensuring that we always
have an updated commit-graph throughout the test suite. The
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS environment variable was
introduced to write the changed-path Bloom filters whenever "git
commit-graph write" is run. However, the GIT_TEST_COMMIT_GRAPH
trick doesn't launch a separate process and instead writes it
directly.
To expand the number of tests that have commits in the commit-graph
file, add a helper method that computes the commit-graph and place
that helper inside "git commit" and "git merge".
In the helper method, check GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS
to ensure we are writing changed-path Bloom filters whenever
possible.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changed-path Bloom filters work only when we can compute an
explicit Bloom filter key in advance. When a pathspec is given
that allows case-insensitive checks or wildcard matching, we
must disable the Bloom filter performance checks.
By checking the pathspec in prepare_to_use_bloom_filters(), we
avoid setting up the Bloom filter data and thus revert to the
usual logic.
Before this change, the following tests would fail*:
t6004-rev-list-path-optim.sh (Tests 6-7)
t6130-pathspec-noglob.sh (Tests 3-6)
t6131-pathspec-icase.sh (Tests 3-5)
*These tests would fail when using GIT_TEST_COMMIT_GRAPH and
GIT_TEST_COMMIT_GRAPH_BLOOM_FILTERS except that the latter
environment variable was not set up correctly to write the changed-
path Bloom filters in the test suite. That will be fixed in the
next change.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To help pinpoint the source of a regression, it is useful to know some
info about the compiler which the user's Git client was built with. By
adding a generic get_compiler_info() in 'compat/' we can choose which
relevant information to share per compiler; to get started, let's
demonstrate the version of glibc if the user built with 'gcc'.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The contents of uname() can give us some insight into what sort of
system the user is running on, and help us replicate their setup if need
be. The domainname field is not guaranteed to be available, so don't
collect it.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Knowing which version of Git a user has and how it was built allows us
to more precisely pin down the circumstances when a certain issue
occurs, so teach bugreport how to tell us the same output as 'git
version --build-options'.
It's not ideal to directly call 'git version --build-options' because
that output goes to stdout. Instead, wrap the version string in a helper
within help.[ch] library, and call that helper from within the bugreport
library.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach Git how to prompt the user for a good bug report: reproduction
steps, expected behavior, and actual behavior. Later, Git can learn how
to collect some diagnostic information from the repository.
If users can send us a well-written bug report which contains diagnostic
information we would otherwise need to ask the user for, we can reduce
the number of question-and-answer round trips between the reporter and
the Git contributor.
Users may also wish to send a report like this to their local "Git
expert" if they have put their repository into a state they are confused
by.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Starting in 3ac68a93fd, help.o began to depend on builtin/branch.o,
builtin/clean.o, and builtin/config.o. This meant that help.o was
unusable outside of the context of the main Git executable.
To make help.o usable by other commands again, move list_config_help()
into builtin/help.c (where it makes sense to assume other builtin libraries
are present).
When command-list.h is included but a member is not used, we start to
hear a compiler warning. Since the config list is generated in a fairly
different way than the command list, and since commands and config
options are semantically different, move the config list into its own
header and move the generator into its own script and build rule.
For reasons explained in 976aaedc (msvc: add a Makefile target to
pre-generate the Visual Studio solution, 2019-07-29), some build
artifacts we consider non-source files cannot be generated in the
Visual Studio environment, and we already have some Makefile tweaks
to help Visual Studio to use generated command-list.h header file.
Do the same to a new generated file, config-list.h, introduced by
this change.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
In 'git log', the --decorate-refs-exclude option appends a pattern
to a string_list. This list is used to prevent showing some refs
in the decoration output, or even by --simplify-by-decoration.
Users may want to use their refs space to store utility refs that
should not appear in the decoration output. For example, Scalar [1]
runs a background fetch but places the "new" refs inside the
refs/scalar/hidden/<remote>/* refspace instead of refs/<remote>/*
to avoid updating remote refs when the user is not looking. However,
these "hidden" refs appear during regular 'git log' queries.
A similar idea to use "hidden" refs is under consideration for core
Git [2].
Add the 'log.excludeDecoration' config option so users can exclude
some refs from decorations by default instead of needing to use
--decorate-refs-exclude manually. The config value is multi-valued
much like the command-line option. The documentation is careful to
point out that the config value can be overridden by the
--decorate-refs option, even though --decorate-refs-exclude would
always "win" over --decorate-refs.
Since the 'log.excludeDecoration' takes lower precedence to
--decorate-refs, and --decorate-refs-exclude takes higher
precedence, the struct decoration_filter needed another field.
This led also to new logic in load_ref_decorations() and
ref_filter_match().
There are several tests in t4202-log.sh that test the
--decorate-refs-(include|exclude) options, so these are extended.
Since the expected output is already stored as a file, most tests
could simply replace a "--decorate-refs-exclude" option with an
in-line config setting. Other tests involve the precedence of
the config option compared to command-line options and needed more
modification.
[1] https://github.com/microsoft/scalar
[2] https://lore.kernel.org/git/77b1da5d3063a2404cd750adfe3bb8be9b6c497d.1585946894.git.gitgitgadget@gmail.com/
Helped-by: Junio C Hamano <gister@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ref_filter_match() method is defined in refs.h and implemented
in refs.c, but is only used by add_ref_decoration() in log-tree.c.
Move it into that file as a static helper method. The
match_ref_pattern() comes along for the ride.
While moving the code, also make a slight adjustment to have
ref_filter_match() take a struct decoration_filter pointer instead
of multiple string lists. This is non-functional, but will make a
later change be much cleaner.
The diff is easier to parse when using the --color-moved option.
Reported-by: Junio C Hamano <gister@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "mboxrd" pretty format was introduced in 9f23e04061 (pretty: support
"mboxrd" output format, 2016-06-05) but wasn't mentioned in the
documentation.
Signed-off-by: Emma Brooks <me@pluvano.com>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the supplied integer for "precision" is negative in
`"%.*s", len, line` then it is ignored. So the current code is
equivalent to just `"%s", line` because it is executed only if
`len` is negative.
Fix this by saving the value of `len` before overwriting it with the
return value of `parse_git_diff_header()`.
Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git range-diff` calls `git log` internally and tries to parse its
output. But `git log` output can be customized by the user in their
git config and for certain configurations either an error will be
returned by `git range-diff` or it will crash.
To fix this explicitly set the output format of the internally
executed `git log` with `--pretty=medium`. Because that cancels
`--notes`, add explicitly `--notes` at the end.
Also, make sure we never crash in the same way - trying to dereference
`util` which was never created and has remained NULL. It would happen
if the first line of `git log` output does not begin with 'commit '.
Alternative considered but discarded - somehow disable all git configs
and behave as if no config is present in the internally executed
`git log`, but that does not seem to be possible. GIT_CONFIG_NOSYSTEM
is the closest to it, but even with that we would still read
`.git/config`.
Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's unusual to see:
https://example.com?query-parameters
without an intervening slash, like:
https://example.com/some-path?query-parameters
or even:
https://example.com/?query-parameters
but it is a valid end to the hostname (actually "authority component")
according to RFC 3986. Likewise for "#".
And curl will parse the URL according to the standard, meaning it will
contact example.com, but our credential code would ask about a bogus
hostname with a "?" in it. Let's make sure we follow the standard, and
more importantly ask about the same hosts that curl will be talking to.
It would be nice if we could just ask curl to parse the URL for us. But
it didn't grow a URL-parsing API until 7.62, so we'd be stuck with
fallback code either way. Plus we'd need this code in the main Git
binary, where we've tried to avoid having a link dependency on libcurl.
But let's at least fix our parser. Moving to curl's parser would prevent
other potential discrepancies, but this gives us immediate relief for
the known problem, and would help our fallback code if we eventually use
curl.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update freshen_file() to use a NULL `times', semantically equivalent to
the currently setup, with an explicit `actime' and `modtime' set to the
"current time", but with the advantage that it works with other files
not owned by the current user.
Fixes an issue on shared repos with a split index, where eventually a
user's operation creates a shared index, and another user will later do
an operation that will try to update its freshness, but will instead
raise a warning:
$ git status
warning: could not freshen shared index '.git/sharedindex.bd736fa10e0519593fefdb2aec253534470865b2'
Signed-off-by: Luciano Miguel Ferreira Rocha <luciano.rocha@booking.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When operating on a stream of commit OIDs on stdin, 'git commit-graph
write' checks that each OID refers to an object that is indeed a commit.
This is convenient to make sure that the given input is well-formed, but
can sometimes be undesirable.
For example, server operators may wish to feed the refnames that were
updated during a push to 'git commit-graph write --input=stdin-commits',
and silently discard refs that don't point at commits. This can be done
by combing the output of 'git for-each-ref' with '--format
%(*objecttype)', but this requires opening up a potentially large number
of objects. Instead, it is more convenient to feed the updated refs to
the commit-graph machinery, and let it throw out refs that don't point
to commits.
Introduce '--[no-]check-oids' to make such a behavior possible. With
'--check-oids' (the default behavior to retain backwards compatibility),
'git commit-graph write' will barf on a non-commit line in its input.
With 'no-check-oids', such lines will be silently ignored, making the
above possible by specifying this option.
No matter which is supplied, 'git commit-graph write' retains the
behavior from the previous commit of rejecting non-OID inputs like
"HEAD" and "refs/heads/foo" as before.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'write_commit_graph()' function takes in either a string list of
pack indices, or a string list of hexadecimal commit OIDs. These
correspond to the '--stdin-packs' and '--stdin-commits' mode(s) from
'git commit-graph write'.
Using a string_list of hexadecimal commit IDs is not the most efficient
use of memory, since we can instead use the 'struct oidset', which is
more well-suited for this case.
This has another benefit which will become apparent in the following
commit. This is that we are about to disambiguate the kinds of errors we
produce with '--stdin-commits' into "non-hex input" and "hex-input, but
referring to a non-commit object". By having 'write_commit_graph' take
in a 'struct oidset *' of commits, we place the burden on the caller (in
this case, the builtin) to handle the first case, and the commit-graph
machinery can handle the second case.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Occasionally, it may be useful for callers to know the number of object
IDs in an oidset. Right now, the only way to compute this is to call
'kh_size' on the internal 'kh_set_oid_t'.
Similar to how we wrap other 'kh_*' functions over the 'oidset' type,
let's allow callers to compute this value by introducing 'oidset_size'.
We will add its first caller in the subsequent commit.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using split commit-graphs, it is sometimes useful to completely
replace the commit-graph chain with a new base.
For example, consider a scenario in which a repository builds a new
commit-graph incremental for each push. Occasionally (say, after some
fixed number of pushes), they may wish to rebuild the commit-graph chain
with all reachable commits.
They can do so with
$ git commit-graph write --reachable
but this removes the chain entirely and replaces it with a single
commit-graph in 'objects/info/commit-graph'. Unfortunately, this means
that the next push will have to move this commit-graph into the first
layer of a new chain, and then write its new commits on top.
Avoid such copying entirely by allowing the caller to specify that they
wish to replace the entirety of their commit-graph chain, while also
specifying that the new commit-graph should become the basis of a fresh,
length-one chain.
This addresses the above situation by making it possible for the caller
to instead write:
$ git commit-graph write --reachable --split=replace
which writes a new length-one chain to 'objects/info/commit-graphs',
making the commit-graph incremental generated by the subsequent push
relatively cheap by avoiding the aforementioned copy.
In order to do this, remove an assumption in 'write_commit_graph_file'
that chains are always at least two incrementals long.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, we laid the groundwork for supporting different
splitting strategies. In this commit, we introduce the first splitting
strategy: 'no-merge'.
Passing '--split=no-merge' is useful for callers which wish to write a
new incremental commit-graph, but do not want to spend effort condensing
the incremental chain [1]. Previously, this was possible by passing
'--size-multiple=0', but this no longer the case following 63020f175f
(commit-graph: prefer default size_mult when given zero, 2020-01-02).
When '--split=no-merge' is given, the commit-graph machinery will never
condense an existing chain, and it will always write a new incremental.
[1]: This might occur when, for example, a server administrator running
some program after each push may want to ensure that each job runs
proportional in time to the size of the push, and does not "jump" when
the commit-graph machinery decides to trigger a merge.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With '--split', the commit-graph machinery writes new commits in another
incremental commit-graph which is part of the existing chain, and
optionally decides to condense the chain into a single commit-graph.
This is done to ensure that the asymptotic behavior of looking up a
commit in an incremental chain is not dominated by the number of
incrementals in that chain. It can be controlled by the '--max-commits'
and '--size-multiple' options.
In the next two commits, we will introduce additional splitting
strategies that can exert additional control over:
- when a split commit-graph is and isn't written, and
- when the existing commit-graph chain is discarded completely and
replaced with another graph
To prepare for this, make '--split' take an optional strategy (as in
'--split[=<strategy>]'), and add a new enum to describe which strategy
is being used. For now, no strategies are given, and the only enumerated
value is 'COMMIT_GRAPH_SPLIT_UNSPECIFIED', indicating the absence of a
strategy.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 61df89c8e5 (commit-graph: don't early exit(1) on e.g. "git status",
2019-03-25), the former 'load_commit_graph_one' was refactored into
'open_commit_graph' and 'load_commit_graph_one_fd_st' as a means of
avoiding an early-exit from non-library code.
However, 'load_commit_graph_one' does not support commit-graph chains,
and hence the 'read-graph' test tool does not work with them.
Replace 'load_commit_graph_one' with 'read_commit_graph_one' in order to
support commit-graph chains. In the spirit of 61df89c8e5,
'read_commit_graph_one' does not ever 'die()', making it a suitable
replacement here.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function won't work anywhere else, so let's mark it as an explicit
bug if it is called on a non-Windows platform.
Let's also rename the function to avoid cluttering the global namespace
with an overly-generic function name.
While at it, we also fix the code comment above that function: the
lower-case `windows` refers to something different than `Windows`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function uses Windows' system tool `attrib` to determine the state
of the hidden flag of a file or directory.
We should not actually expect the first `attrib.exe` in the PATH to
be the one we are looking for. Or that it is in the PATH, for that
matter.
Let's use the full path to the tool instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `is_hidden` function can be used (only on Windows) to determine
whether a directory or file have their `hidden` flag set.
This function is duplicated between two test scripts. It is better to
move it into `test-lib-functions.sh` so that it is reused.
This patch is best viewed with `--color-moved`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using `starts_with()`, the magic number 7, `strlen()` and a
fair number of additions to verify the three parts of the config key
"branch.<branch>.mergeoptions", use `skip_prefix()` to jump through them
more explicitly.
We need to introduce a new variable for this (we certainly can't modify
`k` just because we see "branch."!). With `skip_prefix()` we often use
quite bland names like `p` or `str`. Let's do the same. If and when this
function needs to do more prefix-skipping, we'll have a generic variable
ready for this.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rebasing against an upstream that has had many commits since the
original branch was created:
O -- O -- ... -- O -- O (upstream)
\
-- O (my-dev-branch)
it must read the contents of every novel upstream commit, in addition to
the tip of the upstream and the merge base, because "git rebase"
attempts to exclude commits that are duplicates of upstream ones. This
can be a significant performance hit, especially in a partial clone,
wherein a read of an object may end up being a fetch.
Add a flag to "git rebase" to allow suppression of this feature. This
flag only works when using the "merge" backend.
This flag changes the behavior of sequencer_make_script(), called from
do_interactive_rebase() <- run_rebase_interactive() <-
run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
(indirectly called from sequencer_make_script() through
prepare_revision_walk()) will no longer call cherry_pick_list(), and
thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
means that the intermediate commits in upstream are no longer read (as
shown by the test) and means that no PATCHSAME-caused skipping of
commits is done by sequencer_make_script(), either directly or through
make_script_with_merges().
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the user specifies the apply backend with options that only work
with the merge backend, such as
git rebase --apply --exec /bin/true HEAD~3
the error message has always been
fatal: --exec requires an interactive rebase
This error message is misleading and was one of the reasons we renamed
the interactive backend to the merge backend. Update the error message
to state that these options merely require use of the merge backend.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit d48e5e21da ("rebase (interactive-backend): make --keep-empty the
default", 2020-02-15) turned --keep-empty (for keeping commits which
start empty) into the default. The logic underpinning that commit was:
1) 'git commit' errors out on the creation of empty commits without an
override flag
2) Once someone determines that the override is worthwhile, it's
annoying and/or harmful to required them to take extra steps in
order to keep such commits around (and to repeat such steps with
every rebase).
While the logic on which the decision was made is sound, the result was
a bit of an overcorrection. Instead of jumping to having --keep-empty
being the default, it jumped to making --keep-empty the only available
behavior. There was a simple workaround, though, which was thought to
be good enough at the time. People could still drop commits which
started empty the same way the could drop any commits: by firing up an
interactive rebase and picking out the commits they didn't want from the
list. However, there are cases where external tools might create enough
empty commits that picking all of them out is painful. As such, having
a flag to automatically remove start-empty commits may be beneficial.
Provide users a way to drop commits which start empty using a flag that
existed for years: --no-keep-empty. Interpret --keep-empty as
countermanding any previous --no-keep-empty, but otherwise leaving
--keep-empty as the default.
This might lead to some slight weirdness since commands like
git rebase --empty=drop --keep-empty
git rebase --empty=keep --no-keep-empty
look really weird despite making perfect sense (the first will drop
commits which become empty, but keep commits that started empty; the
second will keep commits which become empty, but drop commits which
started empty). However, --no-keep-empty was named years ago and we are
predominantly keeping it for backward compatibility; also we suspect it
will only be used rarely since folks already have a simple way to drop
commits they don't want with an interactive rebase.
Reported-by: Bryan Turner <bturner@atlassian.com>
Reported-by: Sami Boukortt <sami@boukortt.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While many users who intentionally create empty commits do not want them
thrown away by a rebase, there are third-party tools that generate empty
commits that a user might not want. In the past, users have used rebase
to get rid of such commits (a side-effect of the fact that the --apply
backend is not currently capable of keeping them). While such users
could fire up an interactive rebase and just remove the lines
corresponding to empty commits, that might be difficult if the
third-party tool generates many of them. Simplify this task for users
by marking such lines with a suffix of " # empty" in the todo list.
Suggested-by: Sami Boukortt <sami@boukortt.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the last few commits have made it possible for the config parser
to handle config files up to the limits of size_t, the rest of the code
isn't really ready for this. In particular, we often feed the keys as
strings into printf "%s" format specifiers. And because the printf
family of functions must return an int to specify the result, they
complain. Here are two concrete examples (using glibc; we're in
uncharted territory here so results may vary):
Generate a gigantic .gitmodules file like this:
git submodule add /some/other/repo foo
{
printf '[submodule "'
perl -e 'print "a" x 2**31'
echo '"]path = foo'
} >.gitmodules
git commit -m 'huge gitmodule'
then try this:
$ git show
BUG: strbuf.c:397: your vsnprintf is broken (returned -1)
The problem is that we end up calling:
strbuf_addf(&sb, "submodule.%s.ignore", submodule_name);
which relies on vsnprintf(), and that function has no way to report back
a size larger than INT_MAX.
Taking that same file, try this:
git config --file=.gitmodules --list --name-only
On my system it produces an output with exactly 4GB of spaces. I
confirmed in a debugger that we reach the config callback with the key
intact: it's 2147483663 bytes and full of a's. But when we print it with
this call:
printf("%s%c", key_, term);
we just get the spaces.
So given the fact that these are insane cases which we have no need to
support, the weird behavior from feeding the results to printf even if
the code is careful, and the possibility of uncareful code introducing
its own integer truncation issues, let's just declare INT_MAX as a limit
for parsing config files.
We'll enforce the limit in get_next_char(), which generalizes over all
sources (blobs, files, etc) and covers any element we're parsing
(whether section, key, value, etc). For simplicity, the limit is over
the length of the _whole_ file, so you couldn't have two 1GB values in
the same file. This should be perfectly fine, as the expected size for
config files is generally kilobytes at most.
With this patch both cases above will yield:
fatal: bad config line 1 in file .gitmodules
That's not an amazing error message, but the parser isn't set up to
provide specific messages (it just breaks out of the parsing loop and
gives that generic error even if see a syntactic issue). And we really
wouldn't expect to see this case outside of somebody maliciously probing
the limits of the config system.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the config parsing infrastructure is limited in what it can
parse only by the size of memory, because it parses character by
character, building up strbufs for keys, values, etc. One exception is
the "baselen" value we keep in git_parse_source(), which is an int.
That stores the length of the section.subsection base, to which we can
then append individual key names (by truncating back to the baselen with
strbuf_setlen(), and then appending characters for the key name). But
because it's an int, if we see an absurdly long section or subsection,
we may overflow the integer, wrapping negative. That negative value is
then implicitly cast to a size_t when we pass it to strbuf_setlen(),
creating a very large value and triggering a BUG. For example:
$ {
printf '[foo "'
perl -e 'print "a" x 2**31'
echo '"]bar = value'
} >huge
$ git config --file=huge --list
fatal: BUG: strbuf_setlen() beyond buffer
While this is obviously a silly case that we don't care about
supporting, it's worth fixing it by switching to a size_t for a few
reasons:
- we should try to avoid hitting BUG assertions at all
- avoiding integer truncation or overflow sets a good example and
makes it easier to audit the code for more important issues
- the BUG outcome is what happens in _this_ instance, because we wrap
negative. If we used a 2**32 subsection, we'd wrap to a small
positive value and actually generate wrong output (the subsection of
our key would be truncated).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the recent change to parse_config_key(), the best type to return
a string length is a size_t, as it won't cause integer truncation for a
gigantic key. And as with that change, this is mostly a clarity /
hygiene issue for now, as our config parser would choke on such a large
key anyway.
There are a few ripple effects within the config code, as callers switch
to using size_t. I also adjusted a few related variables that iterate
over strings. The most unexpected change is that a call to strbuf_addf()
had to switch to strbuf_add(). We can't use a size_t with "%.*s",
because printf precisions must have type "int" (we could cast, of
course, but that would miss the point of using size_t in the first
place).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We compute the length of a subset of a string, but then use that length
only to feed a "%.*s" printf placeholder for the same string. We can
just use "%s" to achieve the same thing.
The variable became useless in cb891a5989 (Use a strbuf for building up
section header and key/value pair strings., 2007-12-14), which swapped
out a write() which _did_ use the length for a strbuf_addf() call.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We return the length to a subset of a string using an "int *"
out-parameter. This is fine most of the time, as we'd expect config keys
to be relatively short, but it could behave oddly if we had a gigantic
config key. A more appropriate type is size_t.
Let's switch over, which lets our callers use size_t as appropriate
(they are bound by our type because they must pass the out-parameter as
a pointer). This is mostly just a cleanup to make it clear this code
handles long strings correctly. In practice, our config parser already
chokes on long key names (because of a similar int/size_t mixup!).
When doing an int/size_t conversion, we have to be careful that nobody
was trying to assign a negative value to the variable. I manually
confirmed that for each case here. They tend to just feed the result to
xmemdupz() or similar; in a few cases I adjusted the parameter types for
helper functions to make sure the size_t is preserved.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The make_branch() and make_rewrite() functions can take a NUL-terminated
string or a ptr/len pair. They use a sentinel value of "0" for the len
to tell the difference between the two. However, when parsing config
like:
[branch ""]
merge = whatever
whose key flattens to:
branch..merge
we might actually have a zero-length branch name. This is obviously
nonsense, but the current code would consider it as a NUL-terminated
string and use the branch name ".merge".
We could use a better sentinel value here (like "-1"), but that gets in
the way of moving to size_t, which is a more appropriate type for a
ptr/len combo.
Let's instead just drop this feature and have the callers (of which
there are only two total) use strlen() themselves. This simplifies the
code, and lets us move to using size_t.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On certain network filesystems (currently encountered with Isilon, but
in theory more network storage solutions could be causing the same
issue), when the directory in question is missing,
`raceproof_create_file()` fails with an `ERROR_INVALID_PARAMETER`
instead of an `ERROR_PATH_NOT_FOUND`.
Since it is highly unlikely that we produce such an error by mistake
(the parameters we pass are fairly benign), we can be relatively certain
that the directory is missing in this instance. So let's just translate
that error automagically.
This fixes https://github.com/git-for-windows/git/issues/1345.
Signed-off-by: Nathan Sanders <spekbukkem@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Arguably, CI builds' most important task is to not only identify
regressions, but to make it as easy as possible to investigate what went
wrong.
In that light, we will want to provide users with a way to inspect the
tests' output as well as the corresponding directories.
This commit adds build steps that are only executed when tests failed,
uploading the relevant information as build artifacts. These artifacts
can then be downloaded by interested parties to diagnose the failures
more efficiently.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a test fails, it is nice to see where the corresponding code lives
in the worktree. Sadly, it seems that only Bash allows us to infer this
information. Let's do it when we detect that we're running in a Bash.
This will come in handy in the next commit, where we teach the GitHub
Actions workflow to annotate failed test runs with this information.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have GitHub Actions now. Running the same builds and tests in Azure
Pipelines would be redundant, and a waste of energy.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds CI builds via GitHub Actions. While the underlying
technology is at least _very_ similar to that of Azure Pipelines, GitHub
Actions are much easier to set up than Azure Pipelines:
- no need to install a GitHub App,
- no need to set up an Azure DevOps account,
- all you need to do is push to your fork on GitHub.
Therefore, it makes a lot of sense for us to have a working GitHub
Actions setup.
While copy/editing `azure-pipelines.yml` into
`.github/workflows/main.yml`, we also use the opportunity to accelerate
the step that sets up a minimal subset of Git for Windows' SDK in the
Windows-build job:
- we now download a `.tar.xz` stored in Azure Blobs and extract it
simultaneously by calling `curl` and piping the result to `tar`,
- decompressing via `xz`,
- all three utilities are installed together with Git for Windows
At the same time, we also make use of the matrix build feature, which
reduces the amount of repeated text by quite a bit.
Also, we do away with the parts that try to mount a file share on which
`prove` can store data between runs. It is just too complicated to set
up, and most times the tree changes anyway, so there is little return on
investment there.
Initial-patch-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch, we will run Documentation job in GitHub Actions.
The job will run without elevated permission.
Run `gem` with `sudo` to elevate permission in order to be able to
install to system location.
This will also keep this installation in-line with other installation in
our Linux system for CI.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[Danh: reword commit message]
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch, we will support GitHub Action.
Explicitly install all of our build dependencies on Linux.
Since GitHub Action's Linux VM hasn't installed our build dependencies.
And there're no harm to reinstall them (in Travis)
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At least one interactive command writes a prompt to `stdout` and then
reads user input on `stdin`: `git clean --interactive`. If the prompt is
left in the buffer, the user will not realize the program is waiting for
their input.
So let's just flush `stdout` before reading the user's input.
Signed-off-by: 마누엘 <nalla@hamal.uberspace.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are quite a few code locations (e.g. `git clean --interactive`)
where Git asks the user for an answer. In preparation for fixing a bug
shared by all of them, and also to DRY up the code, let's refactor it.
Please note that most of these callers trimmed white-space both at the
beginning and at the end of the answer, instead of trimming only the
end (as the caller in `add-patch.c` does).
Therefore, technically speaking, we change behavior in this patch. At
the same time, it can be argued that this is actually a bug fix.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
MSYS2's strace facility is very useful for debugging... With this patch,
the bash will be executed through strace if the environment variable
GIT_STRACE_COMMANDS is set, which comes in real handy when investigating
issues in the test suite.
Also support passing a path to a log file via GIT_STRACE_COMMANDS to
force Git to call strace.exe with the `-o <path>` argument, i.e. to log
into a file rather than print the log directly.
That comes in handy when the output would otherwise misinterpreted by a
calling process as part of Git's output.
Note: the values "1", "yes" or "true" are *not* specifying paths, but
tell Git to let strace.exe log directly to the console.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default file history simplification of "git log -- <path>" or
"git rev-list -- <path>" focuses on providing the smallest set of
commits that first contributed a change. The revision walk greatly
restricts the set of walked commits by visiting only the first
TREESAME parent of a merge commit, when one exists. This means
that portions of the commit-graph are not walked, which can be a
performance benefit, but can also "hide" commits that added changes
but were ignored by a merge resolution.
The --full-history option modifies this by walking all commits and
reporting a merge commit as "interesting" if it has _any_ parent
that is not TREESAME. This tends to be an over-representation of
important commits, especially in an environment where most merge
commits are created by pull request completion.
Suppose we have a commit A and we create a commit B on top that
changes our file. When we merge the pull request, we create a merge
commit M. If no one else changed the file in the first-parent
history between M and A, then M will not be TREESAME to its first
parent, but will be TREESAME to B. Thus, the simplified history
will be "B". However, M will appear in the --full-history mode.
However, suppose that a number of topics T1, T2, ..., Tn were
created based on commits C1, C2, ..., Cn between A and M as
follows:
A----C1----C2--- ... ---Cn----M------P1---P2--- ... ---Pn
\ \ \ \ / / / /
\ \__.. \ \/ ..__T1 / Tn
\ \__.. /\ ..__T2 /
\_____________________B \____________________/
If the commits T1, T2, ... Tn did not change the file, then all of
P1 through Pn will be TREESAME to their first parent, but not
TREESAME to their second. This means that all of those merge commits
appear in the --full-history view, with edges that immediately
collapse into the lower history without introducing interesting
single-parent commits.
The --simplify-merges option was introduced to remove these extra
merge commits. By noticing that the rewritten parents are reachable
from their first parents, those edges can be simplified away. Finally,
the commits now look like single-parent commits that are TREESAME to
their "only" parent. Thus, they are removed and this issue does not
cause issues anymore. However, this also ends up removing the commit
M from the history view! Even worse, the --simplify-merges option
requires walking the entire history before returning a single result.
Many Git users are using Git alongside a Git service that provides
code storage alongside a code review tool commonly called "Pull
Requests" or "Merge Requests" against a target branch. When these
requests are accepted and merged, they typically create a merge
commit whose first parent is the previous branch tip and the second
parent is the tip of the topic branch used for the request. This
presents a valuable order to the parents, but also makes that merge
commit slightly special. Users may want to see not only which
commits changed a file, but which pull requests merged those commits
into their branch. In the previous example, this would mean the
users want to see the merge commit "M" in addition to the single-
parent commit "C".
Users are even more likely to want these merge commits when they
use pull requests to merge into a feature branch before merging that
feature branch into their trunk.
In some sense, users are asking for the "first" merge commit to
bring in the change to their branch. As long as the parent order is
consistent, this can be handled with the following rule:
Include a merge commit if it is not TREESAME to its first
parent, but is TREESAME to a later parent.
These merges look like the merge commits that would result from
running "git pull <topic>" on a main branch. Thus, the option to
show these commits is called "--show-pulls". This has the added
benefit of showing the commits created by closing a pull request or
merge request on any of the Git hosting and code review platforms.
To test these options, extend the standard test example to include
a merge commit that is not TREESAME to its first parent. It is
surprising that that option was not already in the example, as it
is instructive.
In particular, this extension demonstrates a common issue with file
history simplification. When a user resolves a merge conflict using
"-Xours" or otherwise ignoring one side of the conflict, they create
a TREESAME edge that probably should not be TREESAME. This leads
users to become frustrated and complain that "my change disappeared!"
In my experience, showing them history with --full-history and
--simplify-merges quickly reveals the problematic merge. As mentioned,
this option is expensive to compute. The --show-pulls option
_might_ show the merge commit (usually titled "resolving conflicts")
more quickly. Of course, this depends on the user having the correct
parent order, which is backwards when using "git pull master" from a
topic branch.
There are some special considerations when combining the --show-pulls
option with --simplify-merges. This requires adding a new PULL_MERGE
object flag to store the information from the initial TREESAME
comparisons. This helps avoid dropping those commits in later filters.
This is covered by a test, including how the parents can be simplified.
Since "struct object" has already ruined its 32-bit alignment by using
33 bits across parsed, type, and flags member, let's not make it worse.
PULL_MERGE is used in revision.c with the same value (1u<<15) as
REACHABLE in commit-graph.c. The REACHABLE flag is only used when
writing a commit-graph file, and a revision walk using --show-pulls
does not happen in the same process. Care must be taken in the future
to ensure this remains the case.
Update Documentation/rev-list-options.txt with significant details
around this option. This requires updating the example in the
History Simplification section to demonstrate some of the problems
with TREESAME second parents.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, `--autostash` only worked with `git pull --rebase`. However, in
the last patch, merge learned `--autostash` as well so there's no reason
why we should have this restriction anymore. Teach pull to pass
`--autostash` to merge, just like it did for rebase.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, test_pull_autostash() was hardcoded to run
`test_cmp_rev HEAD^ copy` to test that a rebase happened. However, in a
future patch, we plan on testing merging as well. Make
test_pull_autostash() accept a parent number as an argument so that, in
the future, we can test if a merge happened in addition to a rebase.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In rebase, one can pass the `--autostash` option to cause the worktree
to be automatically stashed before continuing with the rebase. This
option is missing in merge, however.
Implement the `--autostash` option and corresponding `merge.autoStash`
option in merge which stashes before merging and then pops after.
This option is useful when a developer has some local changes on a topic
branch but they realize that their work depends on another branch.
Previously, they had to run something like
git fetch ...
git stash push
git merge FETCH_HEAD
git stash pop
but now, that is reduced to
git fetch ...
git merge --autostash FETCH_HEAD
When an autostash is generated, it is automatically reapplied to the
worktree only in three explicit situations:
1. An incomplete merge is commit using `git commit`.
2. A merge completes successfully.
3. A merge is aborted using `git merge --abort`.
In all other situations where the merge state is removed using
remove_merge_branch_state() such as aborting a merge via
`git reset --hard`, the autostash is saved into the stash reflog
instead keeping the worktree clean.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Suggested-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split apply_save_autostash() into apply_autostash_oid() and
apply_save_autostash() where the former operates on an OID string and
the latter reads the OID from a file before passing it into
apply_save_autostash_oid().
This function is required for a future commmit which will rely on being
able to apply an autostash whose OID is stored as a string.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extract common functionality of apply_autostash() into
apply_save_autostash() and use it to implement save_autostash(). This
function will be used in a future commit.
The difference between save_autostash() and apply_autostash() is that
the former does not try to apply the stash. It skips that step and
just stores the created entry in the stash reflog.
This is useful in the case where we abort an operation when an autostash
is present but we don't want to dirty the worktree with the application
of the stash. For example, in a future commit, we will implement
`git merge --autostash`. Since merges can be aborted using
`git reset --hard`, we'd make use of save_autostash() to save the
autostash entry instead of applying it to the worktree thus keeping the
worktree undirtied.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Explicitly remove autostash file in apply_autostash() once it has been
applied successfully.
This is currently a no-op because the only users of this function will unlink
the state (including the autostash file) after this function runs.
However, in the future, we will introduce a user of the function that
does not explicitly remove the state so we do it here.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lib-ify the autostash code by extracting perform_autostash() from rebase
into sequencer. In a future commit, this will be used to implement
`--autostash` in other builtins.
This patch is best viewed with `--color-moved`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we plan on lib-ifying create_autostash() so we need it to
be more generic. Make it more generic by making it accept a
`struct repository` argument instead of implicitly using the non-repo
functions and `the_repository`. Also, make it accept a `path` argument
so that we no longer rely have to rely on `struct rebase_options`.
Finally, make it accept a `default_reflog_action` argument so we no
longer have to rely on `DEFAULT_REFLOG_ACTION`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we will lib-ify this code. In preparation for
this, extract the code into the create_autostash() function so that it
can be cleaned up before it is finally lib-ified.
This patch is best viewed with `--color-moved` and
`--color-moved-ws=allow-indentation-change`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue the process of lib-ifying the autostash code. In a future
commit, this will be used to implement `--autostash` in other builtins.
This patch is best viewed with `--color-moved`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we plan on lib-ifying reset_head() so we need it to
be more generic. Make it more generic by making it accept a
`struct repository` argument instead of implicitly using the non-repo
functions. Also, make it accept a `const char *default_reflog_action`
argument so that the default action of "rebase" isn't hardcoded in.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The apply_autostash() function in builtin/rebase.c is similar enough to
the apply_autostash() function in sequencer.c that they are almost
interchangeable, except for the type of arg they accept. Make the
sequencer.c version extern and use it in rebase.
The rebase version was introduced in 6defce2b02 (builtin rebase: support
`--autostash` option, 2018-09-04) as part of the shell to C conversion.
It opted to duplicate the function because, at the time, there was
another in-progress project converting interactive rebase from shell to
C as well and they did not want to clash with them by refactoring
sequencer.c version of apply_autostash(). Since both efforts are long
done, we can freely combine them together now.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The preferred terminology is to refer to object identifiers as "OIDs".
Rename the `stash_sha1` variable to `stash_oid` in order to conform to
this.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to make apply_autostash() more generic for future extraction, make
it accept a `path` argument so that the location from where to read the
reference to the autostash commit can be customized. Remove the `opts`
argument since it was unused before anyway.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since in sequencer.c, read_one() basically duplicates the functionality
of read_oneliner(), reduce code duplication by replacing read_one() with
read_oneliner().
This was done with the following Coccinelle script
@@
expression a, b;
@@
- read_one(a, b)
+ !read_oneliner(b, a, READ_ONELINER_WARN_MISSING)
and long lines were manually broken up.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "refs" pointer in a struct repository starts life as NULL, but then
is lazily initialized when it is accessed via get_main_ref_store().
However, it's easy for calling code to forget this and access it
directly, leading to code which works _some_ of the time, but fails if
it is called before anybody else accesses the refs.
This was the cause of the bug fixed by 5ff4b920eb (sha1-name: do not
assume that the ref store is initialized, 2020-04-09). In order to
prevent similar bugs, let's more clearly mark the "refs" field as
private.
In addition to helping future code, the name change will help us audit
any existing direct uses. Besides get_main_ref_store() itself, it turns
out there is only one. But we know it's OK as it is on the line directly
after the fix from 5ff4b920eb, which will have initialized the pointer.
However it's still a good idea for it to model the proper use of the
accessing function, so we'll convert it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're comparing a push cert nonce, we currently do so using strcmp.
Most implementations of strcmp short-circuit and exit as soon as they
know whether two values are equal. This, however, is a problem when
we're comparing the output of HMAC, as it leaks information in the time
taken about how much of the two values match if they do indeed differ.
In our case, the nonce is used to prevent replay attacks against our
server via the embedded timestamp and replay attacks using requests from
a different server via the HMAC. Push certs, which contain the nonces,
are signed, so an attacker cannot tamper with the nonces without
breaking validation of the signature. They can, of course, create their
own signatures with invalid nonces, but they can also create their own
signatures with valid nonces, so there's nothing to be gained. Thus,
there is no security problem.
Even though it doesn't appear that there are any negative consequences
from the current technique, for safety and to encourage good practices,
let's use a constant time comparison function for nonce verification.
POSIX does not provide one, but they are easy to write.
The technique we use here is also used in NaCl and the Go standard
library and relies on the fact that bitwise or and xor are constant time
on all known architectures.
We need not be concerned about exiting early if the actual and expected
lengths differ, since the standard cryptographic assumption is that
everyone, including an attacker, knows the format of and algorithm used
in our nonces (and in any event, they have the source code and can
determine it easily). As a result, we assume everyone knows how long
our nonces should be. This philosophy is also taken by the Go standard
library and other cryptographic libraries when performing constant time
comparisons on HMAC values.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
c931ba4e (sha1-name.c: remove the_repo from handle_one_ref(),
2019-04-16) replaced the use of for_each_ref() helper, which works
with the main ref store of the default repository instance, with
refs_for_each_ref(), which can work on any ref store instance, by
assuming that the repository instance the function is given has its
ref store already initialized.
But it is possible that nobody has initialized it, in which case,
the code ends up dereferencing a NULL pointer.
Reported-by: Érico Rolim <erico.erc@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changed-path Bloom filters record an entry in the filter for
every path that was changed. This includes every add and delete,
regardless of whether a rename was detected. Detecting renames
causes significant performance issues, but also will trigger
downloading missing blobs in partial clone.
The simple fix is to disable rename detection when computing a
changed-path Bloom filter. This should already be disabled by
default, but it is good to explicitly enforce the intended
behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 1925fe0c8a ("Documentation: wrap config listings in "----"",
2019-09-07) wrapped this fairly large block of example config directives
in "----". The closing "----" ended up a few lines too early though.
Make sure to include the trailing "IncludeIf.onbranch:..." example, too.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When trying to stash part of the worktree changes by splitting a hunk
and then only partially accepting the split bits and pieces, the user
is presented with a rather cryptic error:
error: patch failed: <file>:<line>
error: test: patch does not apply
Cannot remove worktree changes
and the command would fail to stash the desired parts of the worktree
changes (even if the `stash` ref was actually updated correctly).
We even have a test case demonstrating that failure, carrying it for
four years already.
The explanation: when splitting a hunk, the changed lines are no longer
separated by more than 3 lines (which is the amount of context lines
Git's diffs use by default), but less than that. So when staging only
part of the diff hunk for stashing, the resulting diff that we want to
apply to the worktree in reverse will contain those changes to be
dropped surrounded by three context lines, but since the diff is
relative to HEAD rather than to the worktree, these context lines will
not match.
Example time. Let's assume that the file README contains these lines:
We
the
people
and the worktree added some lines so that it contains these lines
instead:
We
are
the
kind
people
and the user tries to stash the line containing "are", then the command
will internally stage this line to a temporary index file and try to
revert the diff between HEAD and that index file. The diff hunk that
`git stash` tries to revert will look somewhat like this:
@@ -1776,3 +1776,4
We
+are
the
people
It is obvious, now, that the trailing context lines overlap with the
part of the original diff hunk that the user did *not* want to stash.
Keeping in mind that context lines in diffs serve the primary purpose of
finding the exact location when the diff does not apply precisely (but
when the exact line number in the file to be patched differs from the
line number indicated in the diff), we work around this by reducing the
amount of context lines: the diff was just generated.
Note: this is not a *full* fix for the issue. Just as demonstrated in
t3701's 'add -p works with pathological context lines' test case, there
are ambiguities in the diff format. It is very rare in practice, of
course, to encounter such repeated lines.
The full solution for such cases would be to replace the approach of
generating a diff from the stash and then applying it in reverse by
emulating `git revert` (i.e. doing a 3-way merge). However, in `git
stash -p` it would not apply to `HEAD` but instead to the worktree,
which makes this non-trivial to implement as long as we also maintain a
scripted version of `add -i`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 7e9e048661 (stash -p: demonstrate failure of split with mixed y/n,
2015-04-16), a regression test for a known breakage that was added to
the test script `t3904-stash-patch.sh` that demonstrated that splitting
a hunk and trying to stash only part of that split hunk fails (but
shouldn't).
As expected, it still fails, but for the wrong reason: once the bug is
fixed, we would expect stderr to show nothing, yet the regression test
expects stderr to show something.
Let's fix that by telling that regression test case to expect nothing to
be printed to stderr.
While at it, also drop the obvious left-over from debugging where the
regression test did not mind `git stash -p` to return a non-zero exit
status.
Of course, the regression test still fails, but this time for the
correct reason.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 4dc42c6c18 (mingw: refuse paths containing reserved names,
2019-12-21), we started disallowing file names that are reserved, e.g.
`NUL`, `CONOUT$`, etc.
This included `COM<n>` where `<n>` is a digit. Unfortunately, this
includes `COM0` but only `COM1`, ..., `COM9` are reserved, according to
the official documentation, `COM0` is mentioned in the "NT Namespaces"
section but it is explicitly _omitted_ from the list of reserved names:
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions
Tests corroborate this: it is totally possible to write a file called
`com0.c` on Windows 10, but not `com1.c`.
So let's tighten the code to disallow only the reserved `COM<n>` file
names, but to allow `COM0` again.
This fixes https://github.com/git-for-windows/git/issues/2470.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Microsoft introduced a new "Universal C Runtime Library" (UCRT) with
Visual Studio 2015. The UCRT comes with a new strftime() implementation
that supports more date formats. We link git against the older
"Microsoft Visual C Runtime Library" (MSVCRT), so to use the UCRT
strftime() we need to load it from ucrtbase.dll using
DECLARE_PROC_ADDR()/INIT_PROC_ADDR().
Most supported Windows systems should have recieved the UCRT via Windows
update, but in some cases only MSVCRT might be available. In that case
we fall back to using that implementation.
With this change, it is possible to use e.g. the `%g` and `%V` date
format specifiers, e.g.
git show -s --format=%cd --date=format:‘%g.%V’ HEAD
Without this change, the user would see this error message on Windows:
fatal: invalid strftime format: '‘%g.%V’'
This fixes https://github.com/git-for-windows/git/issues/2495
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a (late) companion for f6461b82b9 (Documentation: fix build
with Asciidoctor 2, 2019-09-15).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commit subjects or authors have non-ASCII characters, git
format-patch Q-encodes them so they can be safely sent over email.
However, if the patch transfer method is something other than email (web
review tools, sneakernet), this only serves to make the patch metadata
harder to read without first applying it (unless you can decode RFC 2047
in your head). git am as well as some email software supports
non-Q-encoded mail as described in RFC 6531.
Add --[no-]encode-email-headers and format.encodeEmailHeaders to let the
user control this behavior.
Signed-off-by: Emma Brooks <me@pluvano.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 6cdccfce1e (i18n: make GETTEXT_POISON a runtime option,
2018-11-08), the `jobname` was adjusted to have the `GIT_TEST_` prefix,
but that prefix makes no sense in this context.
Co-authored-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GitHub Action doesn't set TERM environment variable, which is required
by "tput".
Fallback to dumb if it's not set.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For each CI system we support, we need a specific arm in that if/else
construct in ci/lib.sh. Let's add one for GitHub Actions.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* dd/ci-musl-libc:
travis: build and test on Linux with musl libc and busybox
ci/linux32: libify install-dependencies step
ci: refactor docker runner script
ci/linux32: parameterise command to switch arch
ci/lib-docker: preserve required environment variables
ci: make MAKEFLAGS available inside the Docker container in the Linux32 job
* dd/test-with-busybox:
t5703: feed raw data into test-tool unpack-sideband
t4124: tweak test so that non-compliant diff(1) can also be used
t7063: drop non-POSIX argument "-ls" from find(1)
t5616: use rev-parse instead to get HEAD's object_id
t5003: skip conversion test if unzip -a is unavailable
t5003: drop the subshell in test_lazy_prereq
test-lib-functions: test_cmp: eval $GIT_TEST_CMP
t4061: use POSIX compliant regex(7)
The function read_oneliner() is a generally useful util function.
Instead of hiding it as a static function within sequencer.c, extern it
so that it can be reused by others.
This patch is best viewed with --color-moved.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we plan on externing read_oneliner(). Future users of
read_oneliner() will want the ability to output warnings in the event
that the `path` doesn't exist. Introduce the
`READ_ONELINER_WARN_MISSING` flag which, if active, would issue a
warning when a file doesn't exist by always executing warning_errno()
in the case where strbuf_read_file() fails.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we will need read_oneliner() to accept flags other
than just `skip_if_empty`. Instead of having an argument for each flag,
teach read_oneliner() to accept the bitfield `flags` instead. For now,
only recognize the `READ_ONELINER_SKIP_IF_EMPTY` flag. More flags will
be added in a future commit.
The result of this is that parallel topics which introduce invocations
of read_oneliner() will still be compatible with this new function
signature since, instead of passing 1 or 0 for `skip_if_empty`, they'll
be passing 1 or 0 to `flags`, which gives equivalent behavior.
Mechanically fix up invocations of read_oneliner() with the following
spatch
@@
expression a, b;
@@
read_oneliner(a, b,
- 1
+ READ_ONELINER_SKIP_IF_EMPTY
)
and manually break up long lines in the result.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently check whether a file exists and return early before reading
the file. Instead of accessing the file twice, always read the file and
check `errno` to see if the file doesn't exist.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08)
optimized "diff" by prefetching blobs in a partial clone, but there are
some cases wherein blobs do not need to be prefetched. In these cases,
any command that uses the diff machinery will unnecessarily fetch blobs.
diffcore_std() may read blobs when it calls the following functions:
(1) diffcore_skip_stat_unmatch() (controlled by the config variable
diff.autorefreshindex)
(2) diffcore_break() and diffcore_merge_broken() (for break-rewrite
detection)
(3) diffcore_rename() (for rename detection)
(4) diffcore_pickaxe() (for detecting addition/deletion of specified
string)
Instead of always prefetching blobs, teach diffcore_skip_stat_unmatch(),
diffcore_break(), and diffcore_rename() to prefetch blobs upon the first
read of a missing object. This covers (1), (2), and (3): to cover the
rest, teach diffcore_std() to prefetch if the output type is one that
includes blob data (and hence blob data will be required later anyway),
or if it knows that (4) will be run.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the object reads in diff_populate_filespec() to have the first
object read not be in an if/else branch, because in a future patch, a
retry will be added to that first object read.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The behavior of diff_populate_filespec() currently can be customized
through a bitflag, but a subsequent patch requires it to support a
non-boolean option. Replace the bitflag with an options struct.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a lot of code to honor GIT_REFLOG_ACTION throughout git,
including some in sequencer.c; unfortunately, reflog_message() and its
callers ignored it. Instruct reflog_message() to check the existing
environment variable, and use it when present as an override to
action_name().
Also restructure pick_commits() to only temporarily modify
GIT_REFLOG_ACTION for a short duration and then restore the old value,
so that when we do this setting within a loop we do not keep adding "
(pick)" substrings and end up with a reflog message of the form
rebase (pick) (pick) (pick) (finish): returning to refs/heads/master
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch, we will add new Travis Job for linux-musl.
Most of other code in this file could be reuse for that job.
Move the code to install dependencies to a common script.
Should we add new CI system that can run directly in container,
we can reuse this script for installation step.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We will support alpine check in docker later in this series.
While we're at it, tell people to run as root in podman,
if podman is used as drop-in replacement for docker,
because podman will map host-user to container's root,
therefore, mapping their permission.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch, the remaining of this command will be re-used for the
CI job for linux with musl libc.
Allow customisation of the emulator, now.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're using "su -m" to preserve environment variables in the shell run
by "su". But, that options will be ignored while "-l" (aka "--login") is
specified in util-linux and busybox's su.
In a later patch this script will be reused for checking Git for Linux
with musl libc on Alpine Linux, Alpine Linux uses "su" from busybox.
Since we don't have interest in all environment variables,
pass only those necessary variables to the inner script.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation refers to "initialized" or "populated" submodules,
to explain which submodules are affected by '--recurse-submodules', but
the real terminology here is 'active' submodules. Update the
documentation accordingly.
Some terminology:
- Active is defined in gitsubmodules(7), it only involves the
configuration variables 'submodule.active', 'submodule.<name>.active'
and 'submodule.<name>.url'. The function
submodule.c::is_submodule_active checks that a submodule is active.
- Populated means that the submodule's working tree is present (and the
gitfile correctly points to the submodule repository), i.e. either the
superproject was cloned with ` --recurse-submodules`, or the user ran
`git submodule update --init`, or `git submodule init [<path>]` and
`git submodule update [<path>]` separately which populated the
submodule working tree. This does not involve the 3 configuration
variables above.
- Initialized (at least in the context of the man pages involved in this
patch) means both "populated" and "active" as defined above, i.e. what
`git submodule update --init` does.
The --recurse-submodules option mostly affects active submodules. An
exception is `git fetch` where the option affects populated submodules.
As a consequence, in `git pull --recurse-submodules` the fetch affects
populated submodules, but the resulting working tree update only affects
active submodules.
In the documentation of `git-pull`, let's distinguish between the
fetching part which affects populated submodules, and the updating of
worktrees, which only affects active submodules.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Helped-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default value also depends on the value of submodule.recurse.
Use this opportunity to correct some grammar mistakes in
Documentation/config/fetch.txt signaled by Robert P. J. Day.
Also mention `fetch.recurseSubmodules` in fetch-options.txt. In
git-push.txt, `push.recurseSubmodules` is implicitly mentioned (by
explaining how to disable it), so no need to add it there.
Lastly add a link to `git-fetch` in `git-pull.txt` to explain the
meaning of `--recurse-submodules` there.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also unify the formulation about --no-recurse-submodules for checkout
and switch, which we reuse for restore.
And correct the formulation about submodules' HEAD in read-tree, which
we reuse in reset.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Note that `ls-files` is not affected, even though it has a
`--recurse-submodules` option, so list it as an exception too.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use a custom hash in fast-import to store the set of objects we've
imported so far. It has a fixed set of 2^16 buckets and chains any
collisions with a linked list. As the number of objects grows larger
than that, the load factor increases and we degrade to O(n) lookups and
O(n^2) insertions.
We can scale better by using our hashmap.c implementation, which will
resize the bucket count as we grow. This does incur an extra memory cost
of 8 bytes per object, as hashmap stores the integer hash value for each
entry in its hashmap_entry struct (which we really don't care about
here, because we're just reusing the embedded object hash). But I think
the numbers below justify this (and our per-object memory cost is
already much higher).
I also looked at using khash, but it seemed to perform slightly worse
than hashmap at all sizes, and worse even than the existing code for
small sizes. It's also awkward to use here, because we want to look up a
"struct object_entry" from a "struct object_id", and it doesn't handle
mismatched keys as well. Making a mapping of object_id to object_entry
would be more natural, but that would require pulling the embedded oid
out of the object_entry or incurring an extra 32 bytes per object.
In a synthetic test creating as many cheap, tiny objects as possible
perl -e '
my $bits = shift;
my $nr = 2**$bits;
for (my $i = 0; $i < $nr; $i++) {
print "blob\n";
print "data 4\n";
print pack("N", $i);
}
' $bits | git fast-import
I got these results:
nr_objects master khash hashmap
2^20 0m4.317s 0m5.109s 0m3.890s
2^21 0m10.204s 0m9.702s 0m7.933s
2^22 0m27.159s 0m17.911s 0m16.751s
2^23 1m19.038s 0m35.080s 0m31.963s
2^24 4m18.766s 1m10.233s 1m6.793s
which points to hashmap as the winner. We didn't have any perf tests for
fast-export or fast-import, so I added one as a more real-world case.
It uses an export without blobs since that's significantly cheaper than
a full one, but still is an interesting case people might use (e.g., for
rewriting history). It will emphasize this change in some ways (as a
percentage we spend more time making objects and less shuffling blob
bytes around) and less in others (the total object count is lower).
Here are the results for linux.git:
Test HEAD^ HEAD
----------------------------------------------------------------------------
9300.1: export (no-blobs) 67.64(66.96+0.67) 67.81(67.06+0.75) +0.3%
9300.2: import (no-blobs) 284.04(283.34+0.69) 198.09(196.01+0.92) -30.3%
It only has ~5.2M commits and trees, so this is a larger effect than I
expected (the 2^23 case above only improved by 50s or so, but here we
gained almost 90s). This is probably due to actually performing more
object lookups in a real import with trees and commits, as opposed to
just dumping a bunch of blobs into a pack.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag to the test setup suite
in order to toggle writing Bloom filters when running any of the git tests.
If set to true, we will compute and write Bloom filters every time a test
calls `git commit-graph write`, as if the `--changed-paths` option was
passed in.
The test suite passes when GIT_TEST_COMMIT_GRAPH and
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS are enabled.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These tests exercises writing commit graph with Bloom filters
and exercises 'git log -- path' with all the applicable
options. They check that the output is the same with and
without Bloom filters, confirm Bloom filters were used by
checking if trace2 statistics were logged correctly.
Also confirms cases where Bloom filters are not used:
1. Multiple path specs,
2. --walk-reflogs (see patch titled 'revision.c: use Bloom filters...'
for details,
3. If the latest commit graph does not have Bloom filters
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add trace2 statistics around Bloom filter usage and behavior
for 'git log -- path' commands that are hoping to benefit from
the presence of computed changed paths Bloom filters.
These statistics are great for performance analysis work and
for formal testing, which we will see in the commit following
this one.
Helped-by: Derrick Stolee <dstolee@microsoft.com
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Revision walk will now use Bloom filters for commits to speed up
revision walks for a particular path (for computing history for
that path), if they are present in the commit-graph file.
We load the Bloom filters during the prepare_revision_walk step,
currently only when dealing with a single pathspec. Extending
it to work with multiple pathspecs can be explored and built on
top of this series in the future.
While comparing trees in rev_compare_trees(), if the Bloom filter
says that the file is not different between the two trees, we don't
need to compute the expensive diff. This is where we get our
performance gains. The other response of the Bloom filter is '`:maybe',
in which case we fall back to the full diff calculation to determine
if the path was changed in the commit.
We do not try to use Bloom filters when the '--walk-reflogs' option
is specified. The '--walk-reflogs' option does not walk the commit
ancestry chain like the rest of the options. Incorporating the
performance gains when walking reflog entries would add more
complexity, and can be explored in a later series.
Performance Gains:
We tested the performance of `git log -- <path>` on the git repo, the linux
and some internal large repos, with a variety of paths of varying depths.
On the git and linux repos:
- we observed a 2x to 5x speed up.
On a large internal repo with files seated 6-10 levels deep in the tree:
- we observed 10x to 20x speed ups, with some paths going up to 28 times
faster.
Helped-by: Derrick Stolee <dstolee@microsoft.com
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add --changed-paths option to git commit-graph write. This option will
allow users to compute information about the paths that have changed
between a commit and its first parent, and write it into the commit graph
file. If the option is passed to the write subcommand we set the
COMMIT_GRAPH_WRITE_BLOOM_FILTERS flag and pass it down to the
commit-graph logic.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add logic to
a) parse Bloom filter information from the commit graph file and,
b) re-use existing Bloom filters.
See Documentation/technical/commit-graph-format for the format in which
the Bloom filter information is written to the commit graph file.
To read Bloom filter for a given commit with lexicographic position
'i' we need to:
1. Read BIDX[i] which essentially gives us the starting index in BDAT for
filter of commit i+1. It is essentially the index past the end
of the filter of commit i. It is called end_index in the code.
2. For i>0, read BIDX[i-1] which will give us the starting index in BDAT
for filter of commit i. It is called the start_index in the code.
For the first commit, where i = 0, Bloom filter data starts at the
beginning, just past the header in the BDAT chunk. Hence, start_index
will be 0.
3. The length of the filter will be end_index - start_index, because
BIDX[i] gives the cumulative 8-byte words including the ith
commit's filter.
We toggle whether Bloom filters should be recomputed based on the
compute_if_not_present flag.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the technical documentation for commit-graph-format with
the formats for the Bloom filter index (BIDX) and Bloom filter
data (BDAT) chunks. Write the computed Bloom filters information
to the commit graph file using this format.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since f269048754 (fetch: opportunistically update tracking refs,
2013-05-11), the underlying `git fetch` in `git pull <remote> <branch>`
updates the configured remote-tracking branch for <branch>.
However, an example in the 'Examples' section of the `git pull`
documentation still states that this is not the case.
Correct the description of this example.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for the `<refspec>` parameter in the `git fetch`
documentation refers to the section "CONFIGURED REMOTE-TRACKING
BRANCHES" in this same documentation page.
In the `git pull` documentation, let's also refer specifically to this
section instead of just linking to the `git fetch` documentation.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 94a88e2524 (Makefile: avoid running curl-config multiple times,
2020-03-26) put the call to $(CURL_CONFIG) into a "simple" variable
which is expanded immediately, rather than expanding it each time it's
needed. However, that also means that we expand it whenever the Makefile
is parsed, whether we need it or not.
This is wasteful, but also breaks the ci/test-documentation.sh job, as
it does not have curl at all and complains about the extra messages to
stderr. An easy way to see it is just:
$ make CURL_CONFIG=does-not-work check-builtins
make: does-not-work: Command not found
make: does-not-work: Command not found
GIT_VERSION = 2.26.0.108.gb3f3f45f29
make: does-not-work: Command not found
make: does-not-work: Command not found
./check-builtins.sh
We can get the best of both worlds if we're willing to accept a little
Makefile hackery. Courtesy of the article at:
http://make.mad-scientist.net/deferred-simple-variable-expansion/
this patch uses a lazily-evaluated recursive variable which replaces its
contents with an immediately assigned simple one on first use.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In read_populate_opts(), we call read_oneliner() _after_ calling
strbuf_release(). This means that `buf` is leaked at the end of the
function.
Always clean up the strbuf by going to `done_rebase_i` whether or not
we return an error.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge with --gpg-sign option, and clarify that --no-gpg-sign also
override earlier --gpg-sign.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
{cherry-pick,revert} --edit hasn't honoured --no-gpg-sign yet.
Pass this option down to git-commit to honour it.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are 3 callers to promisor_remote_get_direct() that first check if
the number of objects to be fetched is equal to 0. Fold that check into
promisor_remote_get_direct(), and in doing so, be explicit as to what
promisor_remote_get_direct() does if oid_nr is 0 (it returns 0, success,
immediately).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have an environment variable `jobs=16` defined in our CI system, and
this environment makes our build job failed with the following message:
error: pathspec '16' did not match any file(s) known to git
The pathspec '16' for Git command is from the environment variable
"jobs".
This is because "git-submodule" command is implemented in shell script,
and environment variables may change its behavior. Set values for
uninitialized variables, such as "jobs" and "recommend_shallow" will
fix this issue.
Helped-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Li Xuejiang <xuejiang@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-update-ref(1) command can only handle queueing transactions
right now via its "--stdin" parameter, but there is no way for users to
handle the transaction itself in a more explicit way. E.g. in a
replicated scenario, one may imagine a coordinator that spawns
git-update-ref(1) for multiple repositories and only if all agree that
an update is possible will the coordinator send a commit. Such a
transactional session could look like
> start
< start: ok
> update refs/heads/master $OLD $NEW
> prepare
< prepare: ok
# All nodes have returned "ok"
> commit
< commit: ok
or
> start
< start: ok
> create refs/heads/master $OLD $NEW
> prepare
< fatal: cannot lock ref 'refs/heads/master': reference already exists
# On all other nodes:
> abort
< abort: ok
In order to allow for such transactional sessions, this commit
introduces four new commands for git-update-ref(1), which matches those
we have internally already with the exception of "start":
- start: start a new transaction
- prepare: prepare the transaction, that is try to lock all
references and verify their current value matches the
expected one
- commit: explicitly commit a session, that is update references to
match their new expected state
- abort: abort a session and roll back all changes
By design, git-update-ref(1) will commit as soon as standard input is
being closed. While fine in a non-transactional world, it is definitely
unexpected in a transactional world. Because of this, as soon as any of
the new transactional commands is used, the default will change to
aborting without an explicit "commit". To avoid a race between queueing
updates and the first "prepare" that starts a transaction, the "start"
command has been added to start an explicit transaction.
Add some tests to exercise this new functionality.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-update-ref(1) supports a `--stdin` mode that allows it to read
all reference updates from standard input. This is mainly used to allow
for atomic reference updates that are all or nothing, so that either all
references will get updated or none.
Currently, git-update-ref(1) reads all commands as a single block of up
to 1000 characters and only starts processing after stdin gets closed.
This is less flexible than one might wish for, as it doesn't really
allow for longer-lived transactions and doesn't allow any verification
without committing everything. E.g. one may imagine the following
exchange:
> start
< start: ok
> update refs/heads/master $NEWOID1 $OLDOID1
> update refs/heads/branch $NEWOID2 $OLDOID2
> prepare
< prepare: ok
> commit
< commit: ok
When reading all input as a whole block, the above interactive protocol
is obviously impossible to achieve. But by converting the command to
read commands linewise, we can make it more interactive than before.
Obviously, the linewise interface is only a first step in making
git-update-ref(1) work in a more transaction-oriented way. Missing is
most importantly support for transactional commands that manage the
current transaction.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the actual logic to update the transaction is handled in
`update_refs_stdin()`, the transaction itself is started and committed
in `cmd_update_ref()` itself. This makes it hard to handle transaction
abortion and commits as part of `update_refs_stdin()` itself, which is
required in order to introduce transaction handling features to `git
update-refs --stdin`.
Refactor the code to move all transaction handling into
`update_refs_stdin()` to prepare for transaction handling features.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently pass both an `strbuf` containing the current command line
as well as the `next` pointer pointing to the first argument to
commands. This is both confusing and makes code more intertwined.
Convert this to use a simple pointer as well as a pointer pointing to
the end of the input as a preparatory step to line-wise reading of
stdin.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `parse_refname` function accepts a `struct strbuf *input` argument
that isn't used at all. As we're about to convert commands to not use a
strbuf anymore but instead an end pointer, let's drop this argument now
to make the converting commit easier to review.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently manually wire up all commands known to `git-update-ref
--stdin`, making it harder than necessary to preprocess arguments after
the command is determined. To make this more extensible, let's refactor
the code to use an array of known commands instead. While this doesn't
add a lot of value now, it is a preparatory step to implement line-wise
reading of commands.
As we're going to introduce commands without trailing spaces, this
commit also moves whitespace parsing into the respective commands.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time we ran 'make --jobs=2 ...' to build Git, its
documentation, or to apply Coccinelle semantic patches. Then commit
eaa62291ff (ci: inherit --jobs via MAKEFLAGS in run-build-and-tests,
2019-01-27) came along, and started using the MAKEFLAGS environment
variable to centralize setting the number of parallel jobs in
'ci/libs.sh'. Alas, it forgot to update 'ci/run-linux32-docker.sh' to
make MAKEFLAGS available inside the Docker container running the 32
bit Linux job, and, consequently, since then that job builds Git
sequentially, and it ignores any Makefile knobs that we might set in
MAKEFLAGS (though we don't set any for the 32 bit Linux job at the
moment).
So update the 'docker run' invocation in 'ci/run-linux32-docker.sh' to
make MAKEFLAGS available inside the Docker container as well. Set
CC=gcc for the 32 bit Linux job, because that's the compiler installed
in the 32 bit Linux Docker image that we use (Travis CI nowadays sets
CC=clang by default, but clang is not installed in this image).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The use of 'touch -m' to modify a file's mtime is slightly less
portable than using our own 'test-tool chmtime'. The important
thing is that these pack-files are ordered in a special way to
ensure the multi-pack-index selects some as the "newer" pack-files
when resolving duplicate objects.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph builtin has an --expire-time option that takes a
datetime using OPT_EXPIRY_DATE(). However, the implementation inside
expire_commit_graphs() was treating a non-zero value as a number of
seconds to subtract from "now".
Update t5323-split-commit-graph.sh to demonstrate the correct value
of the --expire-time option by actually creating a crud .graph file
with mtime earlier than the expire time. Instead of using a super-
early time (1980) we use an explicit, and recent, time. Using
test-tool chmtime to create two files on either end of an exact
second, we create a test that catches this failure no matter the
current time. Using a fixed date is more portable than trying to
format a relative date string into the --expiry-date input.
I noticed this when inspecting some Scalar repos that had an excess
number of commit-graph files. In Scalar, we were using this second
interpretation by using "--expire-time=3600" to mean "delete graphs
older than one hour ago" to avoid deleting a commit-graph that a
foreground process may be trying to load.
Also I noticed that the help text was copied from the --max-commits
option. Fix that help text.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As reported on the git mailing list, since git-2.25,
git add untracked-dir/
has been tab completing to
git add untracked-dir/./
The cause for this was that with commit b9670c1f5e (dir: fix checks on
common prefix directory, 2019-12-19),
git ls-files -o --directory untracked-dir/
(or the equivalent `git -C untracked-dir ls-files -o --directory`) began
reporting
untracked-dir/
instead of listing paths underneath that directory. It may also be
worth noting that the real command in question was
git -C untracked-dir ls-files -o --directory '*'
which is equivalent to
git ls-files -o --directory 'untracked-dir/*'
which behaves the same for the purposes of this issue (the '*' can match
the empty string), but becomes relevant for the proposed fix.
At first, based on the report, I decided to try to view this as a
regression and tried to find a way to recover the old behavior without
breaking other stuff, or at least breaking as little as possible.
However, in the end, I couldn't figure out a way to do it that wouldn't
just cause lots more problems than it solved. The old behavior was a
bug:
* Although older git would avoid cleaning anything with `git clean -f
.git`, it would wipe out everything under that direcotry with `git
clean -f .git/`. Despite the difference in command used, this is
relevant because the exact same change that fixed clean changed the
behavior of ls-files.
* Older git would report different results based solely on presence or
absence of a trailing slash for $SUBDIR in the command `git ls-files
-o --directory $SUBDIR`.
* Older git violated the documented behavior of not recursing into
directories that matched the pathspec when --directory was
specified.
* And, after all, commit b9670c1f5e (dir: fix checks on common prefix
directory, 2019-12-19) didn't overlook this issue; it explicitly
stated that the behavior of the command was being changed to bring
it inline with the docs.
(Also, if it helps, despite that commit being merged during the 2.25
series, this bug was not reported during the 2.25 cycle, nor even during
most of the 2.26 cycle -- it was reported a day before 2.26 was
released. So the impact of the change is at least somewhat small.)
Instead of relying on a bug of ls-files in reporting the wrong content,
change the invocation of ls-files used by git-completion to make it grab
paths one depth deeper. Do this by changing '$DIR/*' (match $DIR/ plus
0 or more characters) into '$DIR/?*' (match $DIR/ plus 1 or more
characters). Note that the '?' character should not be added when
trying to complete a filename (e.g. 'git ls-files -o --directory
"merge.c?*"' would not correctly return "merge.c" when such a file
exists), so we have to make sure to add the '?' character only in cases
where the path specified so far is a directory.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Traditionally, the expected calling convention for the dir.c API was:
fill_directory(&dir, ..., pathspec)
foreach entry in dir->entries:
if (dir_path_match(entry, pathspec))
process_or_display(entry)
This may have made sense once upon a time, because the fill_directory() call
could use cheap checks to avoid doing full pathspec matching, and an external
caller may have wanted to do other post-processing of the results anyway.
However:
* this structure makes it easy for users of the API to get it wrong
* this structure actually makes it harder to understand
fill_directory() and the functions it uses internally. It has
tripped me up several times while trying to fix bugs and
restructure things.
* relying on post-filtering was already found to produce wrong
results; pathspec matching had to be added internally for multiple
cases in order to get the right results (see commits 404ebceda0
(dir: also check directories for matching pathspecs, 2019-09-17)
and 89a1f4aaf7 (dir: if our pathspec might match files under a
dir, recurse into it, 2019-09-17))
* it's bad for performance: fill_directory() already has to do lots
of checks and knows the subset of cases where it still needs to do
more checks. Forcing external callers to do full pathspec
matching means they must re-check _every_ path.
So, add the pathspec matching within the fill_directory() internals, and
remove it from external callers.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
treat_directory() had a call to both do_match_pathspec() and
match_pathspec(). These calls have migrated through the code somewhat
since their introduction, but we don't actually need both. Replace the
two calls with one, and while at it, move the check earlier in order to
reduce the need for callers of fill_directory() to do post-filtering of
results.
The next patch will address post-filtering more forcefully and provide
more relevant history and context.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Handling DIR_KEEP_UNTRACKED_CONTENTS within treat_directory() instead of
as a post-processing step in read_directory():
* allows us to directly access and remove the relevant entries instead
of needing to calculate which ones need to be removed
* keeps the logic for directory handling in one location (and puts it
closer the the logic for stripping out extra ignored entries, which
seems logical).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir's read_directory_recursive() naturally operates recursively in order
to walk the directory tree. Treating of directories is sometimes weird
because there are so many different permutations about how to handle
directories. Some examples:
* 'git ls-files -o --directory' only needs to know that a directory
itself is untracked; it doesn't need to recurse into it to see what
is underneath.
* 'git status' needs to recurse into an untracked directory, but only
to determine whether or not it is empty. If there are no files
underneath, the directory itself will be omitted from the output.
If it is not empty, only the directory will be listed.
* 'git status --ignored' needs to recurse into untracked directories
and report all the ignored entries and then report the directory as
untracked -- UNLESS all the entries under the directory are
ignored, in which case we don't print any of the entries under the
directory and just report the directory itself as ignored. (Note
that although this forces us to walk all untracked files underneath
the directory as well, we strip them from the output, except for
users like 'git clean' who also set DIR_KEEP_TRACKED_CONTENTS.)
* For 'git clean', we may need to recurse into a directory that
doesn't match any specified pathspecs, if it's possible that there
is an entry underneath the directory that can match one of the
pathspecs. In such a case, we need to be careful to omit the
directory itself from the list of paths (see commit 404ebceda0
("dir: also check directories for matching pathspecs", 2019-09-17))
Part of the tension noted above is that the treatment of a directory can
change based on the files within it, and based on the various settings
in dir->flags. Trying to keep this in mind while reading over the code,
it is easy to think in terms of "treat_directory() tells us what to do
with a directory, and read_directory_recursive() is the thing that
recurses". Since we need to look into a directory to know how to treat
it, though, it is quite easy to decide to (also) recurse into the
directory from treat_directory() by adding a read_directory_recursive()
call. Adding such a call is actually fine, IF we make sure that
read_directory_recursive() does not also recurse into that same
directory.
Unfortunately, commit df5bcdf83a ("dir: recurse into untracked dirs
for ignored files", 2017-05-18), added exactly such a case to the code,
meaning we'd have two calls to read_directory_recursive() for an
untracked directory. So, if we had a file named
one/two/three/four/five/somefile.txt
and nothing in one/ was tracked, then 'git status --ignored' would
call read_directory_recursive() twice on the directory 'one/', and
each of those would call read_directory_recursive() twice on the
directory 'one/two/', and so on until read_directory_recursive() was
called 2^5 times for 'one/two/three/four/five/'.
Avoid calling read_directory_recursive() twice per level by moving a
lot of the special logic into treat_directory().
Since dir.c is somewhat complex, extra cruft built up around this over
time. While trying to unravel it, I noticed several instances where the
first call to read_directory_recursive() would return e.g.
path_untracked for some directory and a later one would return e.g.
path_none, despite the fact that the directory clearly should have been
considered untracked. The code happened to work due to the side-effect
from the first invocation of adding untracked entries to dir->entries;
this allowed it to get the correct output despite the supposed override
in return value by the later call.
I am somewhat concerned that there are still bugs and maybe even
testcases with the wrong expectation. I have tried to carefully
document treat_directory() since it becomes more complex after this
change (though much of this complexity came from elsewhere that probably
deserved better comments to begin with). However, much of my work felt
more like a game of whackamole while attempting to make the code match
the existing regression tests than an attempt to create an
implementation that matched some clear design. That seems wrong to me,
but the rules of existing behavior had so many special cases that I had
a hard time coming up with some overarching rules about what correct
behavior is for all cases, forcing me to hope that the regression tests
are correct and sufficient. Such a hope seems likely to be ill-founded,
given my experience with dir.c-related testcases in the last few months:
Examples where the documentation was hard to parse or even just wrong:
* 3aca58045f (git-clean.txt: do not claim we will delete files with
-n/--dry-run, 2019-09-17)
* 09487f2cba (clean: avoid removing untracked files in a nested git
repository, 2019-09-17)
* e86bbcf987 (clean: disambiguate the definition of -d, 2019-09-17)
Examples where testcases were declared wrong and changed:
* 09487f2cba (clean: avoid removing untracked files in a nested git
repository, 2019-09-17)
* e86bbcf987 (clean: disambiguate the definition of -d, 2019-09-17)
* a2b13367fe (Revert "dir.c: make 'git-status --ignored' work within
leading directories", 2019-12-10)
Examples where testcases were clearly inadequate:
* 502c386ff9 (t7300-clean: demonstrate deleting nested repo with an
ignored file breakage, 2019-08-25)
* 7541cc5302 (t7300: add testcases showing failure to clean specified
pathspecs, 2019-09-17)
* a5e916c745 (dir: fix off-by-one error in match_pathspec_item,
2019-09-17)
* 404ebceda0 (dir: also check directories for matching pathspecs,
2019-09-17)
* 09487f2cba (clean: avoid removing untracked files in a nested git
repository, 2019-09-17)
* e86bbcf987 (clean: disambiguate the definition of -d, 2019-09-17)
* 452efd11fb (t3011: demonstrate directory traversal failures,
2019-12-10)
* b9670c1f5e (dir: fix checks on common prefix directory, 2019-12-19)
Examples where "correct behavior" was unclear to everyone:
https://lore.kernel.org/git/20190905154735.29784-1-newren@gmail.com/
Other commits of note:
* 902b90cf42 (clean: fix theoretical path corruption, 2019-09-17)
However, on the positive side, it does make the code much faster. For
the following simple shell loop in an empty repository:
for depth in $(seq 10 25)
do
dirs=$(for i in $(seq 1 $depth) ; do printf 'dir/' ; done)
rm -rf dir
mkdir -p $dirs
>$dirs/untracked-file
/usr/bin/time --format="$depth: %e" git status --ignored >/dev/null
done
I saw the following timings, in seconds (note that the numbers are a
little noisy from run-to-run, but the trend is very clear with every
run):
10: 0.03
11: 0.05
12: 0.08
13: 0.19
14: 0.29
15: 0.50
16: 1.05
17: 2.11
18: 4.11
19: 8.60
20: 17.55
21: 33.87
22: 68.71
23: 140.05
24: 274.45
25: 551.15
For the above run, using strace I can look for the number of untracked
directories opened and can verify that it matches the expected
2^($depth+1)-2 (the sum of 2^1 + 2^2 + 2^3 + ... + 2^$depth).
After this fix, with strace I can verify that the number of untracked
directories that are opened drops to just $depth, and the timings all
drop to 0.00. In fact, it isn't until a depth of 190 nested directories
that it sometimes starts reporting a time of 0.01 seconds and doesn't
consistently report 0.01 seconds until there are 240 nested directories.
The previous code would have taken
17.55 * 2^220 / (60*60*24*365) = 9.4 * 10^59 YEARS
to have completed the 240 nested directories case. It's not often
that you get to speed something up by a factor of 3*10^69.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The logic in treat_directory() is handled by a multi-case
switch statement, but this switch is very asymmetrical, as
the first two cases are simple but the third is more
complicated than the rest of the method. In fact, the third
case includes a "break" statement that leads to the block
of code outside the switch statement. That is the only way
to reach that block, as the switch handles all possible
values from directory_exists_in_index();
Extract the switch statement into a series of "if" statements.
This simplifies the trivial cases, while clarifying how to
reach the "show_other_directories" case. This is particularly
important as the "show_other_directories" case will expand
in a later change.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Despite having contributed several fixes in this area, I have for months
(years?) assumed that the "exclude" variable was a directive; this
caused me to think of it as a different mode we operate in and left me
confused as I tried to build up a mental model around why we'd need such
a directive. I mostly tried to ignore it while focusing on the pieces I
was trying to understand.
Then I finally traced this variable all back to a call to is_excluded(),
meaning it was actually functioning as an adjective. In particular, it
was a checked property ("Does this path match a rule in .gitignore?"),
rather than a mode passed in from the caller. Change the variable name
to match the part of speech used by the function called to define it,
which will hopefully make these bits of code slightly clearer to the
next reader.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 16e2cfa909 ("read_directory(): further split treat_path()",
2010-01-08) split treat_one_path() out of treat_path(), because
treat_leading_path() would not have access to a dirent but wanted to
re-use as much of treat_path() as possible. Not re-using all of
treat_path() caused other bugs, as noted in commit b9670c1f5e ("dir:
fix checks on common prefix directory", 2019-12-19). Finally, in commit
ad6f2157f9 ("dir: restructure in a way to avoid passing around a
struct dirent", 2020-01-16), dirents were removed from treat_path() and
other functions entirely.
Since the only reason for splitting these functions was the lack of a
dirent -- which no longer applies to either function -- and since the
split caused problems in the past resulting in us not using
treat_one_path() separately anymore, just undo the split.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds seven new ls-files tests. While currently all seven test
pass, my earlier rounds of restructuring dir.c to replace an exponential
algorithm with a linear one passed all the tests in the testsuite but
failed six of these seven new tests. Add these tests to increase our
case coverage.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It turns out the t7063 has some testcases that even without using the
untracked cache cover situations that nothing else in the testsuite
handles. Checking the results of
git status --porcelain
both with and without the untracked cache, and comparing both against
our expected results helped uncover a critical bug in some dir.c
restructuring.
Unfortunately, it's not easy to run status and tell it to ignore the
untracked cache; the only knob we have is core.untrackedCache=false,
which is used to instruct git to *delete* the untracked cache (which
might also ignore the untracked cache when it operates, but that isn't
specified in the docs).
Create a simple helper that will create a clone of the index that is
missing the untracked cache bits, and use it to compare that the results
with the untracked cache match the results we get without the untracked
cache.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning with --single-branch, we implement git-fetch's usual
tag-following behavior, grabbing any tag objects that point to objects
we have locally.
When we're a partial clone, though, our has_object_file() check will
actually lazy-fetch each tag. That not only defeats the purpose of
--single-branch, but it does it incredibly slowly, potentially kicking
off a new fetch for each tag. This is even worse for a shallow clone,
which implies --single-branch, because even tags which are supersets of
each other will be fetched individually.
We can fix this by passing OBJECT_INFO_SKIP_FETCH_OBJECT to the call,
which is what git-fetch does in this case.
Likewise, let's include OBJECT_INFO_QUICK, as that's what git-fetch
does. The rationale is discussed in 5827a03545 (fetch: use "quick"
has_sha1_file for tag following, 2016-10-13), but here the tradeoff
would apply even more so because clone is very unlikely to be racing
with another process repacking our newly-created repository.
This may provide a very small speedup even in the non-partial case case,
as we'd avoid calling reprepare_packed_git() for each tag (though in
practice, we'd only have a single packfile, so that reprepare should be
quite cheap).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the config file we use when we build the user manual with
AsciiDoc. The comment at the top of this chunk that we're removing says
the following:
"unbreak" docbook-xsl v1.68 for manpages (sic!). v1.69 works with or
without this.
This comes from d19fbc3c17 ("Documentation: add git user's manual",
2007-01-07), where it looks like this conf file in general and this
snippet in particular was copy-pasted from asciidoc.conf.
This chunk is very similar to something we just got rid of for the
manpages, and because this appears to be aimed at v1.68 -- which we no
longer support for the manpages as of a few commits ago --, it's
tempting to get rid of this. That reveals an interesting aspect of
"works with or without this": it turns out it actually works /better/
without!
Dropping this makes us render code snippets and shell listings using
<screen> rather than <literallayout>, just like Asciidoctor does. In
user-manual.pdf, this puts the contents into dimmed-background,
easy-to-distinguish-from-the-surrounding-text boxes, as opposed to
white-background (transparent) boxes.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fast forwarding, `git --merge' should act the same whether
`rebase.abbreviateCommands' is set or not, but so far it was not the
case. This duplicates the tests ensuring that `--merge' works when fast
forwarding to check if it also works with abbreviated commands.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the sequencer is requested to abbreviate commands, it will replace
those that do not have a short form (eg. `noop') by a comment mark.
`noop' serves no purpose, except when fast-forwarding (ie. by running
`git rebase'). Removing it will break this command when
`rebase.abbreviateCommands' is set to true.
Teach todo_list_to_strbuf() to check if a command has an actual
short form, and to ignore it if not.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the given example, `commit` cannot be `NULL` (because this is the
loop condition: if it was `NULL`, the loop body would not be entered at
all). It took this developer a moment or two to see that this is
therefore dead code.
Let's remove it, to avoid puzzling future readers.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ths has been oid_array for some time, though the source only recently
moved from sha1-array.[ch] to oid-array.[ch]. In either case, we should
say "oid-array" here.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A comment refers to the "sha1s in the given sha1 array". But this became
an oid_array along with everywhere else in 910650d2f8 (Rename sha1_array
to oid_array, 2017-03-31). Plus there's an extra line of leftover
editing cruft we can drop.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our join_sha1_array_hex() function long ago switched to using an
oid_array; let's change the name to match.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This matches the actual data structure name, as well as the source file
that contains the code we're testing. The test scripts need updating to
use the new name, as well.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We renamed the actual data structure in 910650d2f8 (Rename sha1_array to
oid_array, 2017-03-31), but the file is still called sha1-array. Besides
being slightly confusing, it makes it more annoying to grep for leftover
occurrences of "sha1" in various files, because the header is included
in so many places.
Let's complete the transition by renaming the source and header files
(and fixing up a few comment references).
I kept the "-" in the name, as that seems to be our style; cf.
fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10).
We also have oidmap.h and oidset.h without any punctuation, but those
are "struct oidmap" and "struct oidset" in the code. We _could_ make
this "oidarray" to match, but somehow it looks uglier to me because of
the length of "array" (plus it would be a very invasive patch for little
gain).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit started using size_t for our allocations. There are
some iterations that use int or unsigned, though. These aren't dangerous
with respect to memory, but they could produce incorrect results.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The oid_array object uses an "int" to store the number of items and the
allocated size. It's rather unlikely for somebody to have more than 2^31
objects in a repository (the sha1's alone would be 40GB!), but if they
do, we'd overflow our alloc variable.
You can reproduce this case with something like:
git init repo
cd repo
# make a pack with 2^24 objects
perl -e '
my $nr = 2**24;
for (my $i = 0; $i < $nr; $i++) {
print "blob\n";
print "data 4\n";
print pack("N", $i);
}
' | git fast-import
# now make 256 copies of it; most of these objects will be duplicates,
# but oid_array doesn't de-dup until all values are read and it can
# sort the result.
cd .git/objects/pack/
pack=$(echo *.pack)
idx=$(echo *.idx)
for i in $(seq 0 255); do
# no need to waste disk space
ln "$pack" "pack-extra-$i.pack"
ln "$idx" "pack-extra-$i.idx"
done
# and now force an oid_array to store all of it
git cat-file --batch-all-objects --batch-check
which results in:
fatal: size_t overflow: 32 * 18446744071562067968
So the good news is that st_mult() sees the problem (the large number is
because our int wraps negative, and then that gets cast to a size_t),
doing the job it was meant to: bailing in crazy situations rather than
causing an undersized buffer.
But we should avoid hitting this case at all, and instead limit
ourselves based on what malloc() is willing to give us. We can easily do
that by switching to size_t.
The cat-file process above made it to ~120GB virtual set size before the
integer overflow (our internal hash storage is 32-bytes now in
preparation for sha256, so we'd expect ~128GB total needed, plus
potentially more to copy from one realloc'd block to another)). After
this patch (and about 130GB of RAM+swap), it does eventually read in the
whole set. No test for obvious reasons.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git is an enormously flexible and powerful piece of software. However,
it can be intimidating for many users and there are a set of common
questions that users often ask. While we already have some new user
documentation, it's worth adding a FAQ to address common questions that
users often have. Even though some of this is addressed elsewhere in
the documentation, experience has shown that it is difficult for users
to find, so a centralized location is helpful.
Add such a FAQ and fill it with some common questions and answers.
While there are few entries now, we can expand it in the future to cover
more things as we find new questions that users have. Let's also add
section markers so that people answering questions can directly link
users to the proper answer.
The FAQ also addresses common configuration questions that apply not
only to Git as an independent piece of software but also the ecosystem
of CI tools and hosting providers that people use, since these are the
source of common questions. An attempt has been made to avoid
mentioning any particular provider or tool, but to nevertheless cover
common configurations that apply to a wide variety of such tools.
Note that the long lines for certain questions are required, since
Asciidoctor does not permit broken lines there.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the strbuf interface already provides functions to read a line
into it that completely replaces its current contents, we do not have an
interface that allows for appending lines without discarding current
contents.
Add a new function `strbuf_appendwholeline` that reads a line including
its terminating character into a strbuf non-destructively. This is a
preparatory step for git-update-ref(1) reading standard input line-wise
instead of as a block.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description for the "verify" command is lacking a single word "is",
which this commit corrects.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cleaning up a transaction that has no updates queued, then the
transaction's backend data will not have been allocated. We correctly
handle this for the packed backend, where the cleanup function checks
whether the backend data has been allocated at all -- if not, then there
is nothing to clean up. For the files backend we do not check this and
as a result will hit a segfault due to dereferencing a `NULL` pointer
when cleaning up such a transaction.
Fix the issue by checking whether `backend_data` is set in the files
backend, too.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git commit-graph write --changed-paths', we sort the
commits by pack-order to save time when computing the changed-paths
bloom filters. This does not help when finding the commits via the
'--reachable' flag.
If not using pack-order, then sort by generation number before
examining the diff. Commits with similar generation are more likely
to have many trees in common, making the diff faster.
On the Linux kernel repository, this change reduced the computation
time for 'git commit-graph write --reachable --changed-paths' from
3m00s to 1m37s.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Looking at the diff of commit objects in pack order is much faster than
in sha1 order, as it gives locality to the access of tree deltas
(whereas sha1 order is effectively random). Unfortunately the
commit-graph code sorts the commits (several times, sometimes as an oid
and sometimes a pointer-to-commit), and we ultimately traverse in sha1
order.
Instead, let's remember the position at which we see each commit, and
traverse in that order when looking at bloom filters. This drops my time
for "git commit-graph write --changed-paths" in linux.git from ~4
minutes to ~1.5 minutes.
Probably the "--reachable" code path would want something similar.
Or alternatively, we could use a different data structure (either a
hash, or maybe even just a bit in "struct commit") to keep track of
which oids we've seen, etc instead of sorting. And then we could keep
the original order.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new COMMIT_GRAPH_WRITE_CHANGED_PATHS flag that makes Git compute
Bloom filters for the paths that changed between a commit and it's
first parent, for each commit in the commit-graph. This computation
is done on a commit-by-commit basis.
We will write these Bloom filters to the commit-graph file, to store
this data on disk, in the next change in this series.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When computing the changed-paths bloom filters for the commit-graph,
we limit the size of the filter by restricting the number of paths
in the diff. Instead of computing a large diff and then ignoring the
result, it is better to halt the diff computation early.
Create a new "max_changes" option in struct diff_options. If non-zero,
then halt the diff computation after discovering strictly more changed
paths. This includes paths corresponding to trees that change.
Use this max_changes option in the bloom filter calculations. This
reduces the time taken to compute the filters for the Linux kernel
repo from 2m50s to 2m35s. On a large internal repository with ~500
commits that perform tree-wide changes, the time reduced from
6m15s to 3m48s.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the core implementation for computing Bloom filters for
the paths changed between a commit and it's first parent.
We fill the Bloom filters as (const char *data, int len) pairs
as `struct bloom_filters" within a commit slab.
Filters for commits with no changes and more than 512 changes,
is represented with a filter of length zero. There is no gain
in distinguishing between a computed filter of length zero for
a commit with no changes, and an uncomputed filter for new commits
or for commits with more than 512 changes. The effect on
`git log -- path` is the same in both cases. We will fall back to
the normal diffing algorithm when we can't benefit from the
existence of Bloom filters.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce the constructs for Bloom filters, Bloom filter keys
and Bloom filter settings.
For details on what Bloom filters are and how they work, refer
to Dr. Derrick Stolee's blog post [1]. It provides a concise
explanation of the adoption of Bloom filters as described in
[2] and [3].
Implementation specifics:
1. We currently use 7 and 10 for the number of hashes and the
size of each entry respectively. They served as great starting
values, the mathematical details behind this choice are
described in [1] and [4]. The implementation, while not
completely open to it at the moment, is flexible enough to allow
for tweaking these settings in the future.
Note: The performance gains we have observed with these values
are significant enough that we did not need to tweak these
settings. The performance numbers are included in the cover letter
of this series and in the commit message of the subsequent commit
where we use Bloom filters to speed up `git log -- path`.
2. As described in [1] and [3], we do not need 7 independent hashing
functions. We use the Murmur3 hashing scheme, seed it twice and
then combine those to procure an arbitrary number of hash values.
3. The filters will be sized according to the number of changes in
each commit, in multiples of 8 bit words.
[1] Derrick Stolee
"Supercharging the Git Commit Graph IV: Bloom Filters"
https://devblogs.microsoft.com/devops/super-charging-the-git-commit-graph-iv-Bloom-filters/
[2] Flavio Bonomi, Michael Mitzenmacher, Rina Panigrahy, Sushil Singh, George Varghese
"An Improved Construction for Counting Bloom Filters"
http://theory.stanford.edu/~rinap/papers/esa2006b.pdfhttps://doi.org/10.1007/11841036_61
[3] Peter C. Dillinger and Panagiotis Manolios
"Bloom Filters in Probabilistic Verification"
http://www.ccs.neu.edu/home/pete/pub/Bloom-filters-verification.pdfhttps://doi.org/10.1007/978-3-540-30494-4_26
[4] Thomas Mueller Graf, Daniel Lemire
"Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters"
https://arxiv.org/abs/1912.08258
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a minor cleanup to make it easier to change
the number of chunks being written to the commit
graph.
Reviewed-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With 50033772d5 ("connected: verify promisor-ness of partial clone",
2020-01-30), the fast path (checking promisor packs) in
check_connected() now passes a subset of the slow path (rev-list) - if
all objects to be checked are found in promisor packs, both the fast
path and the slow path will pass; otherwise, the fast path will
definitely not pass. This means that we can always attempt the fast path
whenever we need to do the slow path.
The fast path is currently guarded by a flag; therefore, remove that
flag. Also, make the fast path fallback to the slow path - if the fast
path fails, the failing OID and all remaining OIDs will be passed to
rev-list.
The main user-visible benefit is the performance of fetch from a partial
clone - specifically, the speedup of the connectivity check done before
the fetch. In particular, a no-op fetch into a partial clone on my
computer was sped up from 7 seconds to 0.01 seconds. This is a
complement to the work in 2df1aa239c ("fetch: forgo full
connectivity check if --filter", 2020-01-30), which is the child of the
aforementioned 50033772d5. In that commit, the connectivity check
*after* the fetch was sped up.
The addition of the fast path might cause performance reductions in
these cases:
- If a partial clone or a fetch into a partial clone fails, Git will
fruitlessly run rev-list (it is expected that everything fetched
would go into promisor packs, so if that didn't happen, it is most
likely that rev-list will fail too).
- Any connectivity checks done by receive-pack, in the (in my opinion,
unlikely) event that a partial clone serves receive-pack.
I think that these cases are rare enough, and the performance reduction
in this case minor enough (additional object DB access), that the
benefit of avoiding a flag outweighs these.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'pack.useSparse' configuration variable now defaults to 'true',
enabling an optimization that has been experimental since Git 2.21.
* ds/default-pack-use-sparse-to-true:
pack-objects: flip the use of GIT_TEST_PACK_SPARSE
config: set pack.useSparse=true by default
Several of the previous commits have been bumping the minimum supported
version of docbook-xsl and dropping various workarounds. Most recently,
we made the minimum be 1.73.0.
In INSTALL, we claim that with 1.73, one needs a certain patch in
contrib/patches/. There is no such patch. It was added in 2ec39edad9
("INSTALL: add warning on docbook-xsl 1.72 and 1.73", 2007-08-03) and
dropped in 9721ac9010 ("contrib: remove continuous/ and patches/",
2013-06-03).
Rather than resurrecting version 1.73 and the patch and testing them,
just raise our minimum supported docbook-xsl version to 1.74.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After an earlier commit, we only include manpage-base.xsl from a single
file, manpage-normal.xsl. Fold the former into the latter.
We only ever needed the "base, normal and non-normal" construct to
support a single non-normal case, namely to work around issues with
docbook-xsl 1.72 handling backslashes and dots. If we ever need
something like this again, we can re-introduce manpage-base.xsl and
friends. Whatever issue we'd be trying to work around, it probably
wouldn't involve dots and backslashes like this, anyway.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to assign git.docbook.backslash one of two different values --
one "normal" and one for working around a problem with docbook-xsl 1.72.
After the previous commit, we don't support that version anymore and
always use the "normal" value, a literal backslash.
Just explicitly use a backslash instead of using git.docbook.backslash.
The next commit will drop the definition of git.docbook.backslash
entirely.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Drop the DOCBOOK_XSL_172 config knob, which was needed with docbook-xsl
1.72 (but neither 1.71 nor 1.73). Version 1.73.0 is more than twelve
years old.
Together with the last few commits, we are now at a point where we don't
have any Makefile knobs to cater to old/broken versions of docbook-xsl.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
docbook-xsl 1.72.0 is thirteen years old. Drop the ASCIIDOC_ROFF knob
which was needed to support 1.68.1 - 1.71.1. The next commit will
increase the required/assumed version further.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Drop the DOCBOOK_SUPPRESS_SP mechanism, which needs to be used with
docbook-xsl versions 1.69.1 through 1.71.0.
We probably broke this for Asciidoctor builds in f6461b82b9
("Documentation: fix build with Asciidoctor 2", 2019-09-15). That is, we
should/could fix this similar to 55aca515eb ("manpage-bold-literal.xsl:
match for namespaced "d:literal" in template", 2019-10-31). But rather
than digging out such an old version of docbook-xsl to test that, let's
just use this as an excuse for dropping this decade-old workaround.
DOCBOOK_SUPPRESS_SP was not needed with docbook-xsl 1.69.0 and older.
Maybe such old versions still work fine on our docs, or maybe not. Let's
just refer to everything before 1.71.1 as "not supported". The next
commit will increase the required/assumed version further.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code path in packetize() for reading stdin needs to handle NUL
bytes, so we can't rely on shell variables. However, the current code
takes a whopping 4 processes and uses a temporary file. We can do this
much more simply and efficiently by using a single perl invocation (and
we already rely on perl in the matching depacketize() function).
We'll keep the non-stdin code path as it is, since that uses zero extra
processes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The construct has been in POSIX for the past 10+ years, and we have
used in t9xxx (subversion) series of the tests, so we know it is at
portable across systems that people have run those tests, which is
almost everything we'd care about.
Let's loosen the rule; luckily, the check-non-portable-shell script
does not have any rule to find its use, so the only change needed is
a removal of one paragraph from the documentation.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Over time, we added the support to our test framework to make it
easy to leave a test early with failure, but it was not clearly
documented in t/README to help developers writing new tests.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fetch options --deepen, --negotiation-tip, --server-option,
--shallow-exclude, and --shallow-since are documented for git pull as
well, but are not actually accepted by that command. Pass them on to
make the code match its documentation.
Reported-by: 天几 <muzimuzhi@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git pull' implicitly passes --update-head-ok to 'git fetch', but
doesn't itself accept that option from users. That makes sense, as it
wouldn't work without the possibility to update HEAD. Remove the option
from the command's documentation to match its actual behavior.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codebase uses tabs for indentation. Convert an erroneous space
indent into a tab indent.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When verifying a midx index with 0 objects, the
m->num_objects - 1
underflows and wraps around to 4294967295.
Fix this both by checking that the midx contains at least one oid,
and also that we don't write any midx when there is no packfiles.
Update the tests to check that `git multi-pack-index write` does
not write an midx when there is no objects, and another to check
that `git multi-pack-index verify` warns when it verifies an midx with no
objects. For this last test, use t5319/no-objects.midx which was
generated by an older version of git.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When opt_rebase is true, we still first check if we can fast-forward.
If the branch is fast-forwardable, then we can avoid the rebase and just
use merge to do the fast-forward logic. However, when commit a6d7eb2c7a
("pull: optionally rebase submodules (remote submodule changes only)",
2017-06-23) added the ability to rebase submodules it accidentally
caused us to run BOTH a merge and a rebase. Add a flag to avoid doing
both.
This was found when a user had both pull.rebase and rebase.autosquash
set to true. In such a case, the running of both merge and rebase would
cause ORIG_HEAD to be updated twice (and match HEAD at the end instead
of the commit before the rebase started), against expectation.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We add the result of "curl-config --libs" when linking curl programs,
but we never bother calling "curl-config --cflags". Presumably nobody
noticed because:
- a system libcurl installed into /usr/include/curl wouldn't need any
flags ("/usr/include" is already in the search path, and the
#include lines all look <curl/curl.h>, etc).
- using CURLDIR sets up both the includes and the library path
However, if you prefer CURL_CONFIG to CURLDIR, something simple like:
make CURL_CONFIG=/path/to/curl-config
doesn't work. We'd link against the libcurl specified by that program,
but not find its header files when compiling.
Let's invoke "curl-config --cflags" similar to the way we do for
"--libs". Note that we'll feed the result into BASIC_CFLAGS. The rest of
the Makefile doesn't distinguish which files need curl support during
compilation and which do not. That should be OK, though. At most this
should be adding a "-I" directive, and this is how CURLDIR already
behaves. And since we follow the immediate-variable pattern from
CURL_LDFLAGS, we won't accidentally invoke curl-config once per
compilation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user hasn't set the CURL_LDFLAGS Makefile variable, we invoke
curl-config like this:
CURL_LIBCURL += $(shell $(CURL_CONFIG) --libs)
Because the shell function is run when the value is expanded, we invoke
curl-config each time we need to link something (which generally ends up
being four times for a full build).
Instead, let's use an immediate Makefile variable, which only needs
expanding once. We can't combine that with the existing "+=", but since
we only do this when CURL_LDFLAGS is undefined, we can just set that
variable.
That also allows us to simplify our conditional a bit, since both sides
will then put the result into CURL_LIBCURL. While we're touching it,
let's fix the indentation to match the nearby code (we're inside an
outer conditional, so everything else is indented one level).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 645c432d61 (pack-objects: use reachability bitmap index when
generating non-stdout pack, 2016-09-10) added two timing tests for
packing to an on-disk file, both with and without bitmaps. However, the
non-bitmap one isn't interesting to have as part of p5310's regression
suite. It _could_ be used as a baseline to show off the improvement in
the bitmap case, but:
- the point of the t/perf suite is to find performance regressions,
and it won't help with that. We don't compare the numbers between
two tests (which the perf suite has no idea are even related), and
any change in its numbers would have nothing to do with bitmaps.
- it did show off the improvement in the commit message of 645c432d61,
but it wasn't even necessary there. The bitmap case already shows an
improvement (because before the patch, it behaved the same as the
non-bitmap case), and the perf suite is even able to show the
difference between the before and after measurements.
On top of that, it's one of the most expensive tests in the suite,
clocking in around 60s for linux.git on my machine (as compared to 16s
for the bitmapped version). And by default when using "./run", we'd run
it three times!
So let's just drop it. It's not useful and is adding minutes to perf
runs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When processing the arguments list for a v2 ls-refs or fetch command, we
loop like this:
while (packet_reader_read(request) != PACKET_READ_FLUSH) {
const char *arg = request->line;
...handle arg...
}
to read and handle packets until we see a flush. The hidden assumption
here is that anything except PACKET_READ_FLUSH will give us valid packet
data to read. But that's not true; PACKET_READ_DELIM or PACKET_READ_EOF
will leave packet->line as NULL, and we'll segfault trying to look at
it.
Instead, we should follow the more careful model demonstrated on the
client side (e.g., in process_capabilities_v2): keep looping as long
as we get normal packets, and then make sure that we broke out of the
loop due to a real flush. That fixes the segfault and correctly
diagnoses any unexpected input from the client.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The packetize() function takes its input on stdin, and requires 4
separate sub-processes to format a simple string. We can do much better
by getting the length via the shell's "${#packet}" construct. The one
caveat is that the shell can't put a NUL into a variable, so we'll have
to continue to provide the stdin form for a few calls.
There are a few other cleanups here in the touched code:
- the stdin form of packetize() had an extra stray "%s" when printing
the packet
- the converted calls in t5562 can be made simpler by redirecting
output as a block, rather than repeated appending
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If commands like merge or rebase materialize files as part of their work,
or a previous sparse-checkout command failed to update individual files
due to dirty changes, users may want a command to simply 'reapply' the
sparsity rules. Provide one.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Setting and clearing of the SKIP_WORKTREE bit is not only done when
users run 'sparse-checkout'; other commands such as 'checkout' also run
through unpack_trees() which has logic for handling this special bit.
As such, we need to consider how they handle special cases. A couple
comparison points should help explain the rationale for changing how
unpack_trees() handles these bits:
Ignoring sparse checkouts for a moment, if you are switching
branches and have dirty changes, it is only considered an error that
will prevent the branch switching from being successful if the dirty
file happens to be one of the paths with different contents.
SKIP_WORKTREE has always been considered advisory; for example, if
rebase or merge need or even want to materialize a path as part of
their work, they have always been allowed to do so regardless of the
SKIP_WORKTREE setting. This has been used for unmerged paths, but
it was often used for paths it wasn't needed just because it made
the code simpler. It was a best-effort consideration, and when it
materialized paths contrary to the SKIP_WORKTREE setting, it was
never required to even print a warning message.
In the past if you trying to run e.g. 'git checkout' and:
1) you had a path that was materialized and had some dirty changes
2) the path was listed in $GITDIR/info/sparse-checkout
3) this path did not different between the current and target branches
then despite the comparison points above, the inability to set
SKIP_WORKTREE was treated as a *hard* error that would abort the
checkout operation. This is completely inconsistent with how
SKIP_WORKTREE is handled elsewhere, and rather annoying for users as
leaving the paths materialized in the working copy (with a simple
warning) should present no problem at all.
Downgrade any errors from inability to toggle the SKIP_WORKTREE bit to a
warning and allow the operations to continue.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When sparse-checkout runs to update the list of sparsity patterns, it
gives warnings if it can't remove paths from the working tree because
those files have dirty changes. Add a similar warning for unmerged
paths as well.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The messages for problems with sparse paths are phrased as errors that
cause the operation to abort, even though we are not making the
operation abort. Reword the messages to make sense in their new
context.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
display_error_msgs() is never called to show messages of both ERROR_*
and WARNING_* types at the same time; it is instead called multiple
times, separately for each type. Since we want to display these types
differently, make two slightly different versions of this function.
A subsequent commit will further modify unpack_trees() and how it calls
the new display_warning_msgs().
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to treat issues with setting the SKIP_WORKTREE bit as a warning
rather than an error; rename the enum values to reflect this intent as
a simple step towards that goal.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A minor change, but we want to convert the sparse messages to warnings
and this allows us to group warnings and errors.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
setup_unpack_trees_porcelain() provides much improved error/warning
messages; instead of a message that assumes that there is only one path
with a given problem despite being used by code that intentionally is
grouping and showing errors together, it uses a message designed to be
used with groups of paths. For example, this transforms
error: Entry ' folder1/a
folder2/a
' not uptodate. Cannot update sparse checkout.
into
error: Cannot update sparse checkout: the following entries are not up to date:
folder1/a
folder2/a
In the past the suboptimal messages were never actually triggered
because we would error out if the working directory wasn't clean before
we even called unpack_trees(). The previous commit changed that,
though, so let's use the better error messages.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the equivalent of 'git read-tree -mu HEAD' in the sparse-checkout
codepaths for setting the SKIP_WORKTREE bits and instead use the new
update_sparsity() function.
Note that when an issue is hit, the error message splits 'error' and
'Cannot update sparse checkout' on separate lines. For now, we use two
greps to find both pieces of the error message but subsequent commits
will clean up the messages reported to the user.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, the only way to update the SKIP_WORKTREE bits for various
paths was invoking `git read-tree -mu HEAD` or calling the same code
that this codepath invoked. This however had a number of problems if
the index or working directory were not clean. First, let's consider
the case:
Flipping SKIP_WORKTREE -> !SKIP_WORKTREE (materializing files)
If the working tree was clean this was fine, but if there were files or
directories or symlinks or whatever already present at the given path
then the operation would abort with an error. Let's label this case
for later discussion:
A) There is an untracked path in the way
Now let's consider the opposite case:
Flipping !SKIP_WORKTREE -> SKIP_WORKTREE (removing files)
If the index and working tree was clean this was fine, but if there were
any unclean paths we would run into problems. There are three different
cases to consider:
B) The path is unmerged
C) The path has unstaged changes
D) The path has staged changes (differs from HEAD)
If any path fell into case B or C, then the whole operation would be
aborted with an error. With sparse-checkout, the whole operation would
be aborted for case D as well, but for its predecessor of using `git
read-tree -mu HEAD` directly, any paths that fell into case D would be
removed from the working copy and the index entry for that path would be
reset to match HEAD -- which looks and feels like data loss to users
(only a few are even aware to ask whether it can be recovered, and even
then it requires walking through loose objects trying to match up the
right ones).
Refusing to remove files that have unsaved user changes is good, but
refusing to work on any other paths is very problematic for users. If
the user is in the middle of a rebase or has made modifications to files
that bring in more dependencies, then for their build to work they need
to update the sparse paths. This logic has been preventing them from
doing so. Sometimes in response, the user will stage the files and
re-try, to no avail with sparse-checkout or to the horror of losing
their changes if they are using its predecessor of `git read-tree -mu
HEAD`.
Add a new update_sparsity() function which will not error out in any of
these cases but behaves as follows for the special cases:
A) Leave the file in the working copy alone, clear the SKIP_WORKTREE
bit, and print a warning (thus leaving the path in a state where
status will report the file as modified, which seems logical).
B) Do NOT mark this path as SKIP_WORKTREE, and leave it as unmerged.
C) Do NOT mark this path as SKIP_WORKTREE and print a warning about
the dirty path.
D) Mark the path as SKIP_WORKTREE, but do not revert the version
stored in the index to match HEAD; leave the contents alone.
I tried a different behavior for A (leave the SKIP_WORKTREE bit set),
but found it very surprising and counter-intuitive (e.g. the user sees
it is present along with all the other files in that directory, tries to
stage it, but git add ignores it since the SKIP_WORKTREE bit is set). A
& C seem like optimal behavior to me. B may be as well, though I wonder
if printing a warning would be an improvement. Some might be slightly
surprised by D at first, but given that it does the right thing with
`git commit` and even `git commit -a` (`git add` ignores entries that
are marked SKIP_WORKTREE and thus doesn't delete them, and `commit -a`
is similar), it seems logical to me.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a populate_from_existing_patterns() function for reading the
path_patterns from $GIT_DIR/info/sparse-checkout so that we can re-use
it elsewhere.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a path is dirty, removing from the working tree risks losing data.
As such, we want to make sure any such path is not marked with
SKIP_WORKTREE. While the current callers of this code detect this case
and re-populate with a previous set of sparsity patterns, we want to
allow some paths to be marked with SKIP_WORKTREE while others are left
unmarked without it being considered an error. The reason this
shouldn't be considered an error is that SKIP_WORKTREE has always been
an advisory-only setting; merge and rebase for example were free to
materialize paths and clear the SKIP_WORKTREE bit in order to accomplish
their work even though they kept the SKIP_WORKTREE bit set for other
paths. Leaving dirty working files in the working tree is thus a
natural extension of what we have already been doing.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
check_updates() previously assumed it was working on o->result. We want
to use this function in combination with a different index_state, so
take the intended index_state as a parameter.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit e091228e17 ("sparse-checkout: update working directory
in-process", 2019-11-21) allowed passing a pre-defined set of patterns
to unpack_trees(). However, if o->pl was NULL, it would still read the
existing patterns and use those. If those patterns were read into a
data structure that was allocated, naturally they needed to be free'd.
However, despite the same function being responsible for knowing about
both the allocation and the free'ing, the logic for tracking whether to
free the pattern_list was hoisted to an outer function with an
additional flag in unpack_trees_options. Put the logic back in the
relevant function and discard the now unnecessary flag.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
verify_absent_sparse() was introduced in commit 08402b0409
("merge-recursive: distinguish "removed" and "overwritten" messages",
2010-08-11), and has always had exactly one caller which always passes
error_type == ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN. This function
then checks whether error_type is this value, and if so, sets it instead
to ERROR_WOULD_LOSE_ORPHANED_OVERWRITTEN. It has been nearly a decade
and no other caller has been created, and no other value has ever been
passed, so just pass the expected value to begin with.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
commit 08402b0409 ("merge-recursive: distinguish "removed" and
"overwritten" messages", 2010-08-11) split
ERROR_WOULD_LOSE_UNTRACKED
into both
ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN
ERROR_WOULD_LOSE_UNTRACKED_REMOVED
and also split
ERROR_WOULD_LOSE_ORPHANED
into both
ERROR_WOULD_LOSE_ORPHANED_OVERWRITTEN
ERROR_WOULD_LOSE_ORPHANED_REMOVED
However, despite the split only three of these four types were used.
ERROR_WOULD_LOSE_ORPHANED_REMOVED was not put into use when it was
introduced and nothing else has used it in the intervening decade
either. Remove it.
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no need for shell libraries to have the executable bit. They're
meant to be sourced, and running them stand-alone is pointless. Let's
reduce any possible confusion by making it more clear they're not meant
to be run this way.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The purpose of lib-credential.sh is to be sourced into other test
scripts. It doesn't need a "#!/bin/sh" line, as running it directly
makes no sense. Nor does it serve any real filetype documentation
purpose, as the file is clearly named with a ".sh" extension.
In the spirit of c74c72034f (test: replace shebangs with descriptions in
shell libraries, 2013-11-25), let's replace it with a human-readable
description.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Cygwin, the codepath for POSIX-like systems is taken in
run-command.c::start_command(). The prepare_cmd() helper
function is called to decide if the command needs to be looked
up in the PATH. The logic there is to do the PATH-lookup if
and only if it does not have any slash '/' in it. If this test
passes we end up attempting to run the command by appending the
string after each colon-separated component of PATH.
The Cygwin environment supports both Windows and POSIX style
paths, so both forwardslahes '/' and back slashes '\' can be
used as directory separators for any external program the user
supplies.
Examples for path strings which are being incorrectly searched
for in the PATH instead of being executed as is:
- "C:\Program Files\some-program.exe"
- "a\b\c.exe"
To handle these, the PATH lookup detection logic in prepare_cmd()
is taught to know about this Cygwin quirk, by introducing
has_dir_sep(path) helper function to abstract away the difference
between true POSIX and Cygwin systems.
Signed-off-by: Andras Kucsma <r0maikx02b@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, testing if two refs weren't equal with compare_refs() was done
with `test_must_fail compare_refs`. This was wrong for two reasons.
First, test_must_fail should only be used on git commands. Second,
negating the error code is a little heavy-handed since in the case where
one of the git invocations within compare_refs() fails, we will report
success, even though it failed at an unexpected point.
Teach compare_refs() to accept `!` as the first argument which would
_only_ negate the test_cmp()'s return code.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a pipe, only the return code of the last command is used. Thus, all
other commands will have their return codes masked. Rewrite pipes so
that there are no git commands upstream so that their failure is
reported.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail function should only be used for git commands since
we should assume that external commands work sanely. Since test_cmp() just
wraps an external command, replace `test_must_fail test_cmp` with
`! test_cmp`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we plan on only allowing `test_must_fail` to work on a
restricted subset of commands, including `git`. Reorder the commands so
that `nongit` comes before `test_must_fail`. This way, `test_must_fail`
operates on a git command.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the 'did not use upload-pack service' test, we have a complicated
song-and-dance to ensure that there are no "/git-upload-pack" lines in
"$HTTPD_ROOT_PATH/access.log". Simplify this by just checking that grep
returns a non-zero exit code.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a pipe, only the return code of the last command is used. Thus, all
other commands will have their return codes masked. Rewrite pipes so
that there are no git commands upstream so that their failure is
reported.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The expected references are generated using a here-doc with some inline
command substitutions. If one of the `git rev-parse` invocations within
the command substitutions fails, its return code is swallowed and we
won't know about it. Replace these command substitutions with
generate_references(), which actually reports when `git rev-parse`
fails.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
busybox's sed isn't binary clean.
Thus, triggers false-negative on this test.
We could replace sed with perl on this usecase.
But, we could slightly modify the helper to discard unwanted data in the
beginning.
Fix the false negative by updating this helper.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diff(1) implementation of busybox produces the unified context
format even without being asked, and it cannot produce the default
format, but a test in this script relies on.
We could rewrite the test so that we count the lines in the
postimage out of the unified context format, but the format is not
supported by some implementations of diff (e.g. HP-UX).
Accomodate busybox by adding a fallback code to count postimage
lines in unified context output, when counting in the output in the
default format finds nothing.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
[jc: applied Documentation/CodingGuidelines and tweaked the log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git pull" learned to warn when no pull.rebase configuration
exists, and neither --[no-]rebase nor --ff-only is given (which
would result a merge).
* ah/force-pull-rebase-configuration:
pull: warn if the user didn't say whether to rebase or to merge
"git stash" has kept an escape hatch to use the scripted version
for a few releases, which got stale. It has been removed.
* tg/retire-scripted-stash:
stash: remove the stash.useBuiltin setting
stash: get git_stash_config at the top level
When "git describe C" finds an annotated tag with tagname A to be
the best name to explain commit C, and the tag is stored in a
"wrong" place in the refs/tags hierarchy, e.g. refs/tags/B, the
command gave a warning message but used A (not B) to describe C.
If C is exactly at the tag, the describe output would be "A", but
"git rev-parse A^0" would not be equal as "git rev-parse C^0". The
behavior of the command has been changed to use the "long" form
i.e. A-0-gOBJECTNAME, which is correctly interpreted by rev-parse.
* jc/describe-misnamed-annotated-tag:
describe: force long format for a name based on a mislocated tag
The "--fork-point" mode of "git rebase" regressed when the command
was rewritten in C back in 2.20 era, which has been corrected.
* at/rebase-fork-point-regression-fix:
rebase: --fork-point regression fix
The code to interface with GnuPG has been refactored.
* hi/gpg-prefer-check-signature:
gpg-interface: prefer check_signature() for GPG verification
t: increase test coverage of signature verification output
Provide more information (e.g. the object of the tree-ish in which
the blob being converted appears, in addition to its path, which
has already been given) to smudge/clean conversion filters.
* bc/filter-process:
t0021: test filter metadata for additional cases
builtin/reset: compute checkout metadata for reset
builtin/rebase: compute checkout metadata for rebases
builtin/clone: compute checkout metadata for clones
builtin/checkout: compute checkout metadata for checkouts
convert: provide additional metadata to filters
convert: permit passing additional metadata to filter processes
builtin/checkout: pass branch info down to checkout_worktree
SHA-256 transition continues.
* bc/sha-256-part-1-of-4: (22 commits)
fast-import: add options for rewriting submodules
fast-import: add a generic function to iterate over marks
fast-import: make find_marks work on any mark set
fast-import: add helper function for inserting mark object entries
fast-import: permit reading multiple marks files
commit: use expected signature header for SHA-256
worktree: allow repository version 1
init-db: move writing repo version into a function
builtin/init-db: add environment variable for new repo hash
builtin/init-db: allow specifying hash algorithm on command line
setup: allow check_repository_format to read repository format
t/helper: make repository tests hash independent
t/helper: initialize repository if necessary
t/helper/test-dump-split-index: initialize git repository
t6300: make hash algorithm independent
t6300: abstract away SHA-1-specific constants
t: use hash-specific lookup tables to define test constants
repository: require a build flag to use SHA-256
hex: add functions to parse hex object IDs in any algorithm
hex: introduce parsing variants taking hash algorithms
...
Fix "git checkout --recurse-submodules" of a nested submodule
hierarchy.
* pb/recurse-submodules-fix:
t/lib-submodule-update: add test removing nested submodules
unpack-trees: check for missing submodule directory in merged_entry
unpack-trees: remove outdated description for verify_clean_submodule
t/lib-submodule-update: move a test to the right section
t/lib-submodule-update: remove outdated test description
t7112: remove mention of KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED
Especially when debugging a test failure that can only be reproduced in
the CI build (e.g. when the developer has no access to a macOS machine
other than running the tests on a macOS build agent), output should not
be suppressed.
In the instance of `hi/gpg-prefer-check-signature`, where one
GPG-related test failed for no apparent reason, the entire output of
`gpg` and `gpgsm` was suppressed, even in verbose mode, leaving
interested readers no clue what was going wrong.
Let's fix this by no longer redirecting the output not to `/dev/null`.
This is now possible because the affected prereqs were turned into lazy
ones (and are therefore evaluated via `test_eval_` which respects the
`--verbose` option).
Note that we _still_ redirect `stdout` to `/dev/null` for those commands
that sign their `stdin`, as the output would be binary (and useless
anyway, because the reader would not have anything against which to
compare the output).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to set those prereqs is executed completely outside of any
`test_eval_` block. As a consequence, its output had to be suppressed so
that it does not clutter the output of a regular test script run.
Unfortunately, the output *stays* suppressed even when the `--verbose`
option is in effect.
This hid important output when debugging why the GPG prereq was not
enabled in the Windows part of our CI builds.
In preparation for fixing that, let's move all of this code into lazy
prereqs.
The only slightly tricky part is the global environment variable
`GNUPGHOME`. Originally, it was configured only when we verified that
there is a `gpg` in the `PATH` that we can use. This is now no longer
possible, as lazy prereqs are evaluated in a subshell that changes the
working directory to a temporary one. Therefore, we simply _always_ set
that environment variable: it does not hurt anything because it does not
indicate the presence of a working GPG.
Side note: it was quite tempting to use a hack that is possible because
we do not validate what is passed to `test_lazy_prereq` (and it is
therefore possible to "break out" of the lazy_prereq subshell:
test_lazy_prereq GPG '...) && GNUPGHOME=... && (...'
However, this is rather tricksy hobbitses code, and the current patch is
_much_ easier to understand.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `test_expect_*` functions use `test_eval_` and so does
`test_run_lazy_prereq_`. If tracing is enabled via the `-x` option,
`test_eval_` turns on tracing while evaluating the code block, and turns
it off directly after it.
This is unwanted for nested invocations.
One somewhat surprising example of this is when running a test that
calls `test_i18ngrep`: that function requires the `C_LOCALE_OUTPUT`
prereq, and that prereq is a lazy one, so it is evaluated via
`test_eval_`, the command tracing is turned off, and the test case
continues to run _without tracing the commands_.
Another somewhat surprising example is when one lazy prereq depends on
another lazy prereq: the former will call `test_have_prereq` with the
latter one, which in turn calls `test_eval_` and -- you guessed it --
tracing (if enabled) will be turned off _before_ returning to evaluating
the other lazy prereq.
As we will introduce just such a scenario with the GPG, GPGSM and
RFC1991 prereqs, let's fix that by introducing a variable that keeps
track of the current trace level: nested `test_eval_` calls will
increment and then decrement the level, and only when it reaches 0, the
tracing will _actually_ be turned off.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It makes no sense to call `./lib-gpg.sh`. Therefore the hash-bang line
is unnecessary.
There are other similar instances in `t/`, but they are too far from the
context of the enclosing patch series, so they will be addressed
separately.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The mechanism to prevent "git commit" from making an empty commit
or amending during an interrupted cherry-pick was broken during the
rewrite of "git rebase" in C, which has been corrected.
* pw/advise-rebase-skip:
commit: give correct advice for empty commit during a rebase
commit: encapsulate determine_whence() for sequencer
commit: use enum value for multiple cherry-picks
sequencer: write CHERRY_PICK_HEAD for reword and edit
cherry-pick: check commit error messages
cherry-pick: add test for `--skip` advice in `git commit`
t3404: use test_cmp_rev
Update "git p4" to work with Python 3.
* yz/p4-py3:
ci: use python3 in linux-gcc and osx-gcc and python2 elsewhere
git-p4: use python3's input() everywhere
git-p4: simplify regex pattern generation for parsing diff-tree
git-p4: use dict.items() iteration for python3 compatibility
git-p4: use functools.reduce instead of reduce
git-p4: fix freezing while waiting for fast-import progress
git-p4: use marshal format version 2 when sending to p4
git-p4: open .gitp4-usercache.txt in text mode
git-p4: convert path to unicode before processing them
git-p4: encode/decode communication with git for python3
git-p4: encode/decode communication with p4 for python3
git-p4: remove string type aliasing
git-p4: change the expansion test from basestring to list
git-p4: make python2.7 the oldest supported version
The real_path() convenience function can easily be misused; with a
bit of code refactoring in the callers' side, its use has been
eliminated.
* am/real-path-fix:
get_superproject_working_tree(): return strbuf
real_path_if_valid(): remove unsafe API
real_path: remove unsafe API
set_git_dir: fix crash when used with real_path()
A handful of options to configure SSL when talking to proxies have
been added.
* js/https-proxy-config:
http: add environment variable support for HTTPS proxies
http: add client cert support for HTTPS proxies
Revamping of the advise API to allow more systematic enumeration of
advice knobs in the future.
* hw/advise-ng:
tag: use new advice API to check visibility
advice: revamp advise API
advice: change "setupStreamFailure" to "setUpstreamFailure"
advice: extract vadvise() from advise()
In Git for Windows' SDK, we use the MSYS2 version of OpenSSH, meaning
that the `gpg-agent` will fail horribly when being passed a `--homedir`
that contains colons.
Previously, we did pass the Windows version of the absolute path,
though, which starts in the drive letter followed by, you guessed it, a
colon.
Let's use the same trick found elsewhere in our test suite where `$PWD`
is used to refer to the pseudo-Unix path (which works only within the
MSYS2 Bash/OpenSSH/Perl/etc, as opposed to `$(pwd)` which refers to the
Windows path that `git.exe` understands, too).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When debugging a test (or a set of tests), it's common to execute it
with some combination of short options, such as:
$ ./txxx-testname.sh -d -x -i
In cases like this, CLIs usually allow the short options to be bundled
in a single argument, for convenience and agility. Let's add this
feature to test-lib, allowing the above command to be run as:
$ ./txxx-testname.sh -dxi
(or any other permutation, e.g. '-ixd')
Note: Short options that require an argument can also be used in a
bundle, in any position. So, for example, '-r 5 -x', '-xr 5' and '-rx 5'
are all valid and equivalent. A special case would be having a bundle
with more than one of such options. To keep things simple, this case is
not allowed for now. This shouldn't be a major limitation, though, as
the only short option that requires an argument today is '-r'. And
concatenating '-r's as in '-rr 5 6' would probably not be very
practical: its unbundled format would be '-r 5 -r 6', for which test-lib
currently considers only the last argument. Therefore, if '-rr 5 6' were
to be allowed, it would have the same effect as just typing '-r 6'.
Note: the test-lib currently doesn't support '-r5', as an alternative
for '-r 5', so the former is not supported in bundles as well.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since commit 6b7728db81, (t7063: work around FreeBSD's lazy mtime
update feature, 2016-08-03), we started to use ls as a trick to update
directory's mtime.
However, `-ls` flag isn't required by POSIX's find(1), and
busybox(1) doesn't implement it.
Use "-exec ls -ld {} +" instead.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only HEAD's object_id is necessary, rev-list is an overkill.
Despite POSIX requires grep(1) treat single pattern with <newline>
as multiple patterns.
busybox's grep(1) (as of v1.31.1) haven't implemented it yet.
Use rev-parse to simplify the test and avoid busybox unimplemented
features.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Alpine Linux's default unzip(1) doesn't support `-a`.
Skip those tests on that platform.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
test_lazy_prereq will be evaluated in a throw-away directory.
Drop unnecessary subshell and mkdir.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shell recognises first non-assignment token as command name.
With /bin/sh linked to either /bin/bash or /bin/dash,
`cd t/perf && ./p0000-perf-lib-sanity.sh -d -i -v` reports:
> test_cmp:1: command not found: diff -u
Using `eval` to unquote $GIT_TEST_CMP as same as precedence in `git_editor`.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
BRE interprets `+` literally, and
`\+` is undefined for POSIX BRE, from:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_02
> The interpretation of an ordinary character preceded
> by an unescaped <backslash> ( '\\' ) is undefined, except for:
> - The characters ')', '(', '{', and '}'
> - The digits 1 to 9 inclusive
> - A character inside a bracket expression
This test is failing with busybox sed, the default sed of Alpine Linux
We have 2 options here:
- Using literal `+` because BRE will interpret it as-is, or
- Using character class `[+]` to defend against a sed that expects ERE
ERE-expected sed is theoretical at this point,
but we haven't found it, yet.
And, we may run into other problems with that sed.
Let's go with first option and fix it later if that sed could be found.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail function should only be used for git commands since
we should assume that external commands work sanely. Since test_cmp() just
wraps an external command, replace `test_must_fail test_cmp` with
`! test_cmp`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In builtin.h, there exists the distinctly lib-ish function
prune_packed_objects(). This function can currently only be called by
built-in commands but, unlike all of the other functions in the header,
it does not make sense to impose this restriction as the functionality
can be logically reused in libgit.
Extract this function into prune-packed.c so that related definitions
can exist clearly in their own header file.
While we're at it, clean up #includes that are unused.
This patch is best viewed with --color-moved.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In builtin.h, there exists the distinctly "lib-ish" function
fmt_merge_msg(). This function can currently only be called by built-in
commands but, unlike most of the other functions in the header, it does
not make sense to impose this restriction as the functionality can be
logically reused in libgit.
Extract this function into fmt-merge-msg.c so that related definitions
can exist clearly in their own header file.
While we're at it, clean up #includes that are unused.
This patch is best viewed with --color-moved.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t7600, we were rewriting `printf '%s\n' ...` to create files from
parameters, one per line. However, we already have a function that wraps
this for us: test_write_lines(). Rewrite these instances to use that
function instead of open coding it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many += lists in the Makefile and, over time, they have gotten
slightly out of ASCII order. Sort all += lists to bring them back in
order.
ASCII sorting was chosen over strict alphabetical order even though, if
we omit file prefixes, the lists aren't sorted in strictly alphabetical
order (e.g. archive.o comes after archive-zip.o instead of before
archive-tar.o). This is intentional because the purpose of maintaining
the sorted list is to ensure line insertions are deterministic. By using
ASCII ordering, it is more easily mechanically reproducible in the
future, such as by using :sort in Vim.
This patch is best viewed with `--color-moved`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tar importer in `contrib/fast-import/import-tars.perl` has a very
convenient feature: if _all_ paths stored in the imported `.tar` start
with a common prefix, e.g. `git-2.26.0/` in the tar at
https://github.com/git/git/archive/v2.26.0.tar.gz, then this prefix is
stripped.
This feature makes a ton of sense because it is relatively common to
import two or more revisions of the same project into Git, and obviously
we don't want all files to live in a tree whose name changes from
revision to revision.
Now, the problem with that feature is that it breaks down if there is a
`pax_global_header` "file" located outside of said prefix, at the top of
the tree. This is the case for `.tar` files generated by Git's very own
`git archive` command: it inserts that header, and `git archive` allows
specifying a common prefix (that the header does _not_ share with the
other files contained in the archive) via `--prefix=my-project-1.0.0/`.
Let's just skip any global header when importing `.tar` files into Git.
Note: this global header might contain useful information. For example,
in the output of `git archive`, it lists the original commit, which _is_
useful information. A future improvement to the `import-tars.perl`
script might be to include that information in the commit message, or do
other things with the information (e.g. use `mtime` information
contained in the global header as date of the commit). This patch does
not prevent any future patch from making that happen, it only prevents
the header from being treated as if it was a regular file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Via trace2, Git can already log interesting config parameters (see the
trace2_cmd_list_config() function). However, this can grant an
incomplete picture because many config parameters also allow overrides
via environment variables.
To allow for more complete logs, we add a new trace2_cmd_list_env_vars()
function and supporting implementation, modeled after the pre-existing
config param logging implementation.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a test case is run in a subshell, we finalize the JUnit-style XML
when said subshell exits. But then we continue to write into that XML as
if nothing had happened.
This leads to Azure Pipelines' Publish Test Results task complaining:
Failed to read /home/vsts/work/1/s/t/out/TEST-t0000-basic.xml.
Error : Unexpected end tag. Line 110, position 5.
And indeed, the resulting XML is incorrect.
Let's "re-open" the XML in such a case, i.e. remove the previously added
closing tags.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add missing spaces before '&&' and switch tabs around '&&' to spaces.
Also fix the space after redirection operator in t3701 while we're here.
These issues were found using `git grep '[^ ]&&$'` and
`git grep -P '&&\t' t/`.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For shell scripts, the usual convention is for there to be no space
after redirection operators, (e.g. `>file`, not `> file`). Remove these
spaces wherever they appear.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It turns out that the "--filter=<filter-spec>" option is not
documented anywhere in the "git clone" page, and instead is
detailed carefully in "git rev-list" where it serves a
different purpose.
Add a small bit about this option in the documentation. It
would be worth some time to create a subsection in the "git clone"
documentation about partial clone as a concept and how it can be
a surprising experience. For example, "git checkout" will likely
trigger a pack download.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When t3419 was originally written, it was designed to run a smaller test
for correctness, and then the same test with a larger number of patches
for performance. But it seems unlikely the latter was helping us:
- it was marked with EXPENSIVE, so hardly anybody ran it anyway
- there's no indication that it was more likely to find bugs than the
smaller case (the commit message isn't very helpful, but the original
cover letter describes it as: "The first patch adds correctness and
(optional) performance tests".
- the timing results are shown only via test_debug(). So also not run
unless the user says "-d", and then not provided in any
machine-readable form.
If we're interested in performance regressions, a script in t/perf would
be more appropriate. I didn't add one here, because it's not at all
clear to me that what the script is timing is even all that interesting.
Let's simplify the script by dropping the EXPENSIVE run. That in turn
lets us drop the do_tests() wrapper, which lets us consistently use
single-quotes for our test snippets. And we can drop the useless
test_debug() timings, as well as their run() helper. And finally, while
we're here, we can replace the count() helper with the standard
test_seq().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test runs a function which itself runs several assertions. The
last of these assertions cleans up the .git/rebase-apply directory,
since when run with EXPENSIVE set, the function is invoked a second time
to run the same tests with a larger data set.
However, as of 2ac0d6273f ("rebase: change the default backend from "am"
to "merge"", 2020-02-15), the default backend of rebase has changed, and
cleaning up the rebase-apply directory has no effect: it no longer
exists, since we're using rebase-merge instead.
Since we don't really care which rebase backend is in use, let's just
use the command "git rebase --quit", which will do the right thing
regardless.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The environment variable GIT_TEST_PACK_SPARSE was previously used
to allow testing the --sparse option for "git pack-objects" in
the test suite. This allowed interesting cases of "git push" to
also test this algorithm.
Since pack.useSparse is now true by default, we do not need this
variable to _enable_ the --sparse option, but instead to _disable_
it. This flips how we work with the variable a bit.
When checking for the variable, default to a value of -1 for
"unset". If unset, then take the default from the repo settings,
which is currently 1. Then, the --[no-]sparse command-line option
will override either of these settings.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pack.useSparse config option was introduced by 3d036eb0
(pack-objects: create pack.useSparse setting, 2019-01-19) and was
first available in v2.21.0. When enabled, the pack-objects process
during 'git push' will use a sparse tree walk when deciding which
trees and blobs to send to the remote. The algorithm was introduced
by d5d2e93 (revision: implement sparse algorithm, 2019-01-16) and
has been in production use by VFS for Git since around that time.
The features.experimental config option also enabled pack.useSparse,
so hopefully that has also increased exposure.
It is worth noting that pack.useSparse has a possibility of
sending more objects across a push, but requires a special
arrangement of exact _copies_ across directories. There is a test
in t5322-pack-objects-sparse.sh that demonstrates this possibility.
This test uses the --sparse option to "git pack-objects" but we
can make it implied by the config value to demonstrate that the
default value has changed.
While updating that test, I noticed that the documentation did not
include an option for --no-sparse, which is now more important than
it was before.
Since the downside is unlikely but the upside is significant, set
the default value of pack.useSparse to true. Remove it from the
set of options implied by features.experimental.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of https://github.com/prati0100/git-gui:
git-gui: create a new namespace for chord script evaluation
git-gui: reduce Tcl version requirement from 8.6 to 8.5
git-gui--askpass: coerce answers to UTF-8 on Windows
git-gui: fix error popup when doing blame -> "Show History Context"
git-gui: add missing close bracket
git-gui: update German translation
git-gui: extend translation glossary template with more terms
git-gui: update pot template and German translation to current source code
Reduce the Tcl version requirement to 8.5 to allow git-gui to run on
MacOS distributions like High Sierra. While here, fix a potential
variable name collision.
* py/remove-tcloo:
git-gui: create a new namespace for chord script evaluation
git-gui: reduce Tcl version requirement from 8.6 to 8.5
In 'submodule--helper.c', the structures and macros for callbacks belonging
to any subcommand are named in the format: 'subcommand_cb' and 'SUBCOMMAND_CB_INIT'
respectively.
This was an exception for the subcommand 'foreach' of the command
'submodule'. Rename the aforementioned structures and macros:
'struct cb_foreach' to 'struct foreach_cb' and 'CB_FOREACH_INIT'
to 'FOREACH_CB_INIT'.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though there is only one configuration variable in the
namespace, it is not quite right to have tar.umask described
among the variables for tag.* namespace.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Updates to the CI settings.
* js/ci-windows-update:
Azure Pipeline: switch to the latest agent pools
ci: prevent `perforce` from being quarantined
t/lib-httpd: avoid using macOS' sed
Both "git ls-remote -h" and "git grep -h" give short usage help,
like any other Git subcommand, but it is not unreasonable to expect
that the former would behave the same as "git ls-remote --head"
(there is no other sensible behaviour for the latter). The
documentation has been updated in an attempt to clarify this.
* jc/doc-single-h-is-for-help:
Documentation: clarify that `-h` alone stands for `help`
"git show" and others gave an object name in raw format in its
error output, which has been corrected to give it in hex.
* hd/show-one-mergetag-fix:
show_one_mergetag: print non-parent in hex form.
"git merge signed-tag" while lacking the public key started to say
"No signature", which was utterly wrong. This regression has been
reverted.
* hi/gpg-use-check-signature:
Revert "gpg-interface: prefer check_signature() for GPG verification"
Fix for a bug revealed by a recent change to make the protocol v2
the default.
* ds/partial-clone-fixes:
partial-clone: avoid fetching when looking for objects
partial-clone: demonstrate bugs in partial fetch
The merge-recursive machinery failed to refresh the cache entry for
a merge result in a couple of places, resulting in an unnecessary
merge failure, which has been fixed.
* en/t3433-rebase-stat-dirty-failure:
merge-recursive: fix the refresh logic in update_file_flags
t3433: new rebase testcase documenting a stat-dirty-like failure
"git check-ignore" did not work when the given path is explicitly
marked as not ignored with a negative entry in the .gitignore file.
* en/check-ignore:
check-ignore: fix documentation and implementation to match
The code to automatically shrink the fan-out in the notes tree had
an off-by-one bug, which has been killed.
* jh/notes-fanout-fix:
notes.c: fix off-by-one error when decreasing notes fanout
t3305: check notes fanout more carefully and robustly
The index-pack code now diagnoses a bad input packstream that
records the same object twice when it is used as delta base; the
code used to declare a software bug when encountering such an
input, but it is an input error.
* jk/index-pack-dupfix:
index-pack: downgrade twice-resolved REF_DELTA to die()
"git rebase -i" identifies existing commits in its todo file with
their abbreviated object name, which could become ambigous as it
goes to create new commits, and has a mechanism to avoid ambiguity
in the main part of its execution. A few other cases however were
not covered by the protection against ambiguity, which has been
corrected.
* js/rebase-i-with-colliding-hash:
rebase -i: also avoid SHA-1 collisions with missingCommitsCheck
rebase -i: re-fix short SHA-1 collision
parse_insn_line(): improve error message when parsing failed
Running "git rm" on a submodule failed unnecessarily when
.gitmodules is only cache-dirty, which has been corrected.
* dt/submodule-rm-with-stale-cache:
git rm submodule: succeed if .gitmodules index stat info is zero
The "--recurse-submodules" option of various subcommands did not
work well when run in an alternate worktree, which has been
corrected.
* pb/recurse-submodule-in-worktree-fix:
submodule.c: use get_git_dir() instead of get_git_common_dir()
t2405: clarify test descriptions and simplify test
t2405: use git -C and test_commit -C instead of subshells
t7410: rename to t2405-worktree-submodule.sh
An earlier update to show the location of working tree in the error
message did not consider the possibility that a git command may be
run in a bare repository, which has been corrected.
* es/outside-repo-errmsg-hints:
prefix_path: show gitdir if worktree unavailable
prefix_path: show gitdir when arg is outside repo
Minor bugfixes to "git add -i" that has recently been rewritten in C.
* js/builtin-add-i-cmds:
built-in add -i: accept open-ended ranges again
built-in add -i: do not try to `patch`/`diff` an empty list of files
Evaluating the script in the same namespace as the chord itself creates
potential for variable name collision. And in that case the script would
unknowingly use the chord's variables.
For example, say the script has a variable called 'is_completed', which
also exists in the chord's namespace. The script then calls 'eval' and
sets 'is_completed' to 1 thinking it is setting its own variable,
completely unaware of how the chord works behind the scenes. This leads
to the chord never actually executing because it sees 'is_completed' as
true and thinks it has already completed.
Avoid the potential collision by creating a separate namespace for the
script that is a child of the chord's namespace.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
On some MacOS distributions like High Sierra, Tcl 8.5 is shipped by
default. This makes git-gui error out at startup because of the version
mismatch.
The only part that requires Tcl 8.6 is SimpleChord, which depends on
TclOO. So, don't use it and use our homegrown class.tcl instead.
This means some slight syntax changes. Since class.tcl doesn't have an
"unknown" method like TclOO does, we can't just call '$note', but have
to use '$note activate' instead. The constructor now needs a proper
namespace qualifier. Update the documentation to reflect the new syntax.
As of now, the only part of git-gui that needs Tcl 8.5 is a call to
'apply' in lib/index.tcl::lambda. Keep using it until someone shows up
shouting that their OS ships with 8.4 only. Then we would have to look
into implementing it in pure Tcl.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
The option name "--use-mailmap" looks OK, but it becomes awkward
when you have to negate it, i.e. "--no-use-mailmap". I, perhaps
with many other users, always try "--no-mailmap" and become unhappy
to see it fail.
Add an alias "--[no-]mailmap" to remedy this.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous step made an option that is an alias to another option
identify itself as an alias to the latter. Because it is easier to
scan the list when a pointer goes backward to what a reader already
has seen, mention "recurse-submodules" first with its true short
help string, and then "recurse" with the statement that it is a
synonym to "recurse-submodules".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a long-standing NEEDSWORK comment that complains about
inconsistency between how an aliased option ("git clone --recurse"
which is the only one that currently exists) gives a help text in
a usage-error message vs "git cmd -h"). Get rid of it and then
make sure we say an option is an alias for another, instead of
repeating the same short help text for both, which leads to "they
seem to do the same---is there any subtle difference?" puzzlement
to end-users.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An earlier update to show the location of working tree in the error
message did not consider the possibility that a git command may be
run in a bare repository, which has been corrected.
* es/outside-repo-errmsg-hints:
prefix_path: show gitdir if worktree unavailable
Check that we get the expected data when performing a merges or
generating archives. Note that we don't expect a ref for merges,
because we won't be checking out any particular ref, but instead a tree
of the merged data. For archives, however, we expect a ref as normal if
we have one.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pass the commit, and if we have it, the ref to the filters when we
perform a checkout. This should only be the case when we invoke git
reset --hard; the metadata will be unused otherwise.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When checking out a commit, provide metadata to the filter process
including the ref we're using.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Provide commit metadata for checkout code paths that use unpack_trees
and friends. When we're checking out a commit, use the commit
information, but don't provide commit information if we're checking out
from the index, since there need not be any particular commit associated
with the index, and even if there is one, we can't know what it is.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have the codebase wired up to pass any additional metadata
to filters, let's collect the additional metadata that we'd like to
pass.
The two main places we pass this metadata are checkouts and archives.
In these two situations, reading HEAD isn't a valid option, since HEAD
isn't updated for checkouts until after the working tree is written and
archives can accept an arbitrary tree. In other situations, HEAD will
usually reflect the refname of the branch in current use.
We pass a smaller amount of data in other cases, such as git cat-file,
where we can really only logically know about the blob.
This commit updates only the parts of the checkout code where we don't
use unpack_trees. That function and callers of it will be handled in a
future commit.
In the archive code, we leak a small amount of memory, since nothing we
pass in the archiver argument structure is freed.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a variety of situations where a filter process can make use of
some additional metadata. For example, some people find the ident
filter too limiting and would like to include the commit or the branch
in their smudged files. This information isn't available during
checkout as HEAD hasn't been updated at that point, and it wouldn't be
available in archives either.
Let's add a way to pass this metadata down to the filter. We pass the
blob we're operating on, the treeish (preferring the commit over the
tree if one exists), and the ref we're operating on. Note that we won't
pass this information in all cases, such as when renormalizing or when
we're performing diffs, since it doesn't make sense in those cases.
The data we currently get from the filter process looks like the
following:
command=smudge
pathname=git.c
0000
With this change, we'll get data more like this:
command=smudge
pathname=git.c
refname=refs/tags/v2.25.1
treeish=c522f061d551c9bb8684a7c3859b2ece4499b56b
blob=7be7ad34bd053884ec48923706e70c81719a8660
0000
There are a couple things to note about this approach. For operations
like checkout, treeish will always be a commit, since we cannot check
out individual trees, but for other operations, like archive, we can end
up operating on only a particular tree, so we'll provide only a tree as
the treeish. Similar comments apply for refname, since there are a
variety of cases in which we won't have a ref.
This commit wires up the code to print this information, but doesn't
pass any of it at this point. In a future commit, we'll have various
code paths pass the actual useful data down.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- context should be translated to ngữ cảnh instead of nội dung
- add missing accents
- switch adjective and secondary objects position:
* The formatted English text will be "To remove '+/-' lines",
it should be translated to "Để bỏ dòng bắt đầu với '+/-'
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
When commit 8b2f8cbcb1 ("oidset: use khash", 2018-10-04) moved from
using oidmap to khash, it replaced the oidmap.h include with both one
for hashmap.h and khash.h. Since the hashmap.h header is unnecessary,
and the point of the patch was to switch from hashmap (used by oidmap)
to khash.h, remove the unneccessary include.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While updating the microsoft/git fork on top of v2.26.0-rc0 and
consuming that build into Scalar, I noticed a corner case bug around
partial clone.
The "scalar clone" command can create a Git repository with the
proper config for using partial clone with the "blob:none" filter.
Instead of calling "git clone", it runs "git init" then sets a few
more config values before running "git fetch".
In our builds on v2.26.0-rc0, we noticed that our "git fetch"
command was failing with
error: https://github.com/microsoft/scalar did not send all necessary objects
This does not happen if you copy the config file from a repository
created by "git clone --filter=blob:none <url>", but it does happen
when adding the config option "core.logAllRefUpdates = true".
By debugging, I was able to see that the loop inside
check_connnected() that checks if all refs are contained in
promisor packs actually did not have any packfiles in the packed_git
list.
I'm not sure what corner-case issues caused this config option to
prevent the reprepare_packed_git() from being called at the proper
spot during the fetch operation. This approach requires a situation
where we use the remote helper process, which makes it difficult to
test.
It is possible to place a reprepare_packed_git() call in the fetch code
closer to where we receive a pack, but that leaves an opening for a
later change to re-introduce this problem. Further, a concurrent repack
operation could replace the pack-file list we already loaded into
memory, causing this issue in an even harder to reproduce scenario.
It is really the responsibility of anyone looping through the list of
pack-files for a certain object to fall back to reprepare_packed_git()
on a fail-to-find. The loop in check_connected() does not have this
fallback, leading to this bug.
We _could_ try looping through the packs and only reprepare the packs
after a miss, but that change is more involved and has little value.
Since this case is isolated to the case when
opt->check_refs_are_promisor_objects_only is true, we are confident that
we are verifying the refs after downloading new data. This implies that
calling reprepare_packed_git() in advance is not a huge cost compared to
the rest of the operations already made.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit refactors the use of verify_signed_buffer() outside of
gpg-interface.c to use check_signature() instead. It also turns
verify_signed_buffer() into a file-local function since it's now only
invoked internally by check_signature().
There were previously two globally scoped functions used in different
parts of Git to perform GPG signature verification:
verify_signed_buffer() and check_signature(). Now only
check_signature() is used.
The verify_signed_buffer() function doesn't guard against duplicate
signatures as described by Michał Górny [1]. Instead it only ensures a
non-erroneous exit code from GPG and the presence of at least one
GOODSIG status field. This stands in contrast with check_signature()
that returns an error if more than one signature is encountered.
The lower degree of verification makes the use of verify_signed_buffer()
problematic if callers don't parse and validate the various parts of the
GPG status message themselves. And processing these messages seems like
a task that should be reserved to gpg-interface.c with the function
check_signature().
Furthermore, the use of verify_signed_buffer() makes it difficult to
introduce new functionality that relies on the content of the GPG status
lines.
Now all operations that does signature verification share a single entry
point to gpg-interface.c. This makes it easier to propagate changed or
additional functionality in GPG signature verification to all parts of
Git, without having odd edge-cases that don't perform the same degree of
verification.
[1] https://dev.gentoo.org/~mgorny/articles/attack-on-git-signature-verification.html
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There weren't any tests for unsuccessful signature verification of
signed merge tags shown in 'git log'. There also weren't any tests for
the GPG output from 'git fmt-merge-msg'. This was noticed while
investigating a buggy refactor that slipped through the test suite; see
commit 72b006f4bf.
This commit adds signature verification tests to the 'log' and
'fmt-merge-msg' builtins.
Thanks to Linus Torvalds for reporting and finding the (now reverted)
commit that introduced the regression.
Note that the "log --show-signature for merged tag with GPG failure"
test case is really hacky. It relies on an implementation detail of
verify_signed_buffer() -- namely, it assumes that the signature is
written to a temporary file whose path is under TMPDIR.
The rationale for that test case is to check whether the code path that
yields the "No signature" message is reachable on failure. The
functionality in log-tree.c that may show this message does some
pre-parsing of a possible signature that prevents the GPG interface from
being invoked if a signature is actually missing. And I haven't been
able to construct a signature that both 1. satisfies that
pre-processing, and 2. causes GPG to fail without any sort of output on
stderr along the lines of "this is a bogus/corrupt/... signature" (the
"No signature" message should only be shown if GPG produce no output).
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
[jc: fixed missing test title noticed by Dscho]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If there is no worktree at present, we can still hint the user about
Git's current directory by showing them the absolute path to the Git
directory. Even though the Git directory doesn't make it as easy to
locate the worktree in question, it can still help a user figure out
what's going on while developing a script.
This fixes a segmentation fault introduced in e0020b2f
("prefix_path: show gitdir when arg is outside repo", 2020-02-14).
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
[jc: added minimum tests, with help from Szeder Gábor]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This addresses the issue where Git for Windows asks the user for a
password, no credential helper is available, and then Git fails to pick
up non-ASCII characters from the Git GUI helper.
This can be verified e.g. via
echo host=http://abc.com |
git -c credential.helper= credential fill
and then pasting some umlauts.
The underlying reason is that Git for Windows tries to communicate using
the UTF-8 encoding no matter what the actual current code page is. So
let's indulge Git for Windows and do use that encoding.
This fixes https://github.com/git-for-windows/git/issues/2215
Signed-off-by: Luke Bonanomi <lbonanomi@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Fixes an error popup in blame because of a missing closing bracket.
* py/blame-status-error:
git-gui: fix error popup when doing blame -> "Show History Context"
Several tests wanted to verify that files were actually modified by a
merge, which it would do by checking that the mtime was updated. In
order to avoid problems with the merge completing so fast that the mtime
at the beginning and end of the operation was the same, these tests
would first set the mtime of a file to something "old". This "old"
value was usually determined as current system clock minus one second,
truncated to the nearest integer. Unfortunately, it appears the system
clock and filesystem clock are different and comparing across the two
runs into race problems resulting in flaky tests.
From https://stackoverflow.com/questions/14392975/timestamp-accuracy-on-ext4-sub-millsecond:
date will call the gettimeofday system call which will always return
the most accurate time available based on the cached kernel time,
adjusted by the CPU cycle time if available to give nanosecond
resolution. The timestamps stored in the file system however, are
only based on the cached kernel time. ie The time calculated at the
last timer interrupt.
and from https://apenwarr.ca/log/20181113:
Does mtime get set to >= the current time?
No, this depends on clock granularity. For example, gettimeofday()
can return times in microseconds on my system, but ext4 rounds
timestamps down to the previous ~10ms (but not exactly 10ms)
increment, with the surprising result that a newly-created file is
almost always created in the past:
$ python -c "
import os, time
t0 = time.time()
open('testfile', 'w').close()
print os.stat('testfile').st_mtime - t0
"
-0.00234484672546
So, instead of trying to compare across what are effectively two
different clocks, just avoid using the system clock. Any new updates to
files have to give an mtime at least as big as what is already in the
file, so we could define "old" as one second before the mtime found in
the file before the merge starts. But, to avoid problems with leap
seconds, ntp updates, filesystems that only provide two second
resolution, and other such weirdness, let's just pick an hour before the
mtime found in the file before the merge starts.
Also, clarify in one test where we check the mtime of different files
that it really was intentional. I totally forgot the reasons for that
and assumed it was a bug when asked.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Band-aid fixes for two fallouts from switching the default "rebase"
backend.
* en/rebase-backend:
git-rebase.txt: highlight backend differences with commit rewording
sequencer: clear state upon dropping a become-empty commit
i18n: unmark a message in rebase.c
In the future, we're going to want to use the branch info in
checkout_worktree, so let's pass the whole struct branch_info down, not
just the revision name. We hoist the definition of struct branch_info
so it's in scope.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The credential protocol can't handle values with newlines. We already
detect and block any such URLs from being used with credential helpers,
but let's also add an fsck check to detect and block gitmodules files
with such URLs. That will let us notice the problem earlier when
transfer.fsckObjects is turned on. And in particular it will prevent bad
objects from spreading, which may protect downstream users running older
versions of Git.
We'll file this under the existing gitmodulesUrl flag, which covers URLs
with option injection. There's really no need to distinguish the exact
flaw in the URL in this context. Likewise, I've expanded the description
of t7416 to cover all types of bogus URLs.
The credential protocol can't represent newlines in values, but URLs can
embed percent-encoded newlines in various components. A previous commit
taught the low-level writing routines to die() when encountering this,
but we can be a little friendlier to the user by detecting them earlier
and handling them gracefully.
This patch teaches credential_from_url() to notice such components,
issue a warning, and blank the credential (which will generally result
in prompting the user for a username and password). We blank the whole
credential in this case. Another option would be to blank only the
invalid component. However, we're probably better off not feeding a
partially-parsed URL result to a credential helper. We don't know how a
given helper would handle it, so we're better off to err on the side of
matching nothing rather than something unexpected.
The die() call in credential_write() is _probably_ impossible to reach
after this patch. Values should end up in credential structs only by URL
parsing (which is covered here), or by reading credential protocol input
(which by definition cannot read a newline into a value). But we should
definitely keep the low-level check, as it's our final and most accurate
line of defense against protocol injection attacks. Arguably it could
become a BUG(), but it probably doesn't matter much either way.
Note that the public interface of credential_from_url() grows a little
more than we need here. We'll use the extra flexibility in a future
patch to help fsck catch these cases.
The credential tests have a "check" function which feeds some input to
git-credential and checks the stdout and stderr. We look for exact
matches in the output. For stdout, this makes sense; the output is
the credential protocol. But for stderr, we may be showing various
diagnostic messages, or the prompts fed to the askpass program, which
could be translated. Let's mark them as such.
The credential protocol that we use to speak to helpers can't represent
values with newlines in them. This was an intentional design choice to
keep the protocol simple, since none of the values we pass should
generally have newlines.
However, if we _do_ encounter a newline in a value, we blindly transmit
it in credential_write(). Such values may break the protocol syntax, or
worse, inject new valid lines into the protocol stream.
The most likely way for a newline to end up in a credential struct is by
decoding a URL with a percent-encoded newline. However, since the bug
occurs at the moment we write the value to the protocol, we'll catch it
there. That should leave no possibility of accidentally missing a code
path that can trigger the problem.
At this level of the code we have little choice but to die(). However,
since we'd not ever expect to see this case outside of a malicious URL,
that's an acceptable outcome.
Reported-by: Felix Wilhelm <fwilhelm@google.com>
As noted by Junio:
Back when "git am" was written, it was not considered a bug that the
"git am --resolved" option did not offer the user a chance to update
the log message to match the adjustment of the code the user made,
but honestly, I'd have to say that it is a bug in "git am" in that
over time it wasn't adjusted to the new world order where we
encourage users to describe what they did when the automation
hiccuped by opening an editor. These days, even when automation
worked well (e.g. a clean auto-merge with "git merge"), we open an
editor. The world has changed, and so should the expectations.
Junio also suggested providing a workaround such as allowing --no-edit
together with git rebase --continue, but that should probably be done in
a patch after the git-2.26.0 release. For now, just document the known
difference in the Behavioral Differences section.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit e98c4269c8 ("rebase (interactive-backend): fix handling of
commits that become empty", 2020-02-15), the merge backend was changed
to drop commits that did not start empty but became so after being
applied (because their changes were a subset of what was already
upstream). This new code path did not need to go through the process of
creating a commit, since we were dropping the commit instead.
Unfortunately, this also means we bypassed the clearing of the
CHERRY_PICK_HEAD and MERGE_MSG files, which if there were no further
commits to cherry-pick would mean that the rebase would end but assume
there was still an operation in progress. Ensure that we clear such
state files when we decide to drop the commit.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit v2.25.0-4-ge98c4269c8 (rebase (interactive-backend): fix handling
of commits that become empty, 2020-02-15) marked "{drop,keep,ask}" for
translation, but this message should not be changed.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git pull accepts the options --dry-run, -p/--prune, --refmap, and
-t/--tags since a32975f516 (pull: pass git-fetch's options to git-fetch,
2015-06-18), -j/--jobs since 62104ba14a (submodules: allow parallel
fetching, add tests and documentation, 2015-12-15), and --set-upstream
since 24bc1a1292 (pull, fetch: add --set-upstream option, 2019-08-19).
Update its documentation to match.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both "git ls-remote -h" and "git grep -h" give short usage help,
like any other Git subcommand, but it is not unreasonable to expect
that the former would behave the same as "git ls-remote --head"
(there is no other sensible behaviour for the latter). The
documentation has been updated in an attempt to clarify this.
* jc/doc-single-h-is-for-help:
Documentation: clarify that `-h` alone stands for `help`
* 'master' of github.com:git/git: (27 commits)
Git 2.26-rc1
remote-curl: show progress for fetches over dumb HTTP
show_one_mergetag: print non-parent in hex form.
config.mak.dev: re-enable -Wformat-zero-length
rebase-interactive.c: silence format-zero-length warnings
mingw: workaround for hangs when sending STDIN
t6020: new test with interleaved lexicographic ordering of directories
t6022, t6046: test expected behavior instead of testing a proxy for it
t3035: prefer test_must_fail to bash negation for git commands
t6020, t6022, t6035: update merge tests to use test helper functions
t602[1236], t6034: modernize test formatting
merge-recursive: apply collision handling unification to recursive case
completion: add diff --color-moved[-ws]
t1050: replace test -f with test_path_is_file
am: support --show-current-patch=diff to retrieve .git/rebase-apply/patch
am: support --show-current-patch=raw as a synonym for--show-current-patch
am: convert "resume" variable to a struct
parse-options: convert "command mode" to a flag
parse-options: add testcases for OPT_CMDMODE()
stash push: support the --pathspec-from-file option
...
Often novice Git users forget to say "pull --rebase" and end up with an
unnecessary merge from upstream. What they usually want is either "pull
--rebase" in the simpler cases, or "pull --ff-only" to update the copy
of main integration branches, and rebase their work separately. The
pull.rebase configuration variable exists to help them in the simpler
cases, but there is no mechanism to make these users aware of it.
Issue a warning message when no --[no-]rebase option from the command
line and no pull.rebase configuration variable is given. This will
inconvenience those who never want to "pull --rebase", who haven't had
to do anything special, but the cost of the inconvenience is paid only
once per user, which should be a reasonable cost to help a number of new
users.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since 862e730ec1 (commit-slab: introduce slabname##_peek()
function, 2015-05-14) the slabname##_peek() function is documented as:
This function is similar to indegree_at(), but it will return NULL
until a call to indegree_at() was made for the commit.
This, however, is usually not the case. If indegree_at() allocates
memory, then it will do so not only for the single commit it got as
parameter, but it will allocate a whole new, ~512kB slab. Later on,
if any other commit's 'index' field happens to point into an already
allocated slab, then indegree_peek() for such a commit will return a
valid non-NULL pointer, pointing to a zero-initialized location in the
slab, even if no indegree_at() call has been made for that commit yet.
Update slabname##_peek()'s documentation to clarify this.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Together with the previous commits, this commit fully fixes the problem
of using shared buffer for `real_path()` in `get_superproject_working_tree()`.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Returning a shared buffer invites very subtle bugs due to reentrancy or
multi-threading, as demonstrated by the previous patch.
There was an unfinished effort to abolish this [1].
Let's finally rid of `real_path()`, using `strbuf_realpath()` instead.
This patch uses a local `strbuf` for most places where `real_path()` was
previously called.
However, two places return the value of `real_path()` to the caller. For
them, a `static` local `strbuf` was added, effectively pushing the
problem one level higher:
read_gitfile_gently()
get_superproject_working_tree()
[1] https://lore.kernel.org/git/1480964316-99305-1-git-send-email-bmwill@google.com/
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python2 reached end of life, and we have been preparing our Python
scripts to work with Python3. 'git p4', the main in-tree user of
Python, has just received a number of compatibility updates. Our
other notable Python script 'contrib/svn-fe/svnrdump_sim.py' is only
used in 't9020-remote-svn.sh', and is apparently already compatible
with both Python2 and 3.
Our CI jobs currently only use Python2. We want to make sure that
these Python scripts do indeed work with Python3, and we also want to
make sure that these scripts keep working with Python2 as well, for
the sake of some older LTS/Enterprise setups.
Therefore, pick two jobs and use Python3 there, while leaving other
jobs to still stick to Python2 for now.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch" over HTTP walker protocol did not show any progress
output. We inherently do not know how much work remains, but still
we can show something not to bore users.
* rs/show-progress-in-dumb-http-fetch:
remote-curl: show progress for fetches over dumb HTTP
"git show" and others gave an object name in raw format in its
error output, which has been corrected to give it in hex.
* hd/show-one-mergetag-fix:
show_one_mergetag: print non-parent in hex form.
Recently we inadvertently added a few instances of using 0-width
format string to functions that we mark as printf-like without any
developers noticing. The root cause was that the compiler warning
that is triggered by this is almost always useless and we disabled
the warning in our developer builds, but not for general public.
The new instances have been corrected, and the warning has been
resurrected in the developer builds.
* rt/format-zero-length-fix:
config.mak.dev: re-enable -Wformat-zero-length
rebase-interactive.c: silence format-zero-length warnings
Test cleanup.
* en/test-cleanup:
t6020: new test with interleaved lexicographic ordering of directories
t6022, t6046: test expected behavior instead of testing a proxy for it
t3035: prefer test_must_fail to bash negation for git commands
t6020, t6022, t6035: update merge tests to use test helper functions
t602[1236], t6034: modernize test formatting
Handling of conflicting renames in merge-recursive have further
been made consistent with how existing codepaths try to mimic what
is done to add/add conflicts.
* en/merge-path-collision:
merge-recursive: apply collision handling unification to recursive case
"git am --short-current-patch" is a way to show the piece of e-mail
for the stopped step, which is not suitable to directly feed "git
apply" (it is designed to be a good "git am" input). It learned a
new option to show only the patch part.
* pb/am-show-current-patch:
am: support --show-current-patch=diff to retrieve .git/rebase-apply/patch
am: support --show-current-patch=raw as a synonym for--show-current-patch
am: convert "resume" variable to a struct
parse-options: convert "command mode" to a flag
parse-options: add testcases for OPT_CMDMODE()
We grep for "File exists" in stderr of the failing `git sparse-checkout`
to make sure that it failed for the right reason. We expect the string
to show up there since we call `strerror(errno)` in
`unable_to_lock_message()` in lockfile.c.
On the NonStop platform, this fails because the error string is "File
already exists", which doesn't match our grepping.
See 9042140097 ("test-dir-iterator: do not assume errno values",
2019-07-30) for a somewhat similar fix. There, we patched a test helper,
which meant we had access to `errno` and could investigate it better in
the test helper instead of just outputting the numerical value and
evaluating it in the test script. The current situation is different,
since (short of modifying the lockfile machinery, e.g., to be more
verbose) we don't have more than the output from `strerror()` available.
Except we do: We prefix `strerror(errno)` with `_("Unable to create
'%s.lock': ")`. Let's grep for that part instead. It verifies that we
were indeed unable to create the lock file. (If that fails for some
other reason than the file existing, we really really should expect
other tests to fail as well.)
An alternative fix would be to loosen the expression a bit and grep for
"File.* exists" instead. There would be no guarantee that some other
implementation couldn't come up with another error string, That is, that
could be the first move in an endless game of whack-a-mole. Of course,
it could also take us from "99" to "100" percent of the platforms and
we'd never have this problem again. But since we have another way of
addressing this, let's not even try the "loosen it up a bit" strategy.
Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some parts of the workflow described in the document has got a bit
stale with the recent toolchain improvements. Update the procedure
a bit, and also describe the convention used around SQUASH??? fixups.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`real_path()` returns result from a shared buffer, inviting subtle
reentrance bugs. One of these bugs occur when invoked this way:
set_git_dir(real_path(git_dir))
In this case, `real_path()` has reentrance:
real_path
read_gitfile_gently
repo_set_gitdir
setup_git_env
set_git_dir_1
set_git_dir
Later, `set_git_dir()` uses its now-dead parameter:
!is_absolute_path(path)
Fix this by using a dedicated `strbuf` to hold `strbuf_realpath()`.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the stash.useBuiltin setting which was added as an escape hatch
to disable the builtin version of stash first released with Git 2.22.
Carrying the legacy version is a maintenance burden, and has in fact
become out of date failing a test since the 2.23 release, without
anyone noticing until now. So users would be getting a hint to fall
back to a potentially buggy version of the tool.
We used to shell out to git config to get the useBuiltin configuration
to avoid changing any global state before spawning legacy-stash.
However that is no longer necessary, so just use the 'git_config'
function to get the setting instead.
Similar to what we've done in d03ebd411c ("rebase: remove the
rebase.useBuiltin setting", 2019-03-18), where we remove the
corresponding setting for rebase, we leave the documentation in place,
so people can refer back to it when searching for it online, and so we
can refer to it in the commit message.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add 4 environment variables that can be used to configure the proxy
cert, proxy ssl key, the proxy cert password protected flag, and the
CA info for the proxy.
Documentation for the options was also updated.
Signed-off-by: Jorge Lopez Silva <jalopezsilva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git supports performing connections to HTTPS proxies, but we don't
support doing mutual authentication with them (through TLS).
Add the necessary options to be able to send a client certificate to
the HTTPS proxy.
A client certificate can provide an alternative way of authentication
instead of using 'ProxyAuthorization' or other more common methods of
authentication. Libcurl supports this functionality already, so changes
are somewhat minimal. The feature is guarded by the first available
libcurl version that supports these options.
4 configuration options are added and documented, cert, key, cert
password protected and CA info. The CA info should be used to specify a
different CA path to validate the HTTPS proxy cert.
Signed-off-by: Jorge Lopez Silva <jalopezsilva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We recently switched to using Perl instead of `sed` in the httpd-based
tests. Let's reflect that in the label we give the corresponding commit
hashes.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git merge signed-tag" while lacking the public key started to say
"No signature", which was utterly wrong. This regression has been
reverted.
* hi/gpg-use-check-signature:
Revert "gpg-interface: prefer check_signature() for GPG verification"
Updates to the CI settings.
* js/ci-windows-update:
Azure Pipeline: switch to the latest agent pools
ci: prevent `perforce` from being quarantined
t/lib-httpd: avoid using macOS' sed
"git describe" in a repository with multiple root commits sometimes
gave up looking for the best tag to describe a given commit with
too early, which has been adjusted.
* be/describe-multiroot:
describe: don't abort too early when searching tags
"git clone --recurse-submodules --single-branch" now uses the same
single-branch option when cloning the submodules.
* es/recursive-single-branch-clone:
clone: pass --single-branch during --recurse-submodules
submodule--helper: use C99 named initializer
Code cleanup to use "struct object_id" more by replacing use of
"char *sha1"
* jk/nth-packed-object-id:
packfile: drop nth_packed_object_sha1()
packed_object_info(): use object_id internally for delta base
packed_object_info(): use object_id for returning delta base
pack-check: push oid lookup into loop
pack-check: convert "internal error" die to a BUG()
pack-bitmap: use object_id when loading on-disk bitmaps
pack-objects: use object_id struct in pack-reuse code
pack-objects: convert oe_set_delta_ext() to use object_id
pack-objects: read delta base oid into object_id struct
nth_packed_object_oid(): use customary integer return
"git rebase BASE BRANCH" rebased/updated the tip of BRANCH and
checked it out, even when the BRANCH is checked out in a different
worktree. This has been corrected.
* es/do-not-let-rebase-switch-to-protected-branch:
rebase: refuse to switch to branch already checked out elsewhere
t3400: make test clean up after itself
"git push" should stop from updating a branch that is checked out
when receive.denyCurrentBranch configuration is set, but it failed
to pay attention to checkouts in secondary worktrees. This has
been corrected.
* hv/receive-denycurrent-everywhere:
t2402: test worktree path when called in .git directory
receive.denyCurrentBranch: respect all worktrees
t5509: use a bare repository for test push target
get_main_worktree(): allow it to be called in the Git directory
In rare cases "git worktree add <path>" could think that <path>
was already a registered worktree even when it wasn't and refuse
to add the new worktree. This has been corrected.
* es/worktree-avoid-duplication-fix:
worktree: don't allow "add" validation to be fooled by suffix matching
worktree: add utility to find worktree by pathname
worktree: improve find_worktree() documentation
A configuration element used for credential subsystem can now use
wildcard pattern to specify for which set of URLs the entry
applies.
* bc/wildcard-credential:
credential: allow wildcard patterns when matching config
credential: use the last matching username in the config
t0300: add tests for some additional cases
t1300: add test for urlmatch with multiple wildcards
mailmap: add an additional email address for brian m. carlson
Underlying machinery of "git bisect--helper" is being refactored
into pieces that are more easily reused.
* mr/bisect-in-c-1:
bisect: libify `bisect_next_all`
bisect: libify `handle_bad_merge_base` and its dependents
bisect: libify `check_good_are_ancestors_of_bad` and its dependents
bisect: libify `check_merge_bases` and its dependents
bisect: libify `bisect_checkout`
bisect: libify `exit_if_skipped_commits` to `error_if_skipped*` and its dependents
bisect--helper: return error codes from `cmd_bisect__helper()`
bisect: add enum to represent bisect returning codes
bisect--helper: introduce new `decide_next()` function
bisect: use the standard 'if (!var)' way to check for 0
bisect--helper: change `retval` to `res`
bisect--helper: convert `vocab_*` char pointers to char arrays
"git sparse-checkout" learned a new "add" subcommand.
* ds/sparse-add:
sparse-checkout: allow one-character directories in cone mode
sparse-checkout: work with Windows paths
sparse-checkout: create 'add' subcommand
sparse-checkout: extract pattern update from 'set' subcommand
sparse-checkout: extract add_patterns_from_input()
change the advise call in tag library from advise() to
advise_if_enabled() to construct an example of the usage of
the new API.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently it's very easy for the advice library's callers to miss
checking the visibility step before printing an advice. Also, it makes
more sense for this step to be handled by the advice library.
Add a new advise_if_enabled function that checks the visibility of
advice messages before printing.
Add a new helper advise_enabled to check the visibility of the advice
if the caller needs to carry out complicated processing based on that
value.
A list of advice_settings is added to cache the config variables names
and values, it's intended to replace advice_config[] and the global
variables once we migrate all the callers to use the new APIs.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bug which reports an extra `/.git/.` in worktree path when called in
'.git' directory already has been fixed. But unfortunately, the regression
test to ensure this behavior has been forgotten.
Here is that test.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 9700fae5ee (for-each-ref: let upstream/push report the remote
ref name, 2017-11-07) added a remote_ref_for_branch() helper, which
is modeled after remote_for_branch(). This includes providing an
"explicit" out-parameter that tells the caller whether the remote
was configured by the user, or whether we picked a default name like
"origin".
But unlike remote names, there is no default name when the user
didn't configure one. The only way the "explicit" parameter is used
by the caller is to use the value returned from the helper when it
is set, and use an empty string otherwise, ignoring the returned
value from the helper.
Let's drop the "explicit" out-parameter, and return NULL when the
returned value from the helper should be ignored, to simplify the
function interface.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next commit we're adding another config variable to be read
from 'git_stash_config', that is valid for the top level command
instead of just a subset. Move the 'git_config' invocation for
'git_stash_config' to the top-level to prepare for that.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fetching over dumb HTTP transport doesn't show any progress, even with
the option --progress. If the connection is slow or there is a lot of
data to get then this can take a long time while the user is left to
wonder if git got stuck.
We don't know the number of objects to fetch at the outset, but we can
count the ones we got. Show an open-ended progress indicator based on
that number if the user asked for it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix for a bug revealed by a recent change to make the protocol v2
the default.
* ds/partial-clone-fixes:
partial-clone: avoid fetching when looking for objects
partial-clone: demonstrate bugs in partial fetch
The merge-recursive machinery failed to refresh the cache entry for
a merge result in a couple of places, resulting in an unnecessary
merge failure, which has been fixed.
* en/t3433-rebase-stat-dirty-failure:
merge-recursive: fix the refresh logic in update_file_flags
t3433: new rebase testcase documenting a stat-dirty-like failure
"git rebase" has learned to use the merge backend (i.e. the
machinery that drives "rebase -i") by default, while allowing
"--apply" option to use the "apply" backend (e.g. the moral
equivalent of "format-patch piped to am"). The rebase.backend
configuration variable can be set to customize.
* en/rebase-backend:
rebase: rename the two primary rebase backends
rebase: change the default backend from "am" to "merge"
rebase: make the backend configurable via config setting
rebase tests: repeat some tests using the merge backend instead of am
rebase tests: mark tests specific to the am-backend with --am
rebase: drop '-i' from the reflog for interactive-based rebases
git-prompt: change the prompt for interactive-based rebases
rebase: add an --am option
rebase: move incompatibility checks between backend options a bit earlier
git-rebase.txt: add more details about behavioral differences of backends
rebase: allow more types of rebases to fast-forward
t3432: make these tests work with either am or merge backends
rebase: fix handling of restrict_revision
rebase: make sure to pass along the quiet flag to the sequencer
rebase, sequencer: remove the broken GIT_QUIET handling
t3406: simplify an already simple test
rebase (interactive-backend): fix handling of commits that become empty
rebase (interactive-backend): make --keep-empty the default
t3404: directly test the behavior of interest
git-rebase.txt: update description of --allow-empty-message
"git check-ignore" did not work when the given path is explicitly
marked as not ignored with a negative entry in the .gitignore file.
* en/check-ignore:
check-ignore: fix documentation and implementation to match
The object reachability bitmap machinery and the partial cloning
machinery were not prepared to work well together, because some
object-filtering criteria that partial clones use inherently rely
on object traversal, but the bitmap machinery is an optimization
to bypass that object traversal. There however are some cases
where they can work together, and they were taught about them.
* jk/object-filter-with-bitmap:
rev-list --count: comment on the use of count_right++
pack-objects: support filters with bitmaps
pack-bitmap: implement BLOB_LIMIT filtering
pack-bitmap: implement BLOB_NONE filtering
bitmap: add bitmap_unset() function
rev-list: use bitmap filters for traversal
pack-bitmap: basic noop bitmap filter infrastructure
rev-list: allow commit-only bitmap traversals
t5310: factor out bitmap traversal comparison
rev-list: allow bitmaps when counting objects
rev-list: make --count work with --objects
rev-list: factor out bitmap-optimized routines
pack-bitmap: refuse to do a bitmap traversal with pathspecs
rev-list: fallback to non-bitmap traversal when filtering
pack-bitmap: fix leak of haves/wants object lists
pack-bitmap: factor out type iterator initialization
fb6fbffbda (advice: keep config name in camelCase in advice_config[],
2018-05-26) changed the config names to camelCase, but one of the names
wasn't changed correctly. Fix it.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for a new advice method, extract a version of advise()
that uses an explict 'va_list' parameter. Call it from advise() for a
functionally equivalent version.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a mergetag names a non-parent, which can occur after a shallow
clone, its hash was previously printed as raw data. Print it in hex form
instead.
Signed-off-by: Harald van Dijk <harald@gigawatt.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d9c6469 (git-gui: update status bar to track operations, 2019-12-01)
the call to 'ui_status' in 'do_gitk' was updated to create the newly
introduced "status bar operation". This allowed this status text to show
along with other operations happening in parallel, and removed a race
between all these operations.
But in that refactor, the fact that 'ui_status' checks for the existence
of 'main_status' was overlooked. This leads to an error message popping
up when the user selects "Show History Context" from the blame window
context menu on a source line. The error occurs because when running
"blame" 'main_status' is not initialized.
So, add a check for the existence of 'main_status' in 'do_gitk'. This
fix reverts to the original behaviour. In the future, we might want to
look into a better way of telling 'do_gitk' which status bar to use.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
When converting a repository using submodules from one hash algorithm to
another, it is necessary to rewrite the submodules from the old
algorithm to the new algorithm, since only references to submodules, not
their contents, are written to the fast-export stream. Without rewriting
the submodules, fast-import fails with an "Invalid dataref" error when
encountering a submodule in another algorithm.
Add a pair of options, --rewrite-submodules-from and
--rewrite-submodules-to, that take a list of marks produced by
fast-export and fast-import, respectively, when processing the
submodule. Use these marks to map the submodule commits from the old
algorithm to the new algorithm.
We read marks into two corresponding struct mark_set objects and then
perform a mapping from the old to the new using a hash table. This lets
us reuse the same mark parsing code that is used elsewhere and allows us
to efficiently read and match marks based on their ID, since mark files
need not be sorted.
Note that because we're using a khash table for the object IDs, and this
table copies values of struct object_id instead of taking references to
them, it's necessary to zero the struct object_id values that we use to
insert and look up in the table. Otherwise, we would end up with SHA-1
values that don't match because of whatever stack garbage might be left
in the unused area.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, we can iterate over marks only to dump them to a file. In the
future, we'll want to perform an arbitrary operation over the items of a
mark set. Add a function, for_each_mark, that iterates over marks in a
set and performs an arbitrary callback function for each mark. Switch
the mark dumping routine to use this function now that it's available.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we'll use multiple different mark sets with this
function, so make it take an argument that points to the mark set to
operate on.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, everything we want to insert into a mark set is an object
entry. However, in the future, we will want to insert objects of other
types. Teach read_mark_file to take a function pointer which helps us
insert the object we want into our mark set.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we'll want to read marks files for submodules as well.
Refactor the existing code to make it possible to read multiple marks
files, each into their own marks set.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 72b006f4bf, which
breaks the end-user experience when merging a signed tag without
having the public key. We should report "can't check because we
have no public key", but the code with this change claimed that
there was no signature.
We recently triggered some -Wformat-zero-length warnings in the code,
but no developers noticed because we suppress that warning in builds
with the DEVELOPER=1 Makefile knob set. But we _don't_ suppress them in
a non-developer build (and they're part of -Wall). So even though
non-developers probably aren't using -Werror, they see the annoying
warnings when they build.
We've had back and forth discussion over the years on whether this
warning is useful or not. In most cases we've seen, it's not true that
the call is a mistake, since we're using its side effects (like adding a
newline status_printf_ln()) or writing an empty string to a destination
which is handled by the function (as in write_file()). And so we end up
working around it in the source by passing ("%s", "").
There's more discussion in the subthread starting at:
https://lore.kernel.org/git/xmqqtwaod7ly.fsf@gitster.mtv.corp.google.com/
The short of it is that we probably can't just disable the warning for
everybody because of portability issues. And ignoring it for developers
puts us in the situation we're in now, where non-dev builds are annoyed.
Since the workaround is both rarely needed and fairly straight-forward,
let's just commit to doing it as necessary, and re-enable the warning.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fixes the following warnings:
rebase-interactive.c: In function ‘edit_todo_list’:
rebase-interactive.c:137:38: warning: zero-length gnu_printf format string [-Wformat-zero-length]
write_file(rebase_path_dropped(), "");
rebase-interactive.c:144:37: warning: zero-length gnu_printf format string [-Wformat-zero-length]
write_file(rebase_path_dropped(), "");
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Explanation
-----------
The problem here is flawed `poll()` implementation. When it tries to
see if pipe can be written without blocking, it eventually calls
`NtQueryInformationFile()` and tests `WriteQuotaAvailable`. However,
the meaning of quota was misunderstood. The value of quota is reduced
when either some data was written to a pipe, *or* there is a pending
read on the pipe. Therefore, if there is a pending read of size >= than
the pipe's buffer size, poll() will think that pipe is not writable and
will hang forever, usually that means deadlocking both pipe users.
I have studied the problem and found that Windows pipes track two values:
`QuotaUsed` and `BytesInQueue`. The code in `poll()` apparently wants to
know `BytesInQueue` instead of quota. Unfortunately, `BytesInQueue` can
only be requested from read end of the pipe, while `poll()` receives
write end.
The git's implementation of `poll()` was copied from gnulib, which also
contains a flawed implementation up to today.
I also had a look at implementation in cygwin, which is also broken in a
subtle way. It uses this code in `pipe_data_available()`:
fpli.WriteQuotaAvailable = (fpli.OutboundQuota - fpli.ReadDataAvailable)
However, `ReadDataAvailable` always returns 0 for the write end of the pipe,
turning the code into an obfuscated version of returning pipe's total
buffer size, which I guess will in turn have `poll()` always say that pipe
is writable. The commit that introduced the code doesn't say anything about
this change, so it could be some debugging code that slipped in.
These are the typical sizes used in git:
0x2000 - default read size in `strbuf_read()`
0x1000 - default read size in CRT, used by `strbuf_getwholeline()`
0x2000 - pipe buffer size in compat\mingw.c
As a consequence, as soon as child process uses `strbuf_read()`,
`poll()` in parent process will hang forever, deadlocking both
processes.
This results in two observable behaviors:
1) If parent process begins sending STDIN quickly (and usually that's
the case), then first `poll()` will succeed and first block will go
through. MAX_IO_SIZE_DEFAULT is 8MB, so if STDIN exceeds 8MB, then
it will deadlock.
2) If parent process waits a little bit for any reason (including OS
scheduler) and child is first to issue `strbuf_read()`, then it will
deadlock immediately even on small STDINs.
The problem is illustrated by `git stash push`, which will currently
read the entire patch into memory and then send it to `git apply` via
STDIN. If patch exceeds 8MB, git hangs on Windows.
Possible solutions
------------------
1) Somehow obtain `BytesInQueue` instead of `QuotaUsed`
I did a pretty thorough search and didn't find any ways to obtain
the value from write end of the pipe.
2) Also give read end of the pipe to `poll()`
That can be done, but it will probably invite some dirty code,
because `poll()`
* can accept multiple pipes at once
* can accept things that are not pipes
* is expected to have a well known signature.
3) Make `poll()` always reply "writable" for write end of the pipe
Afterall it seems that cygwin (accidentally?) does that for years.
Also, it should be noted that `pump_io_round()` writes 8MB blocks,
completely ignoring the fact that pipe's buffer size is only 8KB,
which means that pipe gets clogged many times during that single
write. This may invite a deadlock, if child's STDERR/STDOUT gets
clogged while it's trying to deal with 8MB of STDIN. Such deadlocks
could be defeated with writing less than pipe's buffer size per
round, and always reading everything from STDOUT/STDERR before
starting next round. Therefore, making `poll()` always reply
"writable" shouldn't cause any new issues or block any future
solutions.
4) Increase the size of the pipe's buffer
The difference between `BytesInQueue` and `QuotaUsed` is the size
of pending reads. Therefore, if buffer is bigger than size of reads,
`poll()` won't hang so easily. However, I found that for example
`strbuf_read()` will get more and more hungry as it reads large inputs,
eventually surpassing any reasonable pipe buffer size.
Chosen solution
---------------
Make `poll()` always reply "writable" for write end of the pipe.
Hopefully one day someone will find a way to implement it properly.
Reproduction
------------
printf "%8388608s" X >large_file.txt
git stash push --include-untracked -- large_file.txt
I have decided not to include this as test to avoid slowing down the
test suite. I don't expect the specific problem to come back, and
chances are that `git stash push` will be reworked to avoid sending the
entire patch via STDIN.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We seem to be getting new users who get confused every 20 months or
so with this "-h consistently wants to give help, but the commands
to which `-h` may feel like a good short-form option want it to mean
something else." compromise.
Let's make sure that the readers know that `git cmd -h` (with no
other arguments) is a way to get usage text, even for commands like
ls-remote and grep.
Also extend the description that is already in gitcli.txt, as it is
clear that users still get confused with the current text.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a repository has two files:
foo/bar/baz
foo/bar-2/baz
then a simple lexicographic ordering of files and directories shows
...
foo/bar
foo/bar-2
foo/bar/baz
...
and the appearance of foo/bar-2 between foo/bar and foo/bar/baz can trip
up some codepaths. Add a test to catch such cases.
t6020 might be a slight misfit since this testcase does not test any
kind of file/directory conflict. However, it is similar in spirit to
some tests (4-6) already in t6020 that check cases where a *file* sorted
between a directory and the files underneath that directory. This
testcase differs in that now there is a *directory* that sorts in the
middle.
Although merge-recursive currently has no problems with this simple
testcase, I discovered that it's very possible to accidentally mess it
up. Further, we have no other merge or cherry-pick or rebase testcases
in the entire testsuite that cover such a case, so I felt like it would
be a worthwhile addition to the testsuite.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t6022, we were testing for file being overwritten (or not) based on
an output message instead of checking for the file being overwritten.
Since we can check for the file being overwritten via mtime updates,
check that instead.
In t6046, we were largely checking for both the expected behavior and a
proxy for it, which is unnecessary. The calls to test-tool also were a
bit cryptic. Make them a little clearer.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make use of test_path_is_file, test_write_lines, and similar helpers
in these old test files.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Indent code, and include it inside test_expect* blocks.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the en/merge-path-collision topic (see commit ac193e0e0a, "Merge
branch 'en/merge-path-collision'", 2019-01-04), all the "file collision"
conflict types were modified for consistency. In particular,
rename/add, rename/rename(2to1) and each rename/add piece of a
rename/rename(1to2)/add[/add] conflict were made to behave like add/add
conflicts have always been handled.
However, this consistency was not enforced when opt->priv->call_depth >
0 for rename/rename conflicts. Update rename/rename(1to2) and
rename/rename(2to1) conflicts in the recursive case to also be
consistent. As an added bonus, this simplifies the code considerably.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The most recent Azure Pipelines macOS agents enable what Apple calls
"System Integrity Protection". This makes `p4d -V` hang: there is some
sort of GUI dialog waiting for the user to acknowledge that the copied
binaries are legit and may be executed, but on build agents, there is no
user who could acknowledge that.
Let's ask Homebrew specifically to _not_ quarantine the Perforce
binaries.
Helped-by: Aleksandr Chebotov
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Among other differences relative to GNU sed, macOS' sed always ends its
output with a trailing newline, even if the input did not have such a
trailing newline.
Surprisingly, this makes three httpd-based tests fail on macOS: t5616,
t5702 and t5703. ("Surprisingly" because those tests have been around
for some time, but apparently nobody runs them on macOS with a working
Apache2 setup.)
The reason is that we use `sed` in those tests to filter the response of
the web server. Apart from the fact that we use GNU constructs (such as
using a space after the `c` command instead of a backslash and a
newline), we have another problem: macOS' sed LF-only newlines while
webservers are supposed to use CR/LF ones.
Even worse, t5616 uses `sed` to replace a binary part of the response
with a new binary part (kind of hoping that the replaced binary part
does not contain a 0x0a byte which would be interpreted as a newline).
To that end, it calls on Perl to read the binary pack file and
hex-encode it, then calls on `sed` to prefix every hex digit pair with a
`\x` in order to construct the text that the `c` statement of the `sed`
invocation is supposed to insert. So we call Perl and sed to construct a
sed statement. The final nail in the coffin is that macOS' sed does not
even interpret those `\x<hex>` constructs.
Let's just replace all of that by Perl snippets. With Perl, at least, we
do not have to deal with GNU vs macOS semantics, we do not have to worry
about unwanted trailing newlines, and we do not have to spawn commands
to construct arguments for other commands to be spawned (i.e. we can
avoid a whole lot of shell scripting complexity).
The upshot is that this fixes t5616, t5702 and t5703 on macOS with
Apache2.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge_commit_graphs() copies the (translated) progress message into a
strbuf and passes the copy to start_delayed_progress() at each loop
iteration. The latter function takes a string pointer, so let's avoid
the detour and hand the string to it directly. That's shorter, simpler
and slightly more efficient.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When searching the commit graph for tag candidates, `git-describe`
will stop as soon as there is only one active branch left and
it already found an annotated tag as a candidate.
This works well as long as all branches eventually connect back
to a common root, but if the tags are found across branches
with no common ancestor
B
o----.
\
o-----o---o----x
A
it can happen that the search on one branch terminates prematurely
because a tag was found on another, independent branch. This scenario
isn't quite as obscure as it sounds, since cloning with a limited
depth often introduces many independent "dead ends" into the commit
graph.
The help text of `git-describe` states pretty clearly that when
describing a commit D, the number appended to the emitted tag X should
correspond to the number of commits found by `git log X..D`.
Thus, this commit modifies the stopping condition to only abort
the search when only one branch is left to search *and* all current
best candidates are descendants from that branch.
For repositories with a single root, this condition is always
true: When the search is reduced to a single active branch, the
current commit must be an ancestor of *all* tag candidates. This
means that in the common case, this change will have no negative
performance impact since the same number of commits as before will
be traversed.
Signed-off-by: Benno Evers <benno@bmevers.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `options.switch_to' is set, `options.orig_head' is populated right
after with the object name the ref/commit argument points at.
Therefore, there is no need to parse `switch_to' again.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git remote rename X Y" needs to adjust configuration variables
(e.g. branch.<name>.remote) whose value used to be X to Y.
branch.<name>.pushRemote is now also updated.
* bw/remote-rename-update-config:
remote rename/remove: gently handle remote.pushDefault config
config: provide access to the current line number
remote rename/remove: handle branch.<name>.pushRemote config values
remote: clean-up config callback
remote: clean-up by returning early to avoid one indentation
pull --rebase/remote rename: document and honor single-letter abbreviations rebase types
Previously, performing "git clone --recurse-submodules --single-branch"
resulted in submodules cloning all branches even though the superproject
cloned only one branch. Pipe --single-branch through the submodule
helper framework to make it to 'clone' later on.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start using a named initializer list for SUBMODULE_UPDATE_CLONE_INIT, as
the struct is becoming cumbersome for a typical struct initializer list.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Log graph comparision logic is duplicated many times in:
- t3430-rebase-merges.sh
- t4202-log.sh
- t4214-log-graph-octopus.sh
- t4215-log-skewed-merges.sh
Consolidate the core of the comparision and sanitization logic in
lib-log-graph, and use it to replace the existing tests.
While at it, lose the singular/plural transition magic from the
sanitize_output helper, which was necessary around 7f814632 ("Use
correct grammar in diffstat summary line", 2012-02-01), that has
long outlived its usefulness.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add <path>" performs various checks before approving
<path> as a valid location for the new worktree. Aside from ensuring
that <path> does not already exist, one of the questions it asks is
whether <path> is already a registered worktree. To perform this check,
it queries find_worktree() and disallows the "add" operation if
find_worktree() finds a match for <path>. As a convenience, however,
find_worktree() casts an overly wide net to allow users to identify
worktrees by shorthand in order to keep typing to a minimum. For
instance, it performs suffix matching which, given subtrees "foo/bar"
and "foo/baz", can correctly select the latter when asked only for
"baz".
"add" validation knows the exact path it is interrogating, so this sort
of heuristic-based matching is, at best, questionable for this use-case
and, at worst, may may accidentally interpret <path> as matching an
existing worktree and incorrectly report it as already registered even
when it isn't. (In fact, validate_worktree_add() already contains a
special case to avoid accidentally matching against the main worktree,
precisely due to this problem.)
Avoid the problem of potential accidental matching against an existing
worktree by instead taking advantage of find_worktree_by_path() which
matches paths deterministically, without applying any sort of magic
shorthand matching performed by find_worktree().
Reported-by: Cameron Gunnin <cameron.gunnin@synopsys.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
find_worktree() employs heuristics to match user provided input -- which
may be a pathname or some sort of shorthand -- with an actual worktree.
Although this convenience allows a user to identify a worktree with
minimal typing, the black-box nature of these heuristics makes it
potentially difficult for callers which already know the exact path of a
worktree to be confident that the correct worktree will be returned for
any specific pathname (particularly a relative one), especially as the
heuristics are enhanced and updated.
Therefore, add a companion function, find_worktree_by_path(), which
deterministically identifies a worktree strictly by pathname with no
interpretation and no magic matching.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Do a better job of explaining that find_worktree()'s main purpose is to
locate a worktree based upon input from a user which may be some sort of
shorthand for identifying a worktree rather than an actual path. For
instance, one shorthand a user can use to identify a worktree is by
unique path suffix (i.e. given worktrees at paths "foo/bar" and
"foo/baz", the latter can be identified simply as "baz"). The actual
heuristics find_worktree() uses to select a worktree may be expanded in
the future (for instance, one day it may allow worktree selection by
<id> of the .git/worktrees/<id>/ administrative directory), thus the
documentation does not provide a precise description of how matching is
performed, instead leaving it open-ended to allow for future
enhancement.
While at it, drop mention of the non-NULL requirement of `prefix` since
NULL has long been allowed. For instance, prefix_filename() has
explicitly allowed NULL since 116fb64e43 (prefix_filename: drop length
parameter, 2017-03-20), and find_worktree() itself since e4da43b1f0
(prefix_filename: return newly allocated string, 2017-03-20).
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time, nth_packed_object_sha1() was the primary way to get
the oid of a packfile's index position. But these days we have the more
type-safe nth_packed_object_id() wrapper, and all callers have been
converted.
Let's drop the "sha1" version (turning the safer wrapper into a single
function) so that nobody is tempted to introduce new callers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit changed the public interface of packed_object_info()
to return a struct object_id rather than a bare hash. That enables us to
convert our internal helper, as well. We can use nth_packed_object_id()
directly for OFS_DELTA, but we'll still have to use oidread() to pull
the hash for a REF_DELTA out of the packfile.
There should be no additional cost, since we're copying directly into
the object_id the caller provided us (just as we did before; it's just
happening now via nth_packed_object_id()).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a caller sets the object_info.delta_base_sha1 to a non-NULL pointer,
we'll write the oid of the object's delta base to it. But we can
increase our type safety by switching this to a real object_id struct.
All of our callers are just pointing into the hash member of an
object_id anyway, so there's no inconvenience.
Note that we do still keep it as a pointer-to-struct, because the NULL
sentinel value tells us whether the caller is even interested in the
information.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're checking a pack with fsck or verify-pack, we first sort the
idx entries by offset, since accessing them in pack order is more
efficient. To do so, we loop over them and fill in an array of structs
with the offset, object_id, and index position of each, sort the result,
and only then do we iterate over the sorted array and process each
entry.
In order to avoid the memory cost of storing the hash of each object, we
just store a pointer into the copy in the mmap'd pack index file. To
keep that property even as the rest of the code converted to "struct
object_id", commit 9fd750461b (Convert the verify_pack callback to
struct object_id, 2017-05-06) introduced a union in order to type-pun
the pointer-to-hash into an object_id struct.
But we can make this even simpler by observing that the sort operation
doesn't need the object id at all! We only need them one at a time while
we actually process each entry. So we can just omit the oid from the
struct entirely and load it on the fly into a local variable in the
second loop.
This gets rid of the type-punning, and lets us directly use the more
type-safe nth_packed_object_id(), simplifying the code. And as a bonus,
it saves 8 bytes of memory per object.
Note that this does mean we'll do the offset lookup for each object
before the oid lookup. The oid lookup has more safety checks in it
(e.g., for looking past p->num_objects) which in theory protected the
offset lookup. But since violating those checks was already a BUG()
condition (as described in the previous commit), it's not worth worrying
about.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we fail to load the oid from the index of a packfile, we'll die()
with an "internal error". But this should never happen: we'd fail here
only if the idx needed to be lazily opened (but we've already opened it)
or if we asked for an out-of-range index (but we're iterating using the
same count that we'd check the range against). A corrupted index might
have a bogus count (e.g., too large for its size), but we'd have
complained and aborted already when opening the index initially.
While we're here, we can add a few details so that if this bug ever
_does_ trigger, we'll have a bit more information.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A pack bitmap file contains the index position of the commit for each
bitmap, which we then translate into an object id via
nth_packed_object_sha1(). In preparation for that function going away,
we can switch to the more type-safe nth_packed_object_id().
Note that even though the result ends up in an object_id this does incur
an extra copy of the hash (into our temporary object_id, and then into
the final malloc'd stored_bitmap struct). This shouldn't make any
measurable difference. If it did, we could avoid this copy _and_ the
copy of the rest of the items by allocating the stored_bitmap struct
beforehand and reading directly into it from the bitmap file. Or better
still, if this is a bottleneck, we could introduce an on-disk index to
the bitmap file so we don't have to read every single entry to use just
one of them. So it's not worth worrying about micro-optimizing out this
one hash copy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the pack-reuse code is dumping an OFS_DELTA entry to a client that
doesn't support it, we re-write it as a REF_DELTA. To do so, we use
nth_packed_object_sha1() to get the oid, but that function is soon going
away in favor of the more type-safe nth_packed_object_id(). Let's switch
now in preparation.
Note that this does incur an extra hash copy (from the pack idx mmap to
the object_id and then to the output, rather than straight from mmap to
the output). But this is not worth worrying about. It's probably not
measurable even when it triggers, and this is fallback code that we
expect to trigger very rarely (since everybody supports OFS_DELTA these
days anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already store an object_id internally, and now our sole caller also
has one. Let's stop passing around the internal hash array, which adds a
bit of type safety.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're considering reusing an on-disk delta, we get the oid of the
base as a pointer to unsigned char bytes of the hash, either into the
packfile itself (for REF_DELTA) or into the pack idx (using the revindex
to convert the offset into an index entry).
Instead, we'd prefer to use a more type-safe object_id as much as
possible. We can get the pack idx using nth_packed_object_id() instead.
For the packfile bytes, we can copy them out using oidread().
This doesn't even incur an extra copy overall, since the next thing we'd
always do with that pointer is pass it to can_reuse_delta(), which needs
an object_id anyway (and called oidread() itself). So this patch also
converts that function to take the object_id directly.
Note that we did previously use NULL as a sentinel value when the object
isn't a delta. We could probably get away with using the null oid for
this, but instead we'll use an explicit boolean flag, which should make
things more obvious for people reading the code later.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our nth_packed_object_sha1() function returns NULL for error. So when we
wrapped it with nth_packed_object_oid(), we kept the same semantics. But
it's a bit funny, because the caller actually passes in an out
parameter, and the pointer we return is just that same struct they
passed to us (or NULL).
It's not too terrible, but it does make the interface a little
non-idiomatic. Let's switch to our usual "0 for success, negative for
error" return value. Most callers either don't check it, or are
trivially converted. The one that requires the biggest change is
actually improved, as we can ditch an extra aliased pointer variable.
Since we are changing the interface in a subtle way that the compiler
wouldn't catch, let's also change the name to catch any topics in
flight. We can drop the 'o' and make it nth_packed_object_id(). That's
slightly shorter, but also less redundant since the 'o' stands for
"object" already.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code has been unused since fa099d2322 (worktree.c: kill parse_ref()
in favor of refs_resolve_ref_unsafe(), 2017-04-24), so drop it.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fingerprints member of struct blame_origin is a void pointer that is
only ever used to reference objects of type struct fingerprint. Declare
its type to allow the compiler to do type checks. We can keep its type
opaque in blame.h, though -- only functions in blame.c need to know the
actual definition of struct fingerprint.
Signed-off-by: René Scharfe <l.s.r@web.de>
Reviewed-by: Barret Rhoden <brho@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The invocation "git rebase <upstream> <branch>" switches to <branch>
before performing the rebase operation. However, unlike git-switch,
git-checkout, and git-worktree which all refuse to switch to a branch
that is already checked out in some other worktree, git-rebase switches
to <branch> unconditionally. Curb this careless behavior by making
git-rebase also refuse to switch to a branch checked out elsewhere.
Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test intentionally creates a file which causes rebase to fail, thus
it is important that this file be deleted before subsequent tests are
run which are not expecting such a failure. In the past, the common way
to ensure cleanup (regardless of whether the test succeeded or failed)
was either for the next test to perform the previous test's cleanup as
its first step or to do the cleanup at global scope outside of any
tests. With the introduction of 'test_when_finished', however, tests can
be responsible for their own cleanup. Therefore, update this test to
clean up after itself.
A bit of history: This 'rm' invocation was moved from within the body of
the following test to global scope by bffd750adf (rebase: improve error
message when upstream argument is missing, 2010-05-31), which postdates,
by about a month, introduction of 'test_when_finished' in 3bf7886705
(test-lib: Let tests specify commands to be run at end of test,
2010-05-02).
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We `cat` files, but don't inspect or grab the contents in any way.
Unlike in an earlier commit, there is no reason to suspect that these
files could be missing, so `cat`-ing them is just wasted effort.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We `cat` kwdelfile.c, but don't inspect or grab the contents in any way.
This looks like a remnant from a debug session. Similar to the previous
commit, one could argue that `cat`-ing the file verifies that it didn't
disappear somehow. But because the very next thing we do after `cat`-ing
the file is to `grep` in it, we can safely drop the call to `cat`.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We `cat` files, but don't inspect or grab the contents in any way. These
`cat` calls look like remnants from a debug session, so it's tempting to
get rid of them. But they do actually verify that the files exist, which
might not necessarily be the case for some failure modes of `git apply
--reject`. Let's not lose that.
Convert the `cat` calls to use `test_path_is_file` instead. This is of
course still a minor change since we no longer verify that the files can
be opened for reading, but that is not something we usually worry about.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The receive.denyCurrentBranch config option controls what happens if
you push to a branch that is checked out into a non-bare repository.
By default, it rejects it. It can be disabled via `ignore` or `warn`.
Another yet trickier option is `updateInstead`.
However, this setting was forgotten when the git worktree command was
introduced: only the main worktree's current branch is respected.
With this change, all worktrees are respected.
That change also leads to revealing another bug,
i.e. `receive.denyCurrentBranch = true` was ignored when pushing into a
non-bare repository's unborn current branch using ref namespaces. As
`is_ref_checked_out()` returns 0 which means `receive-pack` does not get
into conditional statement to switch `deny_current_branch` accordingly
(ignore, warn, refuse, unconfigured, updateInstead).
receive.denyCurrentBranch uses the function `refs_resolve_ref_unsafe()`
(called via `resolve_refdup()`) to resolve the symbolic ref HEAD, but
that function fails when HEAD does not point at a valid commit.
As we replace the call to `refs_resolve_ref_unsafe()` with
`find_shared_symref()`, which has no problem finding the worktree for a
given branch even if it is unborn yet, this bug is fixed at the same
time: receive.denyCurrentBranch now also handles worktrees with unborn
branches as intended even while using ref namespaces.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`receive.denyCurrentBranch` currently has a bug where it allows pushing
into non-bare repository using namespaces as long as it does not have any
commits. This would cause t5509 to fail once that bug is fixed because it
pushes into an unborn current branch.
In t5509, no operations are performed inside `pushee`, as it is only a
target for `git push` and `git ls-remote` calls. Therefore it does not
need to have a worktree. So, it is safe to change `pushee` to a bare
repository.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When called in the Git directory of a non-bare repository, this function
would not return the directory of the main worktree, but of the Git
directory instead.
The reason: when the Git directory is the current working directory, the
absolute path of the common directory will be reported with a trailing
`/.git/.`, which the code of `get_main_worktree()` does not handle
correctly.
Let's fix this.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These options are available since git v2.15, but somehow
eluded from the completion script.
Note that while --color-moved-ws= accepts comma-separated
list of values, there is no (easy?) way to make it work
with completion (see e.g. [1]).
[1]: https://github.com/scop/bash-completion/issues/240
Acked-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The transition plan anticipates that we will allow signatures using
multiple algorithms in a single commit. In order to do so, we need to
use a different header per algorithm so that it will be obvious over
which data to compute the signature.
The transition plan specifies that we should use "gpgsig-sha256", so
wire up the commit code such that it can write and parse the current
algorithm, and it can remove the headers for any algorithm when creating
a new commit. Add tests to ensure that we write using the right header
and that git fsck doesn't reject these commits.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git supports both repository versions 0 and 1. These formats are
identical except for the presence of extensions. When using an
extension, such as for a different hash algorithm, a check for only
version 0 causes the check to fail. Instead, call
verify_repository_format to verify that we have an appropriate version
and no unknown extensions.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we perform a clone, we won't know the remote side's hash algorithm
until we've read the heads. Consequently, we'll need to rewrite the
repository format version and hash algorithm once we know what the
remote side has. Move the code that does this into its own function so
that we can call it from clone in the future.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For the foreseeable future, SHA-1 will be the default algorithm for Git.
However, when running the testsuite, we want to be able to test an
arbitrary algorithm. It would be quite burdensome and very untidy to
have to specify the algorithm we'd like to test every time we
initialized a new repository somewhere in the testsuite, so add an
environment variable to allow us to specify the default hash algorithm
for Git.
This has the benefit that we can set it once for the entire testsuite
and not have to think about it. In the future, users can also use it to
set the default for their repositories if they would like to do so.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow the user to specify the hash algorithm on the command line by
using the --object-format option to git init. Validate that the user is
not attempting to reinitialize a repository with a different hash
algorithm. Ensure that if we are writing a non-SHA-1 repository that we
set the repository version to 1 and write the objectFormat extension.
Restrict this option to work only when ENABLE_SHA256 is set until the
codebase is in a situation to fully support this.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases, we will want to not only check the repository format, but
extract the information that we've gained. To do so, allow
check_repository_format to take a pointer to struct repository_format.
Allow passing NULL for this argument if we're not interested in the
information, and pass NULL for all existing callers. A future patch
will make use of this information.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test currently hard-codes the hash algorithm as SHA-1 when calling
repo_set_hash_algo so that the_hash_algo is properly initialized.
However, this does not work with SHA-256 repositories. Read the
repository value that repo_init has read into the local repository
variable and set the algorithm based on that value.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repository helper is used in t5318 to read commit graphs whether
we're in a repository or not. However, without a repository, we have no
way to properly initialize the hash algorithm, meaning that the file is
misread.
Fix this by calling setup_git_directory_gently, which will read the
environment variable the testsuite sets to ensure that the correct hash
algorithm is set.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this test helper, we read the index. In order to have the proper
hash algorithm set up, we must call setup_git_directory. Do so, so that
the test works when extensions.objectFormat is set.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the git for-each-ref tests asks to sort by object ID. However,
when sorted, the order of the refs differs between SHA-1 and SHA-256.
Sort the expected output so that the test passes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we'll allow developers to run the testsuite with a hash
algorithm of their choice. To make this easier, compute the fixed
constants using test_oid. Move the constant initialization down below
the point where test-lib-functions.sh is loaded so the functions are
defined.
Note that we don't provide a value for the OID_REGEX value directly
because writing a large number of instances of "[0-9a-f]" in the
oid-info files is unwieldy and there isn't a way to compute it based on
those values. Instead, compute it based on ZERO_OID.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At this point, SHA-256 support is experimental and some behavior may
change. To avoid surprising unsuspecting users, require a build flag,
ENABLE_SHA256, to allow use of a non-SHA-1 algorithm. Enable this flag
by default when the DEVELOPER make flag is set so that contributors can
test this case adequately.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are some places where we need to parse a hex object ID in any
algorithm without knowing beforehand which algorithm is in use. An
example is when parsing fast-import marks.
Add a get_oid_hex_any to parse an object ID and return the algorithm it
belongs to, and additionally add parse_oid_hex_any which is the
equivalent change for parse_oid_hex. If the object is not parseable, we
return GIT_HASH_UNKNOWN.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce variants of get_oid_hex and parse_oid_hex that parse an
arbitrary hash algorithm, implementing internal functions to avoid
duplication. These functions can be used in the transport code to parse
refs properly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For all of our SHA-1 implementations and most of our SHA-256
implementations, the hash context we use is a real struct. For these
implementations, it's possible to copy a hash context by making a copy
of the struct.
However, for our libgcrypt implementation, our hash context is a
pointer. Consequently, copying it does not lead to an independent hash
context like we intended.
Fortunately, however, libgcrypt provides us with a handy function to
copy hash contexts. Let's add a cloning function to the hash algorithm
API, and use it in the one place we need to make a hash context copy.
With this change, our libgcrypt SHA-256 implementation is fully
functional with all of our other hash implementations.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid hard-coding a hash size, instead preferring to use the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can check if certain characters are present in a string by calling
strchr(3) on each of them, or we can pass them all to a single
strpbrk(3) call. The latter is shorter, less repetitive and slightly
more efficient, so let's do that instead.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
isalnum(c) is equivalent to isalpha(c) || isdigit(c), so use the
former instead. The result is shorter, simpler and slightly more
efficient.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use test_path_is_file() instead of 'test -f' for better debugging
information.
Signed-off-by: Rasmus Jonsson <wasmus@zom.bi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using partial clone, find_non_local_tags() in builtin/fetch.c
checks each remote tag to see if its object also exists locally. There
is no expectation that the object exist locally, but this function
nevertheless triggers a lazy fetch if the object does not exist. This
can be extremely expensive when asking for a commit, as we are
completely removed from the context of the non-existent object and
thus supply no "haves" in the request.
6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
global variable that prevented these fetches in favor of a bitflag.
However, some object existence checks were not updated to use this flag.
Update find_non_local_tags() to use OBJECT_INFO_SKIP_FETCH_OBJECT in
addition to OBJECT_INFO_QUICK. The _QUICK option only prevents
repreparing the pack-file structures. We need to be extremely careful
about supplying _SKIP_FETCH_OBJECT when we expect an object to not exist
due to updated refs.
This resolves a broken test in t5616-partial-clone.sh.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While testing partial clone, I noticed some odd behavior. I was testing
a way of running 'git init', followed by manually configuring the remote
for partial clone, and then running 'git fetch'. Astonishingly, I saw
the 'git fetch' process start asking the server for multiple rounds of
pack-file downloads! When tweaking the situation a little more, I
discovered that I could cause the remote to hang up with an error.
Add two tests that demonstrate these two issues.
In the first test, we find that when fetching with blob filters from
a repository that previously did not have any tags, the 'git fetch
--tags origin' command fails because the server sends "multiple
filter-specs cannot be combined". This only happens when using
protocol v2.
In the second test, we see that a 'git fetch origin' request with
several ref updates results in multiple pack-file downloads. This must
be due to Git trying to fault-in the objects pointed by the refs. What
makes this matter particularly nasty is that this goes through the
do_oid_object_info_extended() method, so there are no "haves" in the
negotiation. This leads the remote to send every reachable commit and
tree from each new ref, providing a quadratic amount of data transfer!
This test is fixed if we revert 6462d5eb9a (fetch: remove
fetch_if_missing=0, 2019-11-05), but that revert causes other test
failures. The real fix will need more care.
The tests are ordered in this way because if I swap the test order the
tag test will succeed instead of fail. I believe this is because somehow
we need the srv.bare repo to not have any tags when we clone, but then
have tags in our next fetch.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An accidental conversion of a tab to 4 spaces snuck into 4c4066d95d
(run-command: move doc to run-command.h, 2019-11-17), messing up the
alignment when you have the project-recommended 8-width tabstops. Let's
revert that line.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An annotated tag has two names: where it sits in the refs/tags
hierarchy and the tagname recorded in the "tag" field in the object
itself. They usually should match.
Since 212945d4 ("Teach git-describe to verify annotated tag names
before output", 2008-02-28), a commit described using an annotated
tag bases its name on the tagname from the object. While this was a
deliberate design decision to make it easier to converse about tags
with others, even if the tags happen to be fetched to a different
name than it was given by its creator, it had one downside.
The output from "git describe", at least in the modern Git, should
be usable as an object name to name the exact commit given to the
"git describe" command. Using the tagname, when two names differ,
breaks this property, when describing a commit that is directly
pointed at by such a tag. An annotated tag Bob made as "v1.0" may
sit at "refs/tags/v1.0-bob" in the ref hierarchy, and output from
"git describe v1.0-bob^0" would say "v1.0", but there may not be
any tag at "refs/tags/v1.0" locally or there may be another tag that
points at a different object.
Note that this won't be a problem if a commit being described is not
directly pointed at by such a mislocated tag. In the example in the
previous paragraph, describing a commit whose parent is v1.0-bob
would result in "v1.0" (i.e. the tagname taken from the tag object)
followed by "-1-gXXXXX" where XXXXX is the abbreviated object name,
and a string that ends with "-g" followed by a hexadecimal string is
an object name for the object whose name begins with hexadecimal
string (as long as it is unique), so it does not matter if the
leading part is "v1.0" or "v1.0-bob".
Show the name in the long format, i.e. with "-0-gXXXXX" suffix, when
the name we give is based on a mislocated annotated tag to ensure
that the output can be used as the object name for the object
originally given to the command to fix the issue.
While at it, remove an overly cautious dead code to protect against
an annotated tag object without the tagname. Such a tag is filtered
out much earlier in the codeflow, and will not reach this part of
the code.
Helped-by: Matheus Tavares <matheus.bernardino@usp.br>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 9e6d3e64 (sparse-checkout: detect short patterns, 2020-01-24), a
condition on the minimum length of a cone-mode pattern was introduced.
However, this condition was off-by-one.
If we have a directory with a single character, say "b", then the
command
git sparse-checkout set b
will correctly add the pattern "/b/" to the sparse-checkout file. When
this is interpeted in dir.c, the pattern is "/b" with the
PATTERN_FLAG_MUSTBEDIR flag. This string has length two, which satisfies
our inclusive inequality (<= 2).
The reason for this inequality is that we will start to read the pattern
string character-by-character using three char pointers: prev, cur,
next. In particular, next is set to the current pattern plus two. The
mistake was that next will still be a valid pointer when the pattern
length is two, since the string is null-terminated.
Make this inequality strict so these patterns work.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git am --show-current-patch" was added in commit 984913a210 ("am:
add --show-current-patch", 2018-02-12), "git am" started recommending it
as a replacement for .git/rebase-merge/patch. Unfortunately the suggestion
is somewhat misguided; for example, the output of "git am --show-current-patch"
cannot be passed to "git apply" if it is encoded as quoted-printable
or base64. Add a new mode to "git am --show-current-patch" in order to
straighten the suggestion.
Reported-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git am --show-current-patch" was added in commit 984913a210 ("am:
add --show-current-patch", 2018-02-12), "git am" started recommending it
as a replacement for .git/rebase-merge/patch. Unfortunately the suggestion
is somewhat misguided; for example, the output "git am --show-current-patch"
cannot be passed to "git apply" if it is encoded as quoted-printable or
base64. To simplify worktree operations and to avoid that users poke into
.git, it would be better if "git am" also provided a mode that copies
.git/rebase-merge/patch to stdout.
One possibility could be to have completely separate options, introducing
for example --show-current-message (for .git/rebase-apply/NNNN)
and --show-current-diff (for .git/rebase-apply/patch), while possibly
deprecating --show-current-patch.
That would even remove the need for the first two patches in the series.
However, the long common prefix would have prevented using an abbreviated
option such as "--show". Therefore, I chose instead to add a string
argument to --show-current-patch. The new argument is optional, so that
"git am --show-current-patch"'s behavior remains backwards-compatible.
The next choice to make is how to handle multiple --show-current-patch
options. Right now, something like "git am --abort --show-current-patch"
is rejected, and the previous suggestion would likewise have naturally
rejected a command line like
git am --show-current-message --show-current-diff
Therefore, I decided to also reject for example
git am --show-current-patch=diff --show-current-patch=raw
In other words the whole of --show-current-patch=xxx (including the
optional argument) is treated as the command mode. I found this to be
more consistent and intuitive, even though it differs from the usual
"last one wins" semantics of the git command line.
Add the code to parse submodes based on the above design, where for now
"raw" is the only valid submode. "raw" prints the full e-mail message
just like "git am --show-current-patch".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will allow stashing the submode of --show-current-patch from a
callback function. Using a struct will allow accessing both fields from
outside cmd_am (through container_of).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
OPTION_CMDMODE is essentially OPTION_SET_INT plus an extra check that
the variable had not set before. In order to allow custom processing
of the option, for example a "command mode" option that also has an
argument, it would be nice to use OPTION_CALLBACK and not have to rewrite
the extra check on incompatible options. In other words, making the
processing of the option orthogonal to the "only one of these" behavior
provided by OPTION_CMDMODE.
Add a new flag that takes care of the check, and modify OPT_CMDMODE to
use it together with OPTION_SET_INT. The new flag still requires that the
option value points to an int, but any OPTION_* value can be specified as
long as it does not require a non-int type for opt->value.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before modifying the implementation, ensure that general operation of
OPT_CMDMODE() and detection of incompatible options are covered.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases, a user will want to use a specific credential helper for
a wildcard pattern, such as https://*.corp.example.com. We have code
that handles this already with the urlmatch code, so let's use that
instead of our custom code.
Since the urlmatch code is a superset of our current matching in terms
of capabilities, there shouldn't be any cases of things that matched
previously that don't match now. However, in addition to wildcard
matching, we now use partial path matching, which can cause slightly
different behavior in the case that a helper applies to the prefix
(considering path components) of the remote URL. While different, this
is probably the behavior people were wanting anyway.
Since we're using the urlmatch code, we need to encode the components
we've gotten into a URL to match, so add a function to percent-encode
data and format the URL with it. We now also no longer need to the
custom code to match URLs, so let's remove it.
Additionally, the urlmatch code always looks for the best match, whereas
we want all matches for credential helpers to preserve existing
behavior. Let's add an optional field, select_fn, that lets us control
which items we want (in this case, all of them) and default it to the
best-match code that already exists for other users.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Everywhere else in the codebase, we use the rule that the last matching
configuration option is the one that takes effect. This is helpful
because it allows more specific configuration settings (e.g., per-repo
configuration) to override less specific settings (e.g., per-user
configuration).
However, in the credential code, we didn't honor this setting, and
instead picked the first setting we had, and stuck with it. This was
likely to ensure we picked the value from the URL, which we want to
honor over the configuration.
It's possible to do both, though, so let's check if the value is the one
we've gotten over our protocol connection, which if present will have
come from the URL, and keep it if so. Otherwise, let's overwrite the
value with the latest version we've got from the configuration, so we
keep the last configuration value.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are some tricky cases in our credential helpers that we don't have
test cases for. To help prevent regressions, let's add some for these
cases:
* If there are multiple configured credential helpers, one without a
path and one with a path, we want to invoke both of them.
* If there are percent-encoded values in the URL, we handle them
properly.
* And finally, if there is a username in the remote URL, we want to
honor that over what the configuration tells us.
Finally, there's an additional case that we'd like to test for as well,
but that currently fails. In all other situations in our configuration,
we pick the last configuration setting that's provided. However, we
fail to do that for credential.username, where we pick the first setting
instead. Let's add a failing test that we have the consistent behavior
here, since that's the documented, expected behavior.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our urlmatch code handles multiple wildcards, but we don't currently
have a test that checks this code path. Add a test that we handle this
case correctly to avoid any regressions.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To more accurately track the provenance of contributions, brian uses a
work email address for commits created at work. Add this email address
to .mailmap so that contributions are properly attributed to the same
person.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit fixed a bug with the (no submodule) -> (nested
submodules) transition for commands in the unpack-trees machinery.
Let's add a test for the reverse transition (going from nested
submodules to no submodule), as it is not being tested currently.
While at it, uniformize the capitalization in the list of tests.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using `git checkout --recurse-submodules` to switch between a
branch with no submodules and a branch with initialized nested
submodules currently causes a fatal error:
$ git checkout --recurse-submodules branch-with-nested-submodules
fatal: exec '--super-prefix=submodule/nested/': cd to 'nested'
failed: No such file or directory
error: Submodule 'nested' could not be updated.
error: Submodule 'submodule/nested' cannot checkout new HEAD.
error: Submodule 'submodule' could not be updated.
M submodule
Switched to branch 'branch-with-nested-submodules'
The checkout succeeds but the worktree and index of the first level
submodule are left empty:
$ cd submodule
$ git -c status.submoduleSummary=1 status
HEAD detached at b3ce885
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
deleted: .gitmodules
deleted: first.t
deleted: nested
fatal: not a git repository: 'nested/.git'
Submodule changes to be committed:
* nested 1e96f59...0000000:
$ git ls-files -s
$ # empty
$ ls -A
.git
The reason for the fatal error during the checkout is that a child git
process tries to cd into the yet unexisting nested submodule directory.
The sequence is the following:
1. The main git process (the one running in the superproject) eventually
reaches write_entry() in entry.c, which creates the first level
submodule directory and then calls submodule_move_head() in submodule.c,
which spawns `git read-tree` in the submodule directory.
2. The first child git process (the one in the submodule of the
superproject) eventually calls check_submodule_move_head() at
unpack_trees.c:2021, which calls submodule_move_head in dry-run mode,
which spawns `git read-tree` in the nested submodule directory.
3. The second child git process tries to chdir() in the yet unexisting
nested submodule directory in start_command() at run-command.c:829 and
dies before exec'ing.
The reason why check_submodule_move_head() is reached in the first child
and not in the main process is that it is inside an
if(submodule_from_ce()) construct, and submodule_from_ce() returns a
valid struct submodule pointer, whereas it returns a null pointer in the
main git process.
The reason why submodule_from_ce() returns a null pointer in the main
git process is because the call to cache_lookup_path() in config_from()
(called from submodule_from_path() in submodule_from_ce()) returns a
null pointer since the hashmap "for_path" in the submodule_cache of
the_repository is not yet populated. It is not populated because both
repo_get_oid(repo, GITMODULES_INDEX, &oid) and repo_get_oid(repo,
GITMODULES_HEAD, &oid) in config_from_gitmodules() at
submodule-config.c:639-640 return -1, as at this stage of the operation,
neither the HEAD of the superproject nor its index contain any
.gitmodules file.
In contrast, in the first child the hashmap is populated because
repo_get_oid(repo, GITMODULES_HEAD, &oid) returns 0 as the HEAD of the
first level submodule, i.e. .git/modules/submodule/HEAD, points to a
commit where .gitmodules is present and records 'nested' as a submodule.
Fix this bug by checking that the submodule directory exists before
calling check_submodule_move_head() in merged_entry() in the `if(!old)`
branch, i.e. if going from a commit with no submodule to a commit with a
submodule present.
Also protect the other call to check_submodule_move_head() in
merged_entry() the same way as it is safer, even though the `else if
(!(old->ce_flags & CE_CONFLICTED))` branch of the code is not at play in
the present bug.
The other calls to check_submodule_move_head() in other functions in
unpack_trees.c are all already protected by calls to lstat() somewhere
in
the program flow so we don't need additional protection for them.
All commands in the unpack_trees machinery are affected, i.e. checkout,
reset and read-tree when called with the --recurse-submodules flag.
This bug was first reported in [1].
[1]
https://lore.kernel.org/git/7437BB59-4605-48EC-B05E-E2BDB2D9DABC@gmail.com/
Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Reported-by: Damien Robert <damien.olivier.robert@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function verify_clean_submodule() learned to verify if a submodule
working tree is clean in a7bc845a9a (unpack-trees: check if we can
perform the operation for submodules, 2017-03-14), but the commented
description above it was not updated to reflect that, such that this
description has been outdated since then.
Since Git has now learned to optionnally recursively check out
submodules during a superproject checkout, remove this outdated
description.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test "$command: submodule branch is not changed, detach HEAD
instead" is in the "Appearing submodule" section of
test_submodule_recursing_with_args_common(), but this test updates a
submodule; it does not test a transition from a state with no submodule
to a state with a submodule.
As such, for consistency, move it to the "Modified submodule" section of
the same function. While at it, add a comment describing the test.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commands in the unpack_trees machinery (checkout, reset, read-tree)
were fixed in 218c883783 (submodule: properly recurse for read-tree and
checkout, 2017-05-02) to correctly update nested submodules when called
with the `--recurse-submodules` flag.
However, a comment in t/lib-submodule-update.sh mentions that this use
case still doesn't work.
Remove this outdated comment.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The known failure mode KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED was
removed from lib-submodule-update.sh in 218c883783 (submodule: properly
recurse for read-tree and checkout, 2017-05-02) but at that time this
change was not ported over to topic sb/reset-recurse-submodules, such
that when this topic was merged in 5f074ca7e8 (Merge branch
'sb/reset-recurse-submodules', 2017-05-29), t7112-reset-submodules.sh
kept a mention of this removed failure mode.
Remove it now, as it does not mean anything anymore.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Decisions taken for simplicity:
1) For now, `--pathspec-from-file` is declared incompatible with
`--patch`, even when <file> is not `-`. Such use case is not
really expected.
2) It is not allowed to pass pathspec in both args and file.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eliminate crude option parsing and rely on real parsing instead, because
1) Crude parsing is crude, for example it's not capable of
handling things like `git stash -m Message`
2) Adding options in two places is inconvenient and prone to bugs
As a side result, the case of `git stash -m Message` gets fixed.
Also give a good error message instead of just throwing usage at user.
----
Some review of what's been happening to this code:
Before [1], `git-stash.sh` only verified that all args begin with `-` :
# The default command is "push" if nothing but options are given
seen_non_option=
for opt
do
case "$opt" in
--) break ;;
-*) ;;
*) seen_non_option=t; break ;;
esac
done
Later, [1] introduced the duplicate code I'm now removing, also making
the previous test more strict by white-listing options.
----
[1] Commit 40af1468 ("stash: convert `stash--helper.c` into `stash.c`" 2019-02-26)
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch continues the effort that is already applied to
`git commit`, `git reset`, `git checkout` etc.
1) Added reference to 'linkgit:gitglossary[7]'.
2) Fixed mentions of incorrectly plural "pathspecs".
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Together with the previous patch, this brings docs for `git stash` to
the common layout used for most other commands (see for example docs
for `git add`, `git commit`, `git checkout`, `git reset`) where all
options are documented in a separate list.
After some thinking and having a look at docs for `git svn` and
`git `submodule`, I have arrived at following conclusions:
* Options should be described in a list rather then text to
facilitate lookup for user.
* Single list is better then multiple lists because it avoids
copy&pasting descriptions between subcommands (or, without
copy&pasting, user will have to look up missing options in other
subcommands).
* As a consequence, commands section should only give brief info and
list possible options. Since options have good enough names, user
will only need to look up the "interesting" options.
* Every option should list which subcommands support it.
I have decided to use alphabetical sorting in the list of options to
facilitate lookup for user.
There is some text editing done to make old descriptions better fit
into the list-style format.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch moves blocks of text as-is to make it easier to review the
next patch.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Decisions taken for simplicity:
1) It is not allowed to pass pathspec in both args and file.
Adjustments were needed for `if (!argc)` block:
This code actually means "pathspec is not present". Previously, pathspec
could only come from commandline arguments, so testing for `argc` was a
valid way of testing for the presence of pathspec. But this is no longer
true with `--pathspec-from-file`.
During the entire `--pathspec-from-file` story, I tried to keep its
behavior very close to giving pathspec on commandline, so that switching
from one to another doesn't involve any surprises.
However, throwing usage at user in the case of empty
`--pathspec-from-file` would puzzle because there's nothing wrong with
"usage" (that is, argc/argv array).
On the other hand, throwing usage in the old case also feels bad to me.
While it's less of a puzzle, I (as user) never liked the experience of
comparing my commandline to "usage", trying to spot a difference. Since
it's already known what the error is, it feels a lot better to give that
specific error to user.
Judging from [1] it doesn't seem that showing usage in this case was
important (the patch was to avoid segfault), and it doesn't fit into how
other commands react to empty pathspec (see for example `git add` with a
custom message).
Therefore, I decided to show new error text in both cases. In order to
continue testing for error early, I moved `parse_pathspec()` higher. Now
it happens before `read_cache()` / `hold_locked_index()` /
`setup_work_tree()`, which shouldn't cause any issues.
[1] Commit 7612a1ef ("git-rm: honor -n flag" 2006-06-09)
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we need to delete a higher stage entry in the index to place the file
at stage 0, then we'll lose that file's stat information. In such
situations we may still be able to detect that the file on disk is the
version we want (as noted by our comment in the code:
/* do not overwrite file if already present */
), but we do still need to update the mtime since we are creating a new
cache_entry for that file. Update the logic used to determine whether
we refresh a file's mtime.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A user discovered a case where they had a stack of 20 simple commits to
rebase, and the rebase would succeed in picking the first commit and
then error out with a pair of "Could not execute the todo command" and
"Your local changes to the following files would be overwritten by
merge" messages.
Their steps actually made use of the -i flag, but I switched it over to
-m to make it simpler to trigger the bug. With that flag, it bisects
back to commit 68aa495b59 (rebase: implement --merge via the
interactive machinery, 2018-12-11), but that's misleading. If you
change the -m flag to --keep-empty, then the problem persists and will
bisect back to 356ee4659b (sequencer: try to commit without forking
'git commit', 2017-11-24)
After playing with the testcase for a bit, I discovered that added
--exec "sleep 1" to the command line makes the rebase succeed, making me
suspect there is some kind of discard and reloading of caches that lead
us to believe that something is stat dirty, but I didn't succeed in
digging any further than that.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we want to get rid of git-bisect.sh, it would be necessary to
convert those exit() calls to return statements so that errors can be
reported.
Emulate try catch in C by converting `exit(<positive-value>)` to
`return <negative-value>`. Follow POSIX conventions to return
<negative-value> to indicate error.
All the functions calling `bisect_next_all()` are already able to
handle return values from it.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we want to get rid of git-bisect.sh, it would be necessary to
convert those exit() calls to return statements so that errors can be
reported.
Emulate try catch in C by converting `exit(<positive-value>)` to
`return <negative-value>`. Follow POSIX conventions to return
<negative-value> to indicate error.
Update all callers to handle the error returns.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we want to get rid of git-bisect.sh, it would be necessary to
convert those exit() calls to return statements so that errors can be
reported.
Emulate try catch in C by converting `exit(<positive-value>)` to
`return <negative-value>`. Follow POSIX conventions to return
<negative-value> to indicate error.
Code that turns BISECT_INTERNAL_SUCCESS_MERGE_BASE (-11)
to BISECT_OK (0) from `check_good_are_ancestors_of_bad()` has been moved to
`cmd_bisect__helper()`.
Update all callers to handle the error returns.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we want to get rid of git-bisect.sh, it would be necessary to
convert those exit() calls to return statements so that errors can be
reported.
Emulate try catch in C by converting `exit(<positive-value>)` to
`return <negative-value>`. Follow POSIX conventions to return
<negative-value> to indicate error.
In `check_merge_bases()` there is an early success special case,
so we have introduced special error code
BISECT_INTERNAL_SUCCESS_MERGE_BASE (-11) which indicates early
success. This BISECT_INTERNAL_SUCCESS_MERGE_BASE is converted back
to BISECT_OK (0) in `check_good_are_ancestors_of_bad()`.
Update all callers to handle the error returns.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we want to get rid of git-bisect.sh, it would be necessary to
convert those exit() calls to return statements so that errors can be
reported.
Emulate try catch in C by converting `exit(<positive-value>)` to
`return <negative-value>`. Follow POSIX conventions to return
<negative-value> to indicate error.
Turn `exit()` to `return` calls in `bisect_checkout()`.
Changes related to return values have no bad side effects on the
code that calls `bisect_checkout()`.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we want to get rid of git-bisect.sh, it would be necessary to
convert those exit() calls to return statements so that errors can be
reported.
Emulate try catch in C by converting `exit(<positive-value>)` to
`return <negative-value>`. Follow POSIX conventions to return
<negative-value> to indicate error.
Update all callers to handle the error returns.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we want to get rid of git-bisect.sh, it would be necessary
to convert bisect.c exit() calls to return statements so
that errors can be reported. Let's prepare for that by making
it possible to return different error codes than just 0 or 1.
Different error codes might enable a bisecting script calling the
bisect command that uses this function to do different things
depending on the exit status of the bisect command.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we want to get rid of git-bisect.sh, it would be necessary to
convert those exit() calls to return statements so that errors can be
reported.
Create an enum called `bisect_error` with the bisecting return codes
to use in `bisect.c` libification process.
Change bisect_next_all() to make it return this enum.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's refactor code from bisect_next_check() into a new
decide_next() helper function.
This removes some goto statements and makes the code simpler,
clearer and easier to understand.
While at it `bad_ref` and `good_glob` are not const any more
to void casting them inside `free()`.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using 'var == 0' in an if condition, let's use '!var' and
make 'bisect.c' more consistent with the rest of the code.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's rename variable retval to res, so that variable names
in bisect--helper.c are more consistent.
After this change, there are 110 occurrences of res in the file
and zero of retval, while there were 26 instances of retval before.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a pointer that points at a constant string,
just give name directly to the constant string; this way, we
do not have to allocate a pointer variable in addition to
the string we want to use.
Let's convert `vocab_bad` and `vocab_good` char pointers to char arrays.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
check-ignore has two different modes, and neither of these modes has an
implementation that matches the documentation. These modes differ in
whether they just print paths or whether they also print the final
pattern matched by the path. The fix is different for both modes, so
I'll discuss both separately.
=== First (default) mode ===
The first mode is documented as:
For each pathname given via the command-line or from a file via
--stdin, check whether the file is excluded by .gitignore (or other
input files to the exclude mechanism) and output the path if it is
excluded.
However, it fails to do this because it did not account for negated
patterns. Commands other than check-ignore verify exclusion rules via
calling
... -> treat_one_path() -> is_excluded() -> last_matching_pattern()
while check-ignore has a call path of the form:
... -> check_ignore() -> last_matching_pattern()
The fact that the latter does not include the call to is_excluded()
means that it is susceptible to to messing up negated patterns (since
that is the only significant thing is_excluded() adds over
last_matching_pattern()). Unfortunately, we can't make it just call
is_excluded(), because the same codepath is used by the verbose mode
which needs to know the matched pattern in question. This brings us
to...
=== Second (verbose) mode ===
The second mode, known as verbose mode, references the first in the
documentation and says:
Also output details about the matching pattern (if any) for each
given pathname. For precedence rules within and between exclude
sources, see gitignore(5).
The "Also" means it will print patterns that match the exclude rules as
noted for the first mode, and also print which pattern matches. Unless
more information is printed than just pathname and pattern (which is not
done), this definition is somewhat ill-defined and perhaps even
self-contradictory for negated patterns: A path which matches a negated
exclude pattern is NOT excluded and thus shouldn't be printed by the
former logic, while it certainly does match one of the explicit patterns
and thus should be printed by the latter logic.
=== Resolution ==
Since the second mode exists to find out which pattern matches given
paths, and showing the user a pattern that begins with a '!' is
sufficient for them to figure out whether the pattern is excluded, the
existing behavior is desirable -- we just need to update the
documentation to match the implementation (i.e. it is about printing
which pattern is matched by paths, not about showing which paths are
excluded).
For the first or default mode, users just want to know whether a pattern
is excluded. As such, the existing documentation is desirable; change
the implementation to match the documented behavior.
Finally, also adjust a few tests in t0008 that were caught up by this
discrepancy in how negated paths were handled.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rendering the troff manpages to text via "man", we create an ad-hoc
Makefile and feed it to "make". The purpose here is two-fold:
- reuse results from a prior interrupted render of the same tree
- use make's -j option to build in parallel
But the second part doesn't seem to work (at least with my version of
GNU make, 4.2.1). It just runs one render at a time.
We use a double-colon "all" rule for each file, like:
all:: foo
foo:
...actual render recipe...
all:: bar
bar:
...actual render recipe...
...and so on...
And it's this double-colon that seems to inhibit the parallelism. We can
just switch to a regular single-colon rule. Even though we do have
multiple rules for "all" here, we don't have any recipe to execute for
"all" (we only care about triggering its dependencies), so the
distinction is irrelevant.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The example for the push.pushOption config tries to create a
preformatted section, but uses only two dashes in its "--" line. In
AsciiDoc this is an "open block", with no type; the lines end up jumbled
because they're formatted as paragraphs. We need four or more dashes to
make it a "listing block" that will respect the linebreaks.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* jk/mailinfo-cleanup:
mailinfo: factor out some repeated header handling
mailinfo: be more liberal with header whitespace
mailinfo: simplify parsing of header values
mailinfo: treat header values as C strings
"git config" learned to show in which "scope", in addition to in
which file, each config setting comes from.
* mr/show-config-scope:
config: add '--show-scope' to print the scope of a config value
submodule-config: add subomdule config scope
config: teach git_config_source to remember its scope
config: preserve scope in do_git_config_sequence
config: clarify meaning of command line scoping
config: split repo scope to local and worktree
config: make scope_name non-static and rename it
t1300: create custom config file without special characters
t1300: fix over-indented HERE-DOCs
config: fix typo in variable name
Preparation for SHA-256 migration continues.
* bc/hash-independent-tests-part-8: (21 commits)
t6024: update for SHA-256
t6006: make hash size independent
t6000: abstract away SHA-1-specific constants
t5703: make test work with SHA-256
t5607: make hash size independent
t5318: update for SHA-256
t5515: make test hash independent
t5321: make test hash independent
t5313: make test hash independent
t5309: make test hash independent
t5302: make hash size independent
t4060: make test work with SHA-256
t4211: add test cases for SHA-256
t4211: move SHA-1-specific test cases into a directory
t4013: make test hash independent
t3311: make test work with SHA-256
t3310: make test work with SHA-256
t3309: make test work with SHA-256
t3308: make test work with SHA-256
t3206: make hash size independent
...
Memory footprint and performance of "git name-rev" has been
improved.
* rs/name-rev-memsave:
name-rev: sort tip names before applying
name-rev: release unused name strings
name-rev: generate name strings only if they are better
name-rev: pre-size buffer in get_parent_name()
name-rev: factor out get_parent_name()
name-rev: put struct rev_name into commit slab
name-rev: don't _peek() in create_or_update_name()
name-rev: don't leak path copy in name_ref()
name-rev: respect const qualifier
name-rev: remove unused typedef
name-rev: rewrite create_or_update_name()
In d9c6469 (git-gui: update status bar to track operations, 2019-12-01),
the status bar was refactored to allow multiple overlapping operations.
Since the refactor changed the status bar interface, all callsites had
to be refactored to use the new interface. During that refactoring, this
closing bracket was missed. This leads to an error message popping up
when doing 'Branch->Reset...'.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Update the German translation and extend glossary.
* cs/german-translation:
git-gui: update German translation
git-gui: extend translation glossary template with more terms
git-gui: update pot template and German translation to current source code
Update German translation (glossary and final translation) with
recent additions, but also switch several terms from uncommon
translations back to English vocabulary.
This most prominently concerns "commit" (noun, verb), "repository",
"branch", and some more. These uncommon translations have been introduced
long ago and never been changed since. In fact, the whole German
translation here hasn't been touched for a long time. However, in German
literature and magazines, git-gui is regularly noted for its uncommon
choice of translated vocabulary. This somewhat distracts from the actual
benefits of this tool. So it is probably better to abandon the uncommon
translations and rather stick to the common English vocabulary in git
version control.
Signed-off-by: Christian Stimming <christian@cstimming.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
The English glossary template was missing some terms, some of them
not only for git-gui, but also gitk and/or git core. Many such terms
have been added.
Also, the list has been sorted alphabetically so that comparison to
other glossary lists are easier.
Signed-off-by: Christian Stimming <christian@cstimming.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
No content changes so far, only the preparation for subsequent edits.
Signed-off-by: Christian Stimming <christian@cstimming.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Two related changes, with separate rationale for each:
Rename the 'interactive' backend to 'merge' because:
* 'interactive' as a name caused confusion; this backend has been used
for many kinds of non-interactive rebases, and will probably be used
in the future for more non-interactive rebases than interactive ones
given that we are making it the default.
* 'interactive' is not the underlying strategy; merging is.
* the directory where state is stored is not called
.git/rebase-interactive but .git/rebase-merge.
Rename the 'am' backend to 'apply' because:
* Few users are familiar with git-am as a reference point.
* Related to the above, the name 'am' makes sentences in the
documentation harder for users to read and comprehend (they may read
it as the verb from "I am"); avoiding this difficult places a large
burden on anyone writing documentation about this backend to be very
careful with quoting and sentence structure and often forces
annoying redundancy to try to avoid such problems.
* Users stumble over pronunciation ("am" as in "I am a person not a
backend" or "am" as in "the first and thirteenth letters in the
alphabet in order are "A-M"); this may drive confusion when one user
tries to explain to another what they are doing.
* While "am" is the tool driving this backend, the tool driving git-am
is git-apply, and since we are driving towards lower-level tools
for the naming of the merge backend we may as well do so here too.
* The directory where state is stored has never been called
.git/rebase-am, it was always called .git/rebase-apply.
For all the reasons listed above:
* Modify the documentation to refer to the backends with the new names
* Provide a brief note in the documentation connecting the new names
to the old names in case users run across the old names anywhere
(e.g. in old release notes or older versions of the documentation)
* Change the (new) --am command line flag to --apply
* Rename some enums, variables, and functions to reinforce the new
backend names for us as well.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to ensure the merge/interactive backend gets similar coverage
to the am one, add some tests for cases where previously only the am
backend was tested.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have many rebase tests in the testsuite, and often the same test is
repeated multiple times just testing different backends. For those
tests that were specifically trying to test the am backend, add the --am
flag.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A large variety of rebase types are supported by the interactive
machinery, not just the explicitly interactive ones. These all share
the same code and write the same reflog messages, but the "-i" moniker
in those messages doesn't really have much meaning. It also becomes
somewhat distracting once we switch the default from the am-backend to
the interactive one. Just remove the "-i" from these messages.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the past, we had different prompts for different types of rebases:
REBASE: for am-based rebases
REBASE-m: for merge-based rebases
REBASE-i: for interactive-based rebases
It's not clear why this distinction was necessary or helpful; when the
prompt was added in commit e75201963f ("Improve bash prompt to detect
various states like an unfinished merge", 2007-09-30), it simply added
these three different types. Perhaps there was a useful purpose back
then, but there have been some changes:
* The merge backend was deleted after being implemented on top of the
interactive backend, causing the prompt for merge-based rebases to
change from REBASE-m to REBASE-i.
* The interactive backend is used for multiple different types of
non-interactive rebases, so the "-i" part of the prompt doesn't
really mean what it used to.
* Rebase backends have gained more abilities and have a great deal of
overlap, sometimes making it hard to distinguish them.
* Behavioral differences between the backends have also been ironed
out.
* We want to change the default backend from am to interactive, which
means people would get "REBASE-i" by default if we didn't change
the prompt, and only if they specified --am or --whitespace or -C
would they get the "REBASE" prompt.
* In the future, we plan to have "--whitespace", "-C", and even "--am"
run the interactive backend once it can handle everything the
am-backend can.
For all these reasons, make the prompt for any type of rebase just be
"REBASE".
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, this option doesn't do anything except error out if any
options requiring the interactive-backend are also passed. However,
when we make the default backend configurable later in this series, this
flag will provide a way to override the config setting.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the past, we dis-allowed rebases using the interactive backend from
performing a fast-forward to short-circuit the rebase operation. This
made sense for explicitly interactive rebases and some implicitly
interactive rebases, but certainly became overly stringent when the
merge backend was re-implemented via the interactive backend.
Just as the am-based rebase has always had to disable the fast-forward
based on a variety of conditions or flags (e.g. --signoff, --whitespace,
etc.), we need to do the same but now with a few more options. However,
continuing to use REBASE_FORCE for tracking this is problematic because
the interactive backend used it for a different purpose. (When
REBASE_FORCE wasn't set, the interactive backend would not fast-forward
the whole series but would fast-forward individual "pick" commits at the
beginning of the todo list, and then a squash or something would cause
it to start generating new commits.) So, introduce a new
allow_preemptive_ff flag contained within cmd_rebase() and use it to
track whether we are going to allow a pre-emptive fast-forward that
short-circuits the whole rebase.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t3432 had several stress tests for can_fast_forward(), whose intent was
to ensure we were using the optimization of just fast forwarding when
possible. However, these tests verified that fast forwards had happened
based on the output that rebase printed to the terminal. We can instead
test more directly that we actually fast-forwarded by checking the
reflog, which also has the side effect of making the tests applicable
for the merge/interactive backend.
This change does lose the distinction between "noop" and "noop-force",
but as stated in commit c9efc21683 ("t3432: test for --no-ff's
interaction with fast-forward", 2019-08-27) which introduced that
distinction: "These tests aren't supposed to endorse the status quo,
just test for what we're currently doing.".
This change does not actually run these tests with the merge/interactive
backend; instead this is just a preparatory commit. A subsequent commit
which fixes can_fast_forward() to work with that backend will then also
change t3432 to add tests of that backend as well.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
restrict_revision in the original shell script was an excluded revision
range. It is also treated that way by the am-backend. In the
conversion from shell to C (see commit 6ab54d17be ("rebase -i:
implement the logic to initialize $revisions in C", 2018-08-28)), the
interactive-backend accidentally treated it as a positive revision
rather than a negated one.
This was missed as there were no tests in the testsuite that tested an
interactive rebase with fork-point behavior.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_QUIET environment variable was used to signal the non-am
backends that the rebase should perform quietly. The preserve-merges
backend does not make use of the quiet flag anywhere (other than to
write out its state whenever it writes state), and this mechanism was
broken in the conversion from shell to C. Since this environment
variable was specifically designed for scripts and the only backend that
would still use it is no longer a script, just gut this code.
A subsequent commit will fix --quiet for the interactive/merge backend
in a different way.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the merge backend was re-implemented on top of the interactive
backend, the output of rebase --merge changed a little. This change
allowed this test to be simplified, though it wasn't noticed until now.
Simplify the testcase a little.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As established in the previous commit and commit b00bf1c9a8
(git-rebase: make --allow-empty-message the default, 2018-06-27), the
behavior for rebase with different backends in various edge or corner
cases is often more happenstance than design. This commit addresses
another such corner case: commits which "become empty".
A careful reader may note that there are two types of commits which would
become empty due to a rebase:
* [clean cherry-pick] Commits which are clean cherry-picks of upstream
commits, as determined by `git log --cherry-mark ...`. Re-applying
these commits would result in an empty set of changes and a
duplicative commit message; i.e. these are commits that have
"already been applied" upstream.
* [become empty] Commits which are not empty to start, are not clean
cherry-picks of upstream commits, but which still become empty after
being rebased. This happens e.g. when a commit has changes which
are a strict subset of the changes in an upstream commit, or when
the changes of a commit can be found spread across or among several
upstream commits.
Clearly, in both cases the changes in the commit in question are found
upstream already, but the commit message may not be in the latter case.
When cherry-mark can determine a commit is already upstream, then
because of how cherry-mark works this means the upstream commit message
was about the *exact* same set of changes. Thus, the commit messages
can be assumed to be fully interchangeable (and are in fact likely to be
completely identical). As such, the clean cherry-pick case represents a
case when there is no information to be gained by keeping the extra
commit around. All rebase types have always dropped these commits, and
no one to my knowledge has ever requested that we do otherwise.
For many of the become empty cases (and likely even most), we will also
be able to drop the commit without loss of information -- but this isn't
quite always the case. Since these commits represent cases that were
not clean cherry-picks, there is no upstream commit message explaining
the same set of changes. Projects with good commit message hygiene will
likely have the explanation from our commit message contained within or
spread among the relevant upstream commits, but not all projects run
that way. As such, the commit message of the commit being rebased may
have reasoning that suggests additional changes that should be made to
adapt to the new base, or it may have information that someone wants to
add as a note to another commit, or perhaps someone even wants to create
an empty commit with the commit message as-is.
Junio commented on the "become-empty" types of commits as follows[1]:
WRT a change that ends up being empty (as opposed to a change that
is empty from the beginning), I'd think that the current behaviour
is desireable one. "am" based rebase is solely to transplant an
existing history and want to stop much less than "interactive" one
whose purpose is to polish a series before making it publishable,
and asking for confirmation ("this has become empty--do you want to
drop it?") is more appropriate from the workflow point of view.
[1] https://lore.kernel.org/git/xmqqfu1fswdh.fsf@gitster-ct.c.googlers.com/
I would simply add that his arguments for "am"-based rebases actually
apply to all non-explicitly-interactive rebases. Also, since we are
stating that different cases should have different defaults, it may be
worth providing a flag to allow users to select which behavior they want
for these commits.
Introduce a new command line flag for selecting the desired behavior:
--empty={drop,keep,ask}
with the definitions:
drop: drop commits which become empty
keep: keep commits which become empty
ask: provide the user a chance to interact and pick what to do with
commits which become empty on a case-by-case basis
In line with Junio's suggestion, if the --empty flag is not specified,
pick defaults as follows:
explicitly interactive: ask
otherwise: drop
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Different rebase backends have different treatment for commits which
start empty (i.e. have no changes relative to their parent), and the
--keep-empty option was added at some point to allow adjusting behavior.
The handling of commits which start empty is actually quite similar to
commit b00bf1c9a8 (git-rebase: make --allow-empty-message the default,
2018-06-27), which pointed out that the behavior for various backends is
often more happenstance than design. The specific change made in that
commit is actually quite relevant as well and much of the logic there
directly applies here.
It makes a lot of sense in 'git commit' to error out on the creation of
empty commits, unless an override flag is provided. However, once
someone determines that there is a rare case that merits using the
manual override to create such a commit, it is somewhere between
annoying and harmful to have to take extra steps to keep such
intentional commits around. Granted, empty commits are quite rare,
which is why handling of them doesn't get considered much and folks tend
to defer to existing (accidental) behavior and assume there was a reason
for it, leading them to just add flags (--keep-empty in this case) that
allow them to override the bad defaults. Fix the interactive backend so
that --keep-empty is the default, much like we did with
--allow-empty-message. The am backend should also be fixed to have
--keep-empty semantics for commits that start empty, but that is not
included in this patch other than a testcase documenting the failure.
Note that there was one test in t3421 which appears to have been written
expecting --keep-empty to not be the default as correct behavior. This
test was introduced in commit 00b8be5a4d ("add tests for rebasing of
empty commits", 2013-06-06), which was part of a series focusing on
rebase topology and which had an interesting original cover letter at
https://lore.kernel.org/git/1347949878-12578-1-git-send-email-martinvonz@gmail.com/
which noted
Your input especially appreciated on whether you agree with the
intent of the test cases.
and then went into a long example about how one of the many tests added
had several questions about whether it was correct. As such, I believe
most the tests in that series were about testing rebase topology with as
many different flags as possible and were not trying to state in general
how those flags should behave otherwise.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When developing a script, it can be painful to understand why Git thinks
something is outside the current repo, if the current repo isn't what
the user thinks it is. Since this can be tricky to diagnose, especially
in cases like submodules or nested worktrees, let's give the user a hint
about which repository is offended about that path.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The details of how credential helpers can be called or implemented were
originally covered in Documentation/technical/. Those are topics that
end users might care about (and we even referenced them in the
credentials manpage), but those docs typically don't ship as part of the
end user documentation, making them less useful.
This situation got slightly worse recently in f3b9055624 (credential:
move doc to credential.h, 2019-11-17), where we moved them into the C
header file, making them even harder to find.
So let's move put this information into the gitcredentials(7)
documentation, which is meant to describe the overall concepts of our
credential handling. This was already pointing to the API docs for these
concepts, so we can just include it inline instead.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to compute the commit-graph has been taught to use a more
robust way to tell if two object directories refer to the same
thing.
* tb/commit-graph-object-dir:
commit-graph.h: use odb in 'load_commit_graph_one_fd_st'
commit-graph.c: remove path normalization, comparison
commit-graph.h: store object directory in 'struct commit_graph'
commit-graph.h: store an odb in 'struct write_commit_graph_context'
t5318: don't pass non-object directory to '--object-dir'
The index-pack code now diagnoses a bad input packstream that
records the same object twice when it is used as delta base; the
code used to declare a software bug when encountering such an
input, but it is an input error.
* jk/index-pack-dupfix:
index-pack: downgrade twice-resolved REF_DELTA to die()
The code to automatically shrink the fan-out in the notes tree had
an off-by-one bug, which has been killed.
* jh/notes-fanout-fix:
notes.c: fix off-by-one error when decreasing notes fanout
t3305: check notes fanout more carefully and robustly
The way "git submodule status" reports an initialized but not yet
populated submodule has not been reimplemented correctly when a
part of the "git submodule" command was rewritten in C, which has
been corrected.
* pk/status-of-uncloned-submodule:
t7400: testcase for submodule status on unregistered inner git repos
submodule: fix status of initialized but not cloned submodules
t7400: add a testcase for submodule status on empty dirs
Some codepaths were given a repository instance as a parameter to
work in the repository, but passed the_repository instance to its
callees, which has been cleaned up (somewhat).
* mt/use-passed-repo-more-in-funcs:
sha1-file: allow check_object_signature() to handle any repo
sha1-file: pass git_hash_algo to hash_object_file()
sha1-file: pass git_hash_algo to write_object_file_prepare()
streaming: allow open_istream() to handle any repo
pack-check: use given repo's hash_algo at verify_packfile()
cache-tree: use given repo's hash_algo at verify_one()
diff: make diff_populate_filespec() honor its repo argument
The diff-* plumbing family of subcommands now pay attention to the
diff.wsErrorHighlight configuration, which has been ignored before;
this allows "git add -p" to also show the whitespace problems to
the end user.
* jk/diff-honor-wserrhighlight-in-plumbing:
diff: move diff.wsErrorHighlight to "basic" config
Some rough edges in the sparse-checkout feature, especially around
the cone mode, have been cleaned up.
* ds/sparse-checkout-harden:
sparse-checkout: fix cone mode behavior mismatch
sparse-checkout: improve docs around 'set' in cone mode
sparse-checkout: escape all glob characters on write
sparse-checkout: use C-style quotes in 'list' subcommand
sparse-checkout: unquote C-style strings over --stdin
sparse-checkout: write escaped patterns in cone mode
sparse-checkout: properly match escaped characters
sparse-checkout: warn on globs in cone patterns
sparse-checkout: detect short patterns
sparse-checkout: cone mode does not recognize "**"
sparse-checkout: fix documentation typo for core.sparseCheckoutCone
clone: fix --sparse option with URLs
sparse-checkout: create leading directories
t1091: improve here-docs
t1091: use check_files to reduce boilerplate
p4 updates.
* ld/p4-cleanup-processes:
git-p4: avoid leak of file handle when cloning
git-p4: check for access to remote host earlier
git-p4: cleanup better on error exit
git-p4: create helper function importRevisions()
git-p4: disable some pylint warnings, to get pylint output to something manageable
git-p4: add P4CommandException to report errors talking to Perforce
git-p4: make closeStreams() idempotent
Unneeded connectivity check is now disabled in a partial clone when
fetching into it.
* jt/connectivity-check-optim-in-partial-clone:
fetch: forgo full connectivity check if --filter
connected: verify promisor-ness of partial clone
A low-level API function get_oid(), that accepts various ways to
name an object, used to issue end-user facing error messages
without l10n, which has been updated to be translatable.
* jk/get-oid-error-message-i18n:
sha1-name: mark get_oid() error messages for translation
t1506: drop space after redirection operator
t1400: avoid "test" string comparisons
Allow the rebase.missingCommitsCheck configuration to kick in when
"rebase --edit-todo" and "rebase --continue" restarts the procedure.
* ag/edit-todo-drop-check:
rebase-interactive: warn if commit is dropped with `rebase --edit-todo'
sequencer: move check_todo_list_from_file() to rebase-interactive.c
Test updates.
* dl/test-must-fail-fixes-2:
t4124: only mark git command with test_must_fail
t3507: use test_path_is_missing()
t3507: fix indentation
t3504: do check for conflict marker after failed cherry-pick
t3419: stop losing return code of git command
t3415: increase granularity of test_auto_{fixup,squash}()
t3415: stop losing return codes of git commands
t3310: extract common notes_merge_files_gone()
t3030: use test_path_is_missing()
t2018: replace "sha" with "oid"
t2018: don't lose return code of git commands
t2018: teach do_checkout() to accept `!` arg
t2018: be more discerning when checking for expected exit codes
t2018: improve style of if-statement
t2018: add space between function name and ()
t2018: remove trailing space from test description
"git rebase -i" (and friends) used to unnecessarily check out the
tip of the branch to be rebased, which has been corrected.
* ag/rebase-avoid-unneeded-checkout:
rebase -i: stop checking out the tip of the branch to rebase
"git rebase -i" identifies existing commits in its todo file with
their abbreviated object name, which could become ambigous as it
goes to create new commits, and has a mechanism to avoid ambiguity
in the main part of its execution. A few other cases however were
not covered by the protection against ambiguity, which has been
corrected.
* js/rebase-i-with-colliding-hash:
rebase -i: also avoid SHA-1 collisions with missingCommitsCheck
rebase -i: re-fix short SHA-1 collision
parse_insn_line(): improve error message when parsing failed
A new version of fsmonitor-watchman hook has been introduced, to
avoid races.
* kw/fsmonitor-watchman-racefix:
fsmonitor: update documentation for hook version and watchman hooks
fsmonitor: add fsmonitor hook scripts for version 2
fsmonitor: handle version 2 of the hooks that will use opaque token
fsmonitor: change last update timestamp on the index_state to opaque token
Traditionally, we avoided threaded grep while searching in objects
(as opposed to files in the working tree) as accesses to the object
layer is not thread-safe. This limitation is getting lifted.
* mt/threaded-grep-in-object-store:
grep: use no. of cores as the default no. of threads
grep: move driver pre-load out of critical section
grep: re-enable threads in non-worktree case
grep: protect packed_git [re-]initialization
grep: allow submodule functions to run in parallel
submodule-config: add skip_if_read option to repo_read_gitmodules()
grep: replace grep_read_mutex by internal obj read lock
object-store: allow threaded access to object reading
replace-object: make replace operations thread-safe
grep: fix racy calls in grep_objects()
grep: fix race conditions at grep_submodule()
grep: fix race conditions on userdiff calls
The transport protocol version 2 becomes the default one.
* jn/promote-proto2-to-default:
fetch: default to protocol version 2
protocol test: let protocol.version override GIT_TEST_PROTOCOL_VERSION
test: request GIT_TEST_PROTOCOL_VERSION=0 when appropriate
config doc: protocol.version is not experimental
fetch test: use more robust test for filtered objects
Two help messages given when "git add" notices the user gave it
nothing to add have been updated to use advise() API.
* hw/advice-add-nothing:
add: change advice config variables used by the add API
add: use advise function to display hints
Technical details of the bundle format has been documented.
I think this is in a good enough shape.
* ms/doc-bundle-format:
doc: describe Git bundle format
"git grep --no-index" should not get affected by the contents of
the .gitmodules file but when "--recurse-submodules" is given or
the "submodule.recurse" variable is set, it did. Now these
settings are ignored in the "--no-index" mode.
* pb/do-not-recurse-grep-no-index:
grep: ignore --recurse-submodules if --no-index is given
Corner case bugs in "git clean" that stems from a (necessarily for
performance reasons) awkward calling convention in the directory
enumeration API has been corrected.
* en/fill-directory-fixes-more:
dir: point treat_leading_path() warning to the right place
dir: restructure in a way to avoid passing around a struct dirent
dir: treat_leading_path() and read_directory_recursive(), round 2
clean: demonstrate a bug with pathspecs
Clarify documentation on committer/author identities.
* bc/author-committer-doc:
doc: provide guidance on user.name format
docs: expand on possible and recommended user config options
doc: move author and committer information to git-commit(1)
Work around test breakages caused by custom regex engine used in
libasan, when address sanitizer is used with more recent versions
of gcc and clang.
* jk/asan-build-fix:
Makefile: use compat regex with SANITIZE=address
The code recently added in this release to move to the entry beyond
the ones in the same directory in the index in the sparse-cone mode
did not count the number of entries to skip over incorrectly, which
has been corrected.
* ds/sparse-cone:
.mailmap: fix GGG authoship screwup
unpack-trees: correctly compute result count
"git restore --staged" did not correctly update the cache-tree
structure, resulting in bogus trees to be written afterwards, which
has been corrected.
* nd/switch-and-restore:
restore: invalidate cache-tree when removing entries with --staged
Reduce unnecessary round-trip when running "ls-remote" over the
stateless RPC mechanism.
* jk/no-flush-upon-disconnecting-slrpc-transport:
transport: don't flush when disconnecting stateless-rpc helper
Complete an update to tutorial that encourages "git switch" over
"git checkout" that was done only half-way.
* hw/tutorial-favor-switch-over-checkout:
doc/gitcore-tutorial: fix prose to match example command
The code that tries to skip over the entries for the paths in a
single directory using the cache-tree was not careful enough
against corrupt index file.
* es/unpack-trees-oob-fix:
unpack-trees: watch for out-of-range index position
has_object_file() said "no" given an object registered to the
system via pretend_object_file(), making it inconsistent with
read_object_file(), causing lazy fetch to attempt fetching an
empty tree from promisor remotes.
* jt/sha1-file-remove-oi-skip-cached:
sha1-file: remove OBJECT_INFO_SKIP_CACHED
"git commit" gives output similar to "git status" when there is
nothing to commit, but without honoring the advise.statusHints
configuration variable, which has been corrected.
* hw/commit-advise-while-rejecting:
commit: honor advice.statusHints when rejecting an empty commit
Just as rev-list recently learned to combine filters and bitmaps, let's
do the same for pack-objects. The infrastructure is all there; we just
need to pass along our filter options, and the pack-bitmap code will
decide to use bitmaps or not.
This unsurprisingly makes things faster for partial clones of large
repositories (here we're cloning linux.git):
Test HEAD^ HEAD
------------------------------------------------------------------------------
5310.11: simulated partial clone 38.94(37.28+5.87) 11.06(11.27+4.07) -71.6%
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just as the previous commit implemented BLOB_NONE, we can support
BLOB_LIMIT filters by looking at the sizes of any blobs in the result
and unsetting their bits as appropriate. This is slightly more expensive
than BLOB_NONE, but still produces a noticeable speedup (these results
are on git.git):
Test HEAD~2 HEAD
------------------------------------------------------------------------------------
5310.9: rev-list count with blob:none 1.80(1.77+0.02) 0.22(0.20+0.02) -87.8%
5310.10: rev-list count with blob:limit=1k 1.99(1.96+0.03) 0.29(0.25+0.03) -85.4%
The implementation is similar to the BLOB_NONE one, with the exception
that we have to go object-by-object while walking the blob-type bitmap
(since we can't mask out the matches, but must look up the size
individually for each blob). The trick with using ctz64() is taken from
show_objects_for_type(), which likewise needs to find individual bits
(but wants to quickly skip over big chunks without blobs).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can easily support BLOB_NONE filters with bitmaps. Since we know the
types of all of the objects, we just need to clear the result bits of
any blobs.
Note two subtleties in the implementation (which I also called out in
comments):
- we have to include any blobs that were specifically asked for (and
not reached through graph traversal) to match the non-bitmap version
- we have to handle in-pack and "ext_index" objects separately.
Arguably prepare_bitmap_walk() could be adding these ext_index
objects to the type bitmaps. But it doesn't for now, so let's match
the rest of the bitmap code here (it probably wouldn't be an
efficiency improvement to do so since the cost of extending those
bitmaps is about the same as our loop here, but it might make the
code a bit simpler).
Here are perf results for the new test on git.git:
Test HEAD^ HEAD
--------------------------------------------------------------------------------
5310.9: rev-list count with blob:none 1.67(1.62+0.05) 0.22(0.21+0.02) -86.8%
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We've never needed to unset an individual bit in a bitmap until now.
Typically they start with all bits unset and we bitmap_set() them, or we
are applying another bitmap as a mask. But the easiest way to apply an
object filter to a bitmap result will be to unset the individual bits.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This just passes the filter-options struct to prepare_bitmap_walk().
Since the bitmap code doesn't actually support any filters yet, it will
fallback to the non-bitmap code if any --filter is specified. But this
lets us exercise that rejection code path, as well as getting us ready
to test filters via rev-list when we _do_ support them.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently you can't use object filters with bitmaps, but we plan to
support at least some filters with bitmaps. Let's introduce some
infrastructure that will help us do that:
- prepare_bitmap_walk() now accepts a list_objects_filter_options
parameter (which can be NULL for no filtering; all the current
callers pass this)
- we'll bail early if the filter is incompatible with bitmaps (just as
we would if there were no bitmaps at all). Currently all filters are
incompatible.
- we'll filter the resulting bitmap; since there are no supported
filters yet, this is always a noop.
There should be no behavior change yet, but we'll support some actual
filters in a future patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since we added reachability bitmap support, we've been able to use
it with rev-list to get the full list of objects, like:
git rev-list --objects --use-bitmap-index --all
But you can't do so without --objects, since we weren't ready to just
show the commits. However, the internals of the bitmap code are mostly
ready for this: they avoid opening up trees when walking to fill in the
bitmaps. We just need to actually pass in the rev_info to
traverse_bitmap_commit_list() so it knows which types to bother
triggering our callback for.
For completeness, the perf test now covers both the existing --objects
case, as well as the new commits-only behavior (the objects one got way
faster when we introduced bitmaps, but obviously isn't improved now).
Here are numbers for linux.git:
Test HEAD^ HEAD
------------------------------------------------------------------------
5310.7: rev-list (commits) 8.29(8.10+0.19) 1.76(1.72+0.04) -78.8%
5310.8: rev-list (objects) 8.06(7.94+0.12) 8.14(7.94+0.13) +1.0%
That run was cheating a little, as I didn't have any commit-graph in the
repository, and we'd built it by default these days when running git-gc.
Here are numbers with a commit-graph:
Test HEAD^ HEAD
------------------------------------------------------------------------
5310.7: rev-list (commits) 0.70(0.58+0.12) 0.51(0.46+0.04) -27.1%
5310.8: rev-list (objects) 6.20(6.09+0.10) 6.27(6.16+0.11) +1.1%
Still an improvement, but a lot less impressive.
We could have the perf script remove any commit-graph to show the
out-sized effect, but it probably makes sense to leave it in what would
be a more typical setup.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check the results of "rev-list --use-bitmap-index" by comparing it to
the same traversal without the bitmap option. However, this is a little
tricky to do because of some output differences (see the included
comment for details). Let's pull this out into a helper function, since
we'll be adding some similar tests.
While we're at it, let's also try to confirm that the bitmap output did
indeed use bitmaps. Since the code internally falls back to the
non-bitmap path in some cases, the tests are at risk of becoming trivial
noops.
This is a bit fragile, as not all outputs will differ (e.g., looking at
only the commits from a fully-bitmapped pack will end up exactly the
same as the normal traversal order, since it also matches the pack
order). So we'll provide an escape hatch by which tests can disable this
check (which should only be used after manually confirming that bitmaps
kicked in).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The prior commit taught "--count --objects" to work without bitmaps. We
should be able to get the same answer much more quickly with bitmaps.
Note that we punt on the max_count case here. This perhaps _could_ be
made to work if we find all of the boundary commits and treat them as
UNINTERESTING, subtracting them (and their reachable objects) from the
set we return. That implies an actual commit traversal, but we'd still
be faster due to avoiding opening up any trees. Given the complexity and
the fact that anyone is unlikely to want this, it makes sense to just
fall back to the non-bitmap case for now.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current behavior from "rev-list --count --objects" is nonsensical:
we enumerate all of the objects except commits, but then give a count of
commits. This wasn't planned, and is just what the code happens to do.
Instead, let's give the answer the user almost certainly wanted: the
full count of objects.
Note that there are more complicated cases around cherry-marking, etc.
We'll punt on those for now, but let the user know that we can't produce
an answer (rather than giving them something useless).
We'll test both the new feature as well as a vanilla --count of commits,
since that surprisingly doesn't seem to be covered in the existing
tests.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few operations in rev-list that are optimized for bitmaps.
Rather than having the code inline in cmd_rev_list(), let's move them
into helpers. This not only makes the flow of the main function simpler,
but it lets us replace the complex "can we do the optimization?"
conditionals with a series of early returns from the functions. That
also makes it easy to add comments explaining those conditions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rev-list has refused to use bitmaps with pathspec limiting since
c8a70d3509 (rev-list: disable --use-bitmap-index when pruning commits,
2015-07-01). But this is true not just for rev-list, but for anyone who
calls prepare_bitmap_walk(); the code isn't equipped to handle this
case. We never noticed because the only other callers would never pass
a pathspec limiter.
But let's push the check down into prepare_bitmap_walk() anyway. That's
a more logical place for it to live, as callers shouldn't need to know
the details (and must be prepared to fall back to a regular traversal
anyway, since there might not be bitmaps in the repository).
It would also prepare us for a day where this case _is_ handled, but
that's pretty unlikely. E.g., we could use bitmaps to generate the set
of commits, and then diff each commit to see if it matches the pathspec.
That would be slightly faster than a naive traversal that actually walks
the commits. But you'd probably do better still to make use of the newer
commit-graph feature to make walking the commits very cheap.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When debugging Git, the criss-cross spawning of processes can make
things quite a bit difficult, especially when a Unix shell script is
thrown in the mix that calls a `git.exe` that then segfaults.
To help debugging such things, we introduce the `open_in_gdb()` function
which can be called at a code location where the segfault happens (or as
close as one can get); This will open a new MinTTY window with a GDB
that already attached to the current process.
Inspired by Derrick Stolee.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, it is quite common to work with network drives. The format
of the paths to network drives (or "network shares", or UNC paths) is:
\\<server>\<share>\...
We already have a couple regression tests revolving around those types
of paths, but we missed cloning and fetching from UNC paths without
leading `file://` (and with backslashes instead of forward slashes).
This lil' patch closes that gap.
It gets a bit silly to add the commands to the name of the test script,
so let's just rename it while we're testing more UNC stuff.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When grepping through the output of a command in the test suite, there
is always a chance that something goes wrong, in which case there would
not be anything useful to debug.
Let's redirect the output into a file instead, and grep that file, so
that the log can be inspected easily if the grep fails.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make an effort not to discourage new users from mailing questions to
git@vger.kernel.org, and explain the purpose of the mentoring list in
contrast to the main list.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During the p4 submit process, git-p4 will attempt to apply a patch
to the files found in the p4 workspace. However, if P4 uses RCS
keyword expansion, this patch may fail.
When the patch fails, the user is alerted to the failure and that git-p4
will attempt to clear the expanded text from the files and re-apply
the patch. The current version of git-p4 does not tell the user the
result of the re-apply attempt after the RCS expansion has been removed
which can be confusing.
Add a new print statement after the git patch has been successfully
applied when the RCS keywords have been cleansed.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git command "commit" supports a number of hooks that support
changing the behavior of the commit command. The git-p4.py program only
has one existing hook, "p4-pre-submit". This command occurs early in
the process. There are no hooks in the process flow for modifying
the P4 changelist text programmatically.
Adds 3 new hooks to git-p4.py to the submit option.
The new hooks are:
* p4-prepare-changelist - Execute this hook after the changelist file
has been created. The hook will be executed even if the
--prepare-p4-only option is set. This hook ignores the --no-verify
option in keeping with the existing behavior of git commit.
* p4-changelist - Execute this hook after the user has edited the
changelist. Do not execute this hook if the user has selected the
--prepare-p4-only option. This hook will honor the --no-verify,
following the conventions of git commit.
* p4-post-changelist - Execute this hook after the P4 submission process
has completed successfully. This hook takes no parameters and is
executed regardless of the --no-verify option. It's return value will
not be checked.
The calls to the new hooks: p4-prepare-changelist, p4-changelist,
and p4-post-changelist should all be called inside the try-finally
block.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for adding new hooks to the submit method of git-p4,
restructure the applyCommit function in the P4Submit class. Make the
following changes:
* Move all the code after the definition of submitted = False into the
Try-Finally block. This ensures that any error that occurs will
properly recover. This is not necessary with the current code because
none of it should throw an exception, however the next set of changes
will introduce exceptional code.
Existing flow control can remain as defined - the if-block for
prepare-p4-only sets the local variable "submitted" to True and exits
the function. New early exits, leave submitted set to False so the
Finally block will undo changes to the P4 workspace.
* Make the small usability change of adding an empty string to the
print statements displayed to the user when the prepare-p4-only option
is selected. On Windows, the command print() may display a set of
parentheses "()" to the user when the print() function is called with
no parameters. By supplying an empty string, the intended blank line
will print as expected.
* Fix a small bug when the submittedTemplate is edited by the user
and all content in the file is removed. The existing code will throw
an exception if the separateLine is not found in the file. Change this
code to test for the separator line using a find() test first and only
split on the separator if it is found.
* Additionally, add the new behavior that if the changelist file has
been completely emptied that the Submit action for this changelist
will be aborted.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--use-bitmap-index" option is usually aspirational: if we have
bitmaps and the request can be fulfilled more quickly using them we'll
do so, but otherwise fall back to a non-bitmap traversal.
The exception is object filtering, which explicitly dies if the two
options are combined. Let's convert this to the usual fallback behavior.
This is a minor convenience for now (since the caller can easily know
that --filter and --use-bitmap-index don't combine), but will become
much more useful as we start to support _some_ filters with bitmaps, but
not others.
The test infrastructure here is bigger than necessary for checking this
one small feature. But it will serve as the basis for more filtering
bitmap tests in future patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we do a bitmap-aware revision traversal, we create an object_list
for each of the "haves" and "wants" tips. After creating the result
bitmaps these are no longer needed or used, but we never free the list
memory.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When count_object_type() wants to iterate over the bitmap of all objects
of a certain type, we have to pair up OBJ_COMMIT with bitmap->commits,
and so forth. Since we're about to add more code to iterate over these
bitmaps, let's pull the initialization into its own function.
We can also use this to simplify traverse_bitmap_commit_list(). It
accomplishes the same thing by manually passing the object type and the
bitmap to show_objects_for_type(), but using our helper we just need the
object type.
Note there's one small code change here: previously we'd simply return
zero when counting an unknown object type, and now we'll BUG(). This
shouldn't matter in practice, as all of the callers pass in only usual
commit/tree/blob/tag types.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git grep --no-index" should not get affected by the contents of
the .gitmodules file but when "--recurse-submodules" is given or
the "submodule.recurse" variable is set, it did. Now these
settings are ignored in the "--no-index" mode.
* pb/do-not-recurse-grep-no-index:
grep: ignore --recurse-submodules if --no-index is given
One effect of specifying where the GIT_DIR is (either with the
environment variable, or with the "git --git-dir=<where> cmd"
option) is to disable the repository discovery. This has been
placed a bit more stress in the documentation, as new users often
get confused.
* hw/doc-git-dir:
git: update documentation for --git-dir
Running "git rm" on a submodule failed unnecessarily when
.gitmodules is only cache-dirty, which has been corrected.
* dt/submodule-rm-with-stale-cache:
git rm submodule: succeed if .gitmodules index stat info is zero
Disambiguation logic to tell revisions and pathspec apart has been
tweaked so that backslash-escaped glob special characters do not
count in the "wildcards are pathspec" rule.
* jk/escaped-wildcard-dwim:
verify_filename(): handle backslashes in "wildcards are pathspecs" rule
Warn programmers about pretend_object_file() that allows the code
to tentatively use in-core objects.
* jn/pretend-object-doc:
sha1-file: document how to use pretend_object_file
In t0000, more precisely in its `test_bool_env` test case, there are two
subshells that are supposed to fail. To be even _more_ precise, they
fail by calling the `error` function, and that is okay, because it is in
a subshell, and it is expected that those two subshell invocations fail.
However, the `error` function also tries to finalize the JUnit XML (if
that XML was asked for, via `--write-junit-xml`. As a consequence, the
XML is edited to add a `time` attribute for the `testsuite` tag. And
since there are two expected `error` calls in addition to the final
`test_done`, the `finalize_junit_xml` function is called three times and
naturally the `time` attribute is added _three times_.
Azure Pipelines is not happy with that, complaining thusly:
##[warning]Failed to read D:\a\1\s\t\out\TEST-t0000-basic.xml. Error : 'time' is a duplicate attribute name. Line 2, position 82..
One possible way to address this would be to unset `write_junit_xml` in
the `test_bool_env` test case.
But that would be fragile, as other `error` calls in subshells could be
introduced.
So let's just modify `finalize_junit_xml` to remove any `time` attribute
before adding the authoritative one.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add new command line option --no-verify:
Add a new command line option "--no-verify" to the Submit command of
git-p4.py. This option will function in the spirit of the existing
--no-verify command line option found in git commit. It will cause the
P4 Submit function to ignore the existing p4-pre-submit.
Change the execution of the existing trigger p4-pre-submit to honor the
--no-verify option. Before exiting on failure of this hook, display
text to the user explaining which hook has failed and the impact
of using the --no-verify option.
Change the call of the p4-pre-submit hook to use the new run_git_hook
function. This is in preparation of additional hooks to be added.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the p4-pre-submit exits with a non-zero exit code, the application
will abort the process with no additional information presented to the
user. This can be confusing for new users as it may not be clear that
the p4-pre-submit action caused the error.
Add text to explain why the program aborted the submit action.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is in preparation of introducing new p4 submit hooks.
The current code in the python script git-p4.py makes the assumption
that the git hooks can be executed by subprocess.call() function.
However, when git is run on Windows, this may not work as expected.
The subprocess.call() does not cover all the use cases for properly
executing the various types of executable files on Windows.
Prepare for remediation by adding a new function, run_git_hook, that
takes 2 parameters:
* the short filename of an optionally registered git hook
* an optional list of parameters
The run_git_hook function will honor the existing behavior seen in the
current code for executing the p4-pre-submit hook:
* Hooks are looked for in core.hooksPath directory.
* If core.hooksPath is not set, then the current .git/hooks directory
is checked.
* If the hook does not exist, the function returns True.
* If the hook file is not accessible, the function returns True.
* If the hook returns a zero exit code when executed, the function
return True.
* If the hook returns a non-zero exit code, the function returns False.
Add the following additional functionality if git-p4.py is run on
Windows.
* If hook file is not located without an extension, search for
any file in the associated hook directory (from the list above) that
has the same name but with an extension.
* If the file is still not found, return True (the hook is missing)
Add a new function run_hook_command() that wraps the OS dependent
functionality for actually running the subprocess.call() with OS
dependent behavior:
If a hook file exists on Windows:
* If there is no extension, set the launch executable to be SH.EXE
- Look for SH.EXE under the environmental variable EXEPATH in the
bin/ directory.
- If %EXEPATH%/bin/sh.exe exists, use this as the actual executable.
- If %EXEPATH%/bin/sh.exe does not exist, use sh.exe
- Execute subprocess.call() without the shell (shell=False)
* If there is an extension, execute subprocess.call() with teh shell
(shell=True) and consider the file to be the executable.
The return value from run_hook_command() is the subprocess.call()
return value.
These functions are added in this commit, but are only staged and not
yet used.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing function prompt(prompt_text) does not work correctly when
run on Windows 10 bash terminal when launched from the sourcetree
GUI application. The stdout is not flushed properly so the prompt text
is not displayed to the user until the next flush of stdout, which is
quite confusing.
Change this method by:
* Adding flush to stderr, stdout, and stdin
* Use readline from sys.stdin instead of raw_input.
The existing strip().lower() are retained.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This results in shorter output, and is _probably_ more portable. There
is at least one environment (GitHub Actions) which supports 16-color
mode but not 256-color mode. It's possible there are environments
which go the other way, but it seems unlikely.
Signed-off-by: Eyal Soha <shawarmakarma@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These colors are the bright variants of the 3-bit colors. Instead of
30-37 range for the foreground and 40-47 range for the background,
they live in 90-97 and 100-107 range, respectively.
Signed-off-by: Eyal Soha <shawarmakarma@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
color_output() takes a "type" parameter, which is either '3' or '4',
and that byte is shown in front of '0'-'7' to form "30"-"37" or
"40"-"47" in ANSI output mode for fore-ground and back-ground
colors.
Clarify the purpose of the parameter by renaming it to the
"background" that is a boolean.
Also, change the .value field in the color struct from storing 0-7
for basic 8 colors to storing 30-37 for ANSI colors. This aligns
the code to show ANSI colors to the code for the 256 color scheme,
which already uses the actual value to be sent to the terminal.
Signed-off-by: Eyal Soha <shawarmakarma@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do the same thing for each header: match it, copy it to a strbuf, and
decode it. Let's put that in a helper function to avoid repetition.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
RFC822 and friends allow arbitrary whitespace after the colon of a
header and before the values. I.e.:
Subject:foo
Subject: foo
Subject: foo
all have the subject "foo". But mailinfo requires exactly one space.
This doesn't seem to be bothering anybody, but it is pickier than the
standard specifies. And we can easily just soak up arbitrary whitespace
there in our parser, so let's do so.
Note that the test covers both too little and too much whitespace, but
the "too much" case already works fine (because we later eat leading and
trailing whitespace from the values).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our code to parse header values first checks to see if a line starts
with a header, and then manually skips past the matched string to find
the value. We can do this all in one step by modeling after
skip_prefix(), which returns a pointer into the string after the
parsing.
This lets us remove some repeated strings, and will also enable us to
parse more flexibly in a future patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We read each header line into a strbuf, which means that we could
in theory handle header values with embedded NUL bytes. But in practice,
the values we parse out are passed to decode_header(), which uses
strstr(), strchr(), etc. And we would not expect such bytes anyway; they
are forbidden by RFC822, etc. and any non-ASCII characters should be
encoded with RFC2047 encoding.
So let's switch to using strbuf_addstr(), which saves us some length
computations (and will enable further cleanups in this code).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase --fork-point master" used to work OK, as it internally
called "git merge-base --fork-point" that knew how to handle short
refname and dwim it to the full refname before calling the
underlying get_fork_point() function.
This is no longer true after the command was rewritten in C, as its
internall call made directly to get_fork_point() does not dwim a
short ref.
Move the "dwim the refname argument to the full refname" logic that
is used in "git merge-base" to the underlying get_fork_point()
function, so that the other caller of the function in the
implementation of "git rebase" behaves the same way to fix this
regression.
Signed-off-by: Alex Torok <alext9@gmail.com>
[jc: revamped the fix and used Alex's tests]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using Windows, a user may run 'git sparse-checkout set A\B\C'
to add the Unix-style path A/B/C to their sparse-checkout patterns.
Normalizing the input path converts the backslashes to slashes before we
add the string 'A/B/C' to the recursive hashset.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the sparse-checkout feature, a user may want to incrementally
grow their sparse-checkout pattern set. Allow adding patterns using a
new 'add' subcommand. This is not much different from the 'set'
subcommand, because we still want to allow the '--stdin' option and
interpret inputs as directories when in cone mode and patterns
otherwise.
When in cone mode, we are growing the cone. This may actually reduce the
set of patterns when adding directory A when A/B is already a directory
in the cone. Test the different cases: siblings, parents, ancestors.
When not in cone mode, we can only assume the patterns should be
appended to the sparse-checkout file.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of adding "add" and "remove" subcommands to the
sparse-checkout builtin, extract a modify_pattern_list() method from the
sparse_checkout_set() method. This command will read input from the
command-line or stdin to construct a set of patterns, then modify the
existing sparse-checkout patterns after a successful update of the
working directory.
Currently, the only way to modify the patterns is to replace all of the
patterns. This will be extended in a later update.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of extending the sparse-checkout builtin with "add"
and "remove" subcommands, extract the code that fills a pattern list
based on the input values. The input changes depending on the
presence of "--stdin" or the value of core.sparseCheckoutCone.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When renaming a remote with
git remote rename X Y
git remote remove X
Git already renames or removes any branch.<name>.remote and
branch.<name>.pushRemote configurations if their value is X.
However remote.pushDefault needs a more gentle approach, as this may be
set in a non-repo configuration file. In such a case only a warning is
printed, such as:
warning: The global configuration remote.pushDefault in:
$HOME/.gitconfig:35
now names the non-existent remote origin
It is changed to remote.pushDefault = Y or removed when set in a repo
configuration though.
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users are nowadays trained to see message from CLI tools in the form
<file>:<lno>: …
To be able to give such messages when notifying the user about
configurations in any config file, it is currently only possible to get
the file name (if the value originates from a file to begin with) via
`current_config_name()`. Now it is also possible to query the current line
number for the configuration.
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When renaming or removing a remote with
git remote rename X Y
git remote remove X
Git already renames/removes any config values from
branch.<name>.remote = X
to
branch.<name>.remote = Y
As branch.<name>.pushRemote also names a remote, it now also renames
or removes these config values from
branch.<name>.pushRemote = X
to
branch.<name>.pushRemote = Y
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some minor clean-ups in function `config_read_branches`:
* remove hardcoded length in `key += 7`
* call `xmemdupz` only once
* use a switch to handle the configuration type and add a `BUG()`
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 46af44b07d (pull --rebase=<type>: allow single-letter abbreviations
for the type, 2018-08-04) landed in Git, it had the side effect that
not only 'pull --rebase=<type>' accepted the single-letter abbreviations
but also the 'pull.rebase' and 'branch.<name>.rebase' configurations.
However, 'git remote rename' did not honor these single-letter
abbreviations when reading the 'branch.*.rebase' configurations.
We now document the single-letter abbreviations and both code places
share a common function to parse the values of 'git pull --rebase=*',
'pull.rebase', and 'branches.*.rebase'.
The only functional change is the handling of the `branch_info::rebase`
value. Before it was an unsigned enum, thus the truth value could be
checked with `branch_info::rebase != 0`. But `enum rebase_type` is
signed, thus the truth value must now be checked with
`branch_info::rebase >= REBASE_TRUE`
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a user queries config values with --show-origin, often it's
difficult to determine what the actual "scope" (local, global, etc.) of
a given value is based on just the origin file.
Teach 'git config' the '--show-scope' option to print the scope of all
displayed config values. Note that we should never see anything of
"submodule" scope as that is only ever used by submodule-config.c when
parsing the '.gitmodules' file.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before the changes to teach git_config_source to remember scope
information submodule-config.c never needed to consider the question of
config scope. Even though zeroing out git_config_source is still
correct and preserved the previous behavior of setting the scope to
CONFIG_SCOPE_UNKNOWN, it's better to be explicit about such situations
by explicitly setting the scope. As none of the current config_scope
enumerations make sense we create CONFIG_SCOPE_SUBMODULE to describe the
situation.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many situations where the scope of a config command is known
beforehand, such as passing of '--local', '--file', etc. to an
invocation of git config. However, this information is lost when moving
from builtin/config.c to /config.c. This historically hasn't been a big
deal, but to prepare for the upcoming --show-scope option we teach
git_config_source to keep track of the source and the config machinery
to use that information to set current_parsing_scope appropriately.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
do_git_config_sequence operated under the assumption that it was correct
to set current_parsing_scope to CONFIG_SCOPE_UNKNOWN as part of the
cleanup it does after it finishes execution. This is incorrect, as it
blows away the current_parsing_scope if do_git_config_sequence is called
recursively. As such situations are rare (git config running with the
'--blob' option is one example) this has yet to cause a problem, but the
upcoming '--show-scope' option will experience issues in that case, lets
teach do_git_config_sequence to preserve the current_parsing_scope from
before it started execution.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CONFIG_SCOPE_CMDLINE is generally used in the code to refer to config
values passed in via the -c option. Options passed in using this
mechanism share similar scoping characteristics with the --file and
--blob options of the 'config' command, namely that they are only in use
for that single invocation of git, and that they supersede the normal
system/global/local hierarchy. This patch introduces
CONFIG_SCOPE_COMMAND to reflect this new idea, which also makes
CONFIG_SCOPE_CMDLINE redundant.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously when iterating through git config variables, worktree config
and local config were both considered "CONFIG_SCOPE_REPO". This was
never a problem before as no one had needed to differentiate between the
two cases, but future functionality may care whether or not the config
options come from a worktree or from the repository's actual local
config file. For example, the planned feature to add a '--show-scope'
to config to allow a user to see which scope listed config options come
from would confuse users if it just printed 'repo' rather than 'local'
or 'worktree' as the documentation would lead them to expect. As well
as the additional benefit of making the implementation look more like
how the documentation describes the interface.
To accomplish this we split out what was previously considered repo
scope to be local and worktree.
The clients of 'current_config_scope()' who cared about
CONFIG_SCOPE_REPO are also modified to similarly care about
CONFIG_SCOPE_WORKTREE and CONFIG_SCOPE_LOCAL to preserve previous behavior.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To prepare for the upcoming --show-scope option, we require the ability
to convert a config_scope enum to a string. As this was originally
implemented as a static function 'scope_name()' in
t/helper/test-config.c, we expose it via config.h and give it a less
ambiguous name 'config_scope_name()'
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent update in the Linux VM images used by Azure Pipelines surfaced
a new problem in the "Documentation" job. Apparently, this warning
appears 396 times on `stderr` when running `make doc`:
/usr/lib/ruby/vendor_ruby/rubygems/defaults/operating_system.rb:10: warning: constant Gem::ConfigMap is deprecated
This problem was already reported to the `rubygems` project via
https://github.com/rubygems/rubygems/issues/3068.
As there is nothing Git can do about this warning, and as the
"Documentation" job reports this warning as a failure, let's just
silence it and move on.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify parse_options_dup() by making it a trivial wrapper of
parse_options_concat() by making use of the facts that the latter
duplicates its input as well and that appending an empty set is a no-op.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document the fact that the function doesn't modify the two option arrays
passed to it by adding the keyword const to each parameter.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a helper function to count the number of options (excluding the
final OPT_END()) and use it to simplify parse_options_dup() and
parse_options_concat().
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use COPY_ARRAY to copy whole arrays instead of iterating through the
elements. That's shorter, simpler and bit more efficient.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
handle_content_type() only cares about the value after "Content-Type: ";
there is no need to insert that string for it.
Suggested-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function for inserting a C string into a strbuf. Use it
throughout the source to get rid of magic string length constants and
explicit strlen() calls.
Like strbuf_addstr(), implement it as an inline function to avoid the
implicit strlen() calls to cause runtime overhead.
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description of the multi-pack-index contains a small bug,
if all offsets are < 2^32 then there will be no LOFF chunk,
not only if they're all < 2^31 (since the highest bit is only
needed as the "LOFF-escape" when that's actually needed.)
Correct this, and clarify that in that case only offsets up
to 2^31-1 can be stored in the OOFF chunk.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we exemplify the difference between `-G` and `-S` (using
`--pickaxe-regex`), we do so using an example diff and git-diff
invocation involving "regexec", "regexp", "regmatch", ...
The example is correct, but we can make it easier to untangle by
avoiding writing "regex.*" unless it's really needed to make our point.
Use some made-up, non-regexy words instead.
Reported-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bundle format was not documented. Describe the format with ABNF and
explain the meaning of each part.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To make this test work with SHA-256, compute two of the items in the
conflicted index entry. The other entry is a conflict within a conflict
and computing it is difficult, so use test_oid_cache to specify the
proper values for both hash algorithms.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding the length of an object ID when creating a tree,
compute it for the hash in use using the translation tables.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test used an object ID which was 40 hex characters in length,
causing the test not only not to pass, but to hang, when run with
SHA-256 as the hash. Change this value to a fixed dummy object ID using
test_oid_init and test_oid.
Furthermore, ensure we extract an object ID of the appropriate length
using cut with fields instead of a fixed length.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch two tests to use $ZERO_OID to represent the all-zeros object ID.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test contains a large number of data files, mostly using the same
object ID values for refs. Instead of producing two separate sets of
test files, keep the test files using SHA-1 and translate them on the
fly by replacing the SHA-1 values with the values for the current hash
algorithm in use.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the proper pack constants defined in lib-pack.sh to make this test
work with SHA-256.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make this test hash independent by computing the length of the object
offsets and looking up values which will hash to object IDs with the
right properties.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the proper pack constants defined in lib-pack.sh to make this test
work with SHA-256.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compute the length of object IDs and pack offsets instead of hard-coding
constants.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this test, there are two main types of object IDs we see in the
diffs: the ones for the submodules, which we care about, and the ones
for the individual files, which are unrelated to what we're testing.
Much of the test already computes the former, so extend the rest of the
test to do so as well. Add a diff comparison function that normalizes
the differences in the latter, since they're not explicitly what we're
testing.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are already files containing example output for SHA-1. Add test
files providing example output for SHA-256 as well and adjust the test
to look up the appropriate ones based on the algorithm in use.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for adding SHA-256 support to this test, let's move the
SHA-1-specific expected output into a directory called "sha1". This
will allow us to add a similar directory for SHA-256 as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test produces a large number of diff formats and compares the
output with test files that have content specific to SHA-1. Since we are
more interested in the format of the diffs, and not their specific
values, which are tested elsewhere, add a function which uses sed to
transform these specific object IDs into generic ones of the right size,
which we can then compare.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the hard-coded SHA-1 constants with the use of test_oid to look
up an appropriate constant for each hash algorithm. In addition, adjust
the fanout checks to look for either zero or one slashes in the filename
without needing to check for an explicit length.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the hard-coded SHA-1 constants with the use of test_oid to look
up an appropriate constant for each hash algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the hard-coded SHA-1 constants with the use of test_oid to look
up an appropriate constant for each hash algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the hard-coded SHA-1 constants with the use of test_oid to look
up an appropriate constant for each hash algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the one assertion in this test that still uses SHA-1 to use test_oid
to be independent of the hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the support routines for generating packs to support both SHA-1
and SHA-256. Compute the trailing pack checksum and its length
correctly depending on the algorithm, and look up the object names based
on the algorithm as well. Ensure we initialize the algorithm facts so
that our callers need not do so.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 'err' contains output for multiple submodules and is printed all
at once by fetch_populated_submodules(), errors for each submodule
should be newline separated for readability. The same strbuf is added to
with a newline in the other half of the conditional where this error is
detected, so make the two consistent.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
advice.addNothing config variable is used to control the visibility of
two advice messages in the add library. This config variable is
replaced by two new variables, whose names are more clear and relevant
to the two cases.
Also add the two new variables to the documentation.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--recurse-submodules" option of various subcommands did not
work well when run in an alternate worktree, which has been
corrected.
* pb/recurse-submodule-in-worktree-fix:
submodule.c: use get_git_dir() instead of get_git_common_dir()
t2405: clarify test descriptions and simplify test
t2405: use git -C and test_commit -C instead of subshells
t7410: rename to t2405-worktree-submodule.sh
A fetch that is told to recursively fetch updates in submodules
inevitably produces reams of output, and it becomes hard to spot
error messages. The command has been taught to enumerate
submodules that had errors at the end of the operation.
* es/fetch-show-failed-submodules-atend:
fetch: emphasize failure during submodule fetch
Corner case bugs in "git clean" that stems from a (necessarily for
performance reasons) awkward calling convention in the directory
enumeration API has been corrected.
* en/fill-directory-fixes-more:
dir: point treat_leading_path() warning to the right place
dir: restructure in a way to avoid passing around a struct dirent
dir: treat_leading_path() and read_directory_recursive(), round 2
clean: demonstrate a bug with pathspecs
Preparation of test scripts for the day when the object names will
use SHA-256 continues.
* bc/hash-independent-tests-part-7:
t5604: make hash independent
t5601: switch into repository to hash object
t5562: use $ZERO_OID
t5540: make hash size independent
t5537: make hash size independent
t5530: compute results based on object length
t5512: abstract away SHA-1-specific constants
t5510: make hash size independent
t5504: make hash algorithm independent
t5324: make hash size independent
t5319: make test work with SHA-256
t5319: change invalid offset for SHA-256 compatibility
t5318: update for SHA-256
t4300: abstract away SHA-1-specific constants
t4204: make hash size independent
t4202: abstract away SHA-1-specific constants
t4200: make hash size independent
t4134: compute appropriate length constant
t4066: compute index line in diffs
t4054: make hash-size independent
"git checkout X" did not correctly fail when X is not a local
branch but could name more than one remote-tracking branches
(i.e. to be dwimmed as the starting point to create a corresponding
local branch), which has been corrected.
* am/checkout-file-and-ref-ref-ambiguity:
checkout: don't revert file on ambiguous tracking branches
parse_branchname_arg(): extract part as new function
The final leg of rewriting "add -i/-p" in C.
* js/add-p-leftover-bits:
ci: include the built-in `git add -i` in the `linux-gcc` job
built-in add -p: handle Escape sequences more efficiently
built-in add -p: handle Escape sequences in interactive.singlekey mode
built-in add -p: respect the `interactive.singlekey` config setting
terminal: add a new function to read a single keystroke
terminal: accommodate Git for Windows' default terminal
terminal: make the code of disable_echo() reusable
built-in add -p: handle diff.algorithm
built-in add -p: support interactive.diffFilter
t3701: adjust difffilter test
The effort to move "git-add--interactive" to C continues.
* js/patch-mode-in-others-in-c:
commit --interactive: make it work with the built-in `add -i`
built-in add -p: implement the "worktree" patch modes
built-in add -p: implement the "checkout" patch modes
built-in stash: use the built-in `git add -p` if so configured
legacy stash -p: respect the add.interactive.usebuiltin setting
built-in add -p: implement the "stash" and "reset" patch modes
built-in add -p: prepare for patch modes other than "stage"
Test clean-up.
* dl/test-must-fail-fixes:
t1507: inline full_name()
t1507: run commands within test_expect_success
t1507: stop losing return codes of git commands
t1501: remove use of `test_might_fail cp`
t1409: use test_path_is_missing()
t1409: let sed open its own input file
t1307: reorder `nongit test_must_fail`
t1306: convert `test_might_fail rm` to `rm -f`
t0020: use ! check_packed_refs_marked
t0020: don't use `test_must_fail has_cr`
t0003: don't use `test_must_fail attr_check`
t0003: use test_must_be_empty()
t0003: use named parameters in attr_check()
t0000: replace test_must_fail with run_sub_test_lib_test_err()
t/lib-git-p4: use test_path_is_missing()
name_ref() is called for each ref and checks if its a better name for
the referenced commit. If that's the case it remembers it and checks if
a name based on it is better for its ancestors as well. This in done in
the the order for_each_ref() imposes on us.
That might not be optimal. If bad names happen to be encountered first
(as defined by is_better_name()), names derived from them may spread to
a lot of commits, only to be replaced by better names later. Setting
better names first can avoid that.
is_better_name() prefers tags, short distances and old references. The
distance is a measure that we need to calculate for each candidate
commit, but the other two properties are not dependent on the
relationships of commits. Sorting the refs by them should yield better
performance than the essentially random order we currently use.
And applying older references first should also help to reduce rework
due to the fact that older commits have less ancestors than newer ones.
So add all details of names to the tip table first, then sort them
to prefer tags and older references and then apply them in this order.
Here's the performance as measures by hyperfine for the Linux repo
before:
Benchmark #1: ./git -C ../linux/ name-rev --all
Time (mean ± σ): 851.1 ms ± 4.5 ms [User: 806.7 ms, System: 44.4 ms]
Range (min … max): 845.9 ms … 859.5 ms 10 runs
... and with this patch:
Benchmark #1: ./git -C ../linux/ name-rev --all
Time (mean ± σ): 736.2 ms ± 8.7 ms [User: 688.4 ms, System: 47.5 ms]
Range (min … max): 726.0 ms … 755.2 ms 10 runs
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
name_rev() assigns a name to a commit and its parents and grandparents
and so on. Commits share their name string with their first parent,
which in turn does the same, recursively to the root. That saves a lot
of allocations. When a better name is found, the old name is replaced,
but its memory is not released. That leakage can become significant.
Can we release these old strings exactly once even though they are
referenced multiple times? Yes, indeed -- we can make use of the fact
that name_rev() visits the ancestors of a commit after it set a new name
for it and tries to update their names as well.
Members of the first ancestral line have the same taggerdate and
from_tag values, but a higher distance value than their child commit at
generation 0. These are the only criteria used by is_better_name().
Lower distance values are considered better, so a name that is better
for a child will also be better for its parent and grandparent etc.
That means we can free(3) an inferior name at generation 0 and rely on
name_rev() to replace all references in ancestors as well.
If we do that then we need to stop using the string pointer alone to
distinguish new empty rev_name slots from initialized ones, though, as
it technically becomes invalid after the free(3) call -- even though its
value is still different from NULL.
We can check the generation value first, as empty slots will have it
initialized to 0, and for the actual generation 0 we'll set a new valid
name right after the create_or_update_name() call that releases the
string.
For the Chromium repo, releasing superceded names reduces the memory
footprint of name-rev --all significantly. Here's the output of GNU
time before:
0.98user 0.48system 0:01.46elapsed 99%CPU (0avgtext+0avgdata 2601812maxresident)k
0inputs+0outputs (0major+571470minor)pagefaults 0swaps
... and with this patch:
1.01user 0.26system 0:01.28elapsed 100%CPU (0avgtext+0avgdata 1559196maxresident)k
0inputs+0outputs (0major+314370minor)pagefaults 0swaps
It also gets faster; hyperfine before:
Benchmark #1: ./git -C ../chromium/src name-rev --all
Time (mean ± σ): 1.534 s ± 0.006 s [User: 1.039 s, System: 0.494 s]
Range (min … max): 1.522 s … 1.542 s 10 runs
... and with this patch:
Benchmark #1: ./git -C ../chromium/src name-rev --all
Time (mean ± σ): 1.338 s ± 0.006 s [User: 1.047 s, System: 0.291 s]
Range (min … max): 1.327 s … 1.346 s 10 runs
For the Linux repo it doesn't pay off; memory usage only gets down from:
0.76user 0.03system 0:00.80elapsed 99%CPU (0avgtext+0avgdata 292848maxresident)k
0inputs+0outputs (0major+44579minor)pagefaults 0swaps
... to:
0.78user 0.03system 0:00.81elapsed 100%CPU (0avgtext+0avgdata 284696maxresident)k
0inputs+0outputs (0major+44892minor)pagefaults 0swaps
The runtime actually increases slightly from:
Benchmark #1: ./git -C ../linux/ name-rev --all
Time (mean ± σ): 828.8 ms ± 5.0 ms [User: 797.2 ms, System: 31.6 ms]
Range (min … max): 824.1 ms … 838.9 ms 10 runs
... to:
Benchmark #1: ./git -C ../linux/ name-rev --all
Time (mean ± σ): 847.6 ms ± 3.4 ms [User: 807.9 ms, System: 39.6 ms]
Range (min … max): 843.4 ms … 854.3 ms 10 runs
Why is that? In the Chromium repo, ca. 44000 free(3) calls in
create_or_update_name() release almost 1GB, while in the Linux repo
240000+ calls release a bit more than 5MB, so the average discarded
name is ca. 1000x longer in the latter.
Overall I think it's the right tradeoff to make, as it helps curb the
memory usage in repositories with big discarded names, and the added
overhead is small.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leave setting the tip_name member of struct rev_name to callers of
create_or_update_name(). This avoids allocations for names that are
rejected by that function. Here's how this affects the runtime when
working with a fresh clone of Git's own repository; performance numbers
by hyperfine before:
Benchmark #1: ./git -C ../git-pristine/ name-rev --all
Time (mean ± σ): 437.8 ms ± 4.0 ms [User: 422.5 ms, System: 15.2 ms]
Range (min … max): 432.8 ms … 446.3 ms 10 runs
... and with this patch:
Benchmark #1: ./git -C ../git-pristine/ name-rev --all
Time (mean ± σ): 408.5 ms ± 1.4 ms [User: 387.2 ms, System: 21.2 ms]
Range (min … max): 407.1 ms … 411.7 ms 10 runs
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can calculate the size of new name easily and precisely. Open-code
the xstrfmt() calls and grow the buffers as needed before filling them.
This provides a surprisingly large benefit when working with the
Chromium repository; here are the numbers measured using hyperfine
before:
Benchmark #1: ./git -C ../chromium/src name-rev --all
Time (mean ± σ): 5.822 s ± 0.013 s [User: 5.304 s, System: 0.516 s]
Range (min … max): 5.803 s … 5.837 s 10 runs
... and with this patch:
Benchmark #1: ./git -C ../chromium/src name-rev --all
Time (mean ± σ): 1.527 s ± 0.003 s [User: 1.015 s, System: 0.511 s]
Range (min … max): 1.524 s … 1.535 s 10 runs
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reduce nesting by moving code to come up with a name for the parent into
its own function.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit slab commit_rev_name contains a pointer to a struct rev_name,
and the actual struct is allocated separatly. Avoid that allocation and
pointer indirection by storing the full struct in the commit slab. Use
the tip_name member pointer to determine if the returned struct is
initialized.
Performance in the Linux repository measured with hyperfine before:
Benchmark #1: ./git -C ../linux/ name-rev --all
Time (mean ± σ): 953.5 ms ± 6.3 ms [User: 901.2 ms, System: 52.1 ms]
Range (min … max): 945.2 ms … 968.5 ms 10 runs
... and with this patch:
Benchmark #1: ./git -C ../linux/ name-rev --all
Time (mean ± σ): 851.0 ms ± 3.1 ms [User: 807.4 ms, System: 43.6 ms]
Range (min … max): 846.7 ms … 857.0 ms 10 runs
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Look up the commit slab slot for the commit once using
commit_rev_name_at() and populate it in case it is empty, instead of
checking for emptiness in a separate step using commit_rev_name_peek()
via get_commit_rev_name().
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
name_ref() duplicates the path string and passes it to name_rev(), which
either puts it into a commit slab or ignores it if there is already a
better name, leaking it. Move the duplication to name_rev() and release
the copy in the latter case.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Keep the const qualifier of the first parameter of get_rev_name() even
when casting the object pointer to a commit pointer, and further for the
parameter of get_commit_rev_name(), as all these uses are read-only.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The type alias became unused with bf43abc6e6 (name-rev: use sizeof(*ptr)
instead of sizeof(type) in allocation, 2019-11-12); remove it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code was moved straight out of name_rev(). As such, we inherited
the "goto" to jump from an if into an else-if. We also inherited the
fact that "nothing to do -- return NULL" is handled last.
Rewrite the function to first handle the "nothing to do" case. Then we
can handle the conditional allocation early before going on to populate
the struct. No need for goto-ing.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're resolving a REF_DELTA, we compare-and-swap its type from
REF_DELTA to whatever real type the base object has, as discussed in
ab791dd138 (index-pack: fix race condition with duplicate bases,
2014-08-29). If the old type wasn't a REF_DELTA, we consider that a
BUG(). But as discussed in that commit, we might see this case whenever
we try to resolve an object twice, which may happen because we have
multiple copies of the base object.
So this isn't a bug at all, but rather a sign that the input pack is
broken. And indeed, this case is triggered already in t5309.5 and
t5309.6, which create packs with delta cycles and duplicate bases. But
we never noticed because those tests are marked expect_failure.
Those tests were added by b2ef3d9ebb (test index-pack on packs with
recoverable delta cycles, 2013-08-23), which was leaving the door open
for cases that we theoretically _could_ handle. And when we see an
already-resolved object like this, in theory we could keep going after
confirming that the previously resolved child->real_type matches
base->obj->real_type. But:
- enforcing the "only resolve once" rule here saves us from an
infinite loop in other parts of the code. If we keep going, then the
delta cycle in t5309.5 causes us to loop infinitely, as
find_ref_delta_children() doesn't realize which objects have already
been resolved. So there would be more changes needed to make this
case work, and in the meantime we'd be worse off.
- any pack that triggers this is broken anyway. It either has a
duplicate base object, or it has a cycle which causes us to bring in
a duplicate via --fix-thin. In either case, we'd end up rejecting
the pack in write_idx_file(), which also detects duplicates.
So the tests have little value in documenting what we _could_ be doing
(and have been neglected for 6+ years). Let's switch them to confirming
that we handle this case cleanly (and switch out the BUG() for a more
informative die() so that we do so).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As noted in the previous commit, the nature of the fanout heuristic
in the notes code causes the exact point at which we increase or
decrease the notes fanout to vary with the objects being annotated.
Since the object ids generated by the test environment are
deterministic (by design), the notes generated and tested by t3305
are always the same, and we therefore happen to see the same fanout
behavior from one run to the next.
Coincidentally, if we were to change the test environment slightly
(say by making a test commit on an unrelated branch before we start
the t3305 test proper), we not only see the fanout switch happen at
different points, we also manage to trigger a _bug_ in the notes
code where the fanout 1 -> 0 switch is not applied uniformly across
the notes tree, but instead yields a notes tree like this:
...
bdeafb301e44b0e4db0f738a2d2a7beefdb70b70
bff2d39b4f7122bd4c5caee3de353a774d1e632a
d3/8ec8f851adf470131178085bfbaab4b12ad2a7
e0b173960431a3e692ae929736df3c9b73a11d5b
eb3c3aede523d729990ac25c62a93eb47c21e2e3
...
The bug occurs when we are writing out a notes tree with a newly
decreased fanout, and the notes tree contains unexpanded subtrees
that should be consolidated into the parent tree as a consequence of
the decreased fanout):
Subtrees that happen to sit at an _even_ level in the internal notes
16-tree structure (in other words: subtrees whose path - "d3" in the
example above - is unique in the first nibble - i.e. there are no
other note paths that start with "d") are _not_ unpacked as part of
the tree writeout. This error will repeat itself in subsequent note
trees until the subtree is forced to be unpacked. In t3305 this only
happens when the d38ec8f8 note is itself removed from the tree.
The error is not severe (no information is lost, and the notes code
is able to read/decode this tree and manipulate it correctly), but
this is nonetheless a bug in the current implementation that should
be fixed.
That said, fixing the off-by-one error is not without complications:
We must take into account that the load_subtree() call from
for_each_note_helper() (that is now done to correctly unpack the
subtree while we're writing out the notes tree) may end up inserting
unpacked non-notes into the linked list of non_note entries held by
the struct notes_tree. Since we are in the process of writing out the
notes tree, this linked list is currently in the process of being
traversed by write_each_non_note_until(). The unpacked non-notes are
necessarily inserted between the last non-note we wrote out, and the
next non-note to be written. Hence, we cannot simply hold the
next_non_note to write in struct write_each_note_data (as we would
then silently skip these newly inserted notes), but must instead
always follow the ->next pointer from the last non-note we wrote.
(This part was caught by an existing test in t3304.)
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Brian M. Carlson <sandals@crustytoothpaste.net>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In short, before this patch, this test script:
- creates many notes
- verifies that all notes in the notes tree has a fanout of 1
- removes most notes
- verifies that the notes in the notes tree now has a fanout of 0
The fanout verification only happened twice: after creating all the
notes, and after removing most of them.
This patch strengthens the test by checking the fanout after _each_
added/removed note: We assert that the switch from fanout 0 -> 1
happens exactly once while adding notes (and that the switch pervades
the entire notes tree). Likewise, we assert that the switch from
fanout 1 -> 0 happens exactly once while removing notes.
Additionally, we decrease the number of notes left after removal,
from 50 to 15 notes, in order to ensure that fanout 1 -> 0 transition
keeps happening regardless of external factors[1].
[1]: Currently (with the SHA1 hash function and the deterministic
object ids of the test environment) the fanout heuristic in the notes
code happens to switch from 0 -> 1 at 109 notes, and from 1 -> 0 at
59 notes. However, changing the hash function or other external
factors will vary these numbers, and the latter may - in theory - go
as low as 15. For more details, please see the discussion at
https://public-inbox.org/git/20200125230035.136348-4-sandals@crustytoothpaste.net/
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Brian M. Carlson <sandals@crustytoothpaste.net>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this paragraph, we have a few instances of the '^' character, which
we give as "\^". This renders well with AsciiDoc ("^"), but Asciidoctor
renders it literally as "\^". Dropping the backslashes renders fine
with Asciidoctor, but not AsciiDoc...
An earlier version of this patch used "{caret}" instead of "^", which
avoided these escaping problems. The rendering was still so-so, though
-- these expressions end up set as normal text, similarly to when one
provides, e.g., computer code in the middle of running text, without
properly marking it with `backticks` to be monospaced.
As noted by Jeff King, this suggests actually wrapping these
expressions in backticks, setting them in monospace.
The lone "5" could be left as is or wrapped as `5`. Spell it out as
"five" instead -- this generally looks better anyway for small numbers
in the middle of text like this.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply a similar treatment as in the previous patch to pass a 'struct
object_directory *' through the 'load_commit_graph_one_fd_st'
initializer, too.
This prevents a potential bug where a pointer comparison is made to a
NULL 'g->odb', which would cause the commit-graph machinery to think
that a pair of commit-graphs belonged to different alternates when in
fact they do not (i.e., in the case of no '--object-dir').
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As of the previous patch, all calls to 'commit-graph.c' functions which
perform path normalization (for e.g., 'get_commit_graph_filename()') are
of the form 'ctx->odb->path', which is always in normalized form.
Now that there are no callers passing non-normalized paths to these
functions, ensure that future callers are bound by the same restrictions
by making these functions take a 'struct object_directory *' instead of
a 'const char *'. To match, replace all calls with arguments of the form
'ctx->odb->path' with 'ctx->odb' To recover the path, functions that
perform path manipulation simply use 'odb->path'.
Further, avoid string comparisons with arguments of the form
'odb->path', and instead prefer raw pointer comparisons, which
accomplish the same effect, but are far less brittle.
This has a pleasant side-effect of making these functions much more
robust to paths that cannot be normalized by 'normalize_path_copy()',
i.e., because they are outside of the current working directory.
For example, prior to this patch, Valgrind reports that the following
uninitialized memory read [1]:
$ ( cd t && GIT_DIR=../.git valgrind git rev-parse HEAD^ )
because 'normalize_path_copy()' can't normalize '../.git' (since it's
relative to but above of the current working directory) [2].
By using a 'struct object_directory *' directly,
'get_commit_graph_filename()' does not need to normalize, because all
paths are relative to the current working directory since they are
always read from the '->path' of an object directory.
[1]: https://lore.kernel.org/git/20191027042116.GA5801@sigill.intra.peff.net.
[2]: The bug here is that 'get_commit_graph_filename()' returns the
result of 'normalize_path_copy()' without checking the return
value.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a previous patch, the 'char *object_dir' in 'struct commit_graph' was
replaced with a 'struct object_directory'. This patch applies the same
treatment to 'struct commit_graph', which is another intermediate step
towards getting rid of all path normalization in 'commit-graph.c'.
Instead of taking a 'char *object_dir', functions that construct a
'struct commit_graph' now take a 'struct object_directory *'. Any code
that needs an object directory path use '->path' instead.
This ensures that all calls to functions that perform path normalization
are given arguments which do not themselves require normalization. This
prepares those functions to drop their normalization entirely, which
will occur in the subsequent patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are lots of places in 'commit-graph.h' where a function either has
(or almost has) a full 'struct object_directory *', accesses '->path',
and then throws away the rest of the struct.
This can cause headaches when comparing the locations of object
directories across alternates (e.g., in the case of deciding if two
commit-graph layers can be merged). These paths are normalized with
'normalize_path_copy()' which mitigates some comparison issues, but not
all [1].
Replace usage of 'char *object_dir' with 'odb->path' by storing a
'struct object_directory *' in the 'write_commit_graph_context'
structure. This is an intermediate step towards getting rid of all path
normalization in 'commit-graph.c'.
Resolving a user-provided '--object-dir' argument now requires that we
compare it to the known alternates for equality. Prior to this patch,
an unknown '--object-dir' argument would silently exit with status zero.
This can clearly lead to unintended behavior, such as verifying
commit-graphs that aren't in a repository's own object store (or one of
its alternates), or causing a typo to mask a legitimate commit-graph
verification failure. Make this error non-silent by 'die()'-ing when the
given '--object-dir' does not match any known alternate object store.
[1]: In my testing, for example, I can get one side of the commit-graph
code to fill object_dir with "./objects" and the other with just
"objects".
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have test coverage for "git submodule status" output in
various cases, i.e.
1) not-init, not-cloned: status should initially be "missing"
2) init, not-cloned: status should be "missing"
3) not-init, cloned: status should ignore the inner git-repo
4) init, cloned: status should be "up-to-date" after update
4.1) + modified: status should be "modified" after submodule commit
4.2) + modified, committed: status should be "up-to-date" after update
the case 3) is not covered yet.
Test that submodule status reports an inner git repo as unknown, while
it is not added to the superproject. This covers case (3).
Signed-off-by: Peter Kaestle <peter@piie.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unpack-trees API depends on the tree-walk API. But we've recently
introduced a dependency in tree-walk.c on MAX_UNPACK_TREES, which
doesn't otherwise care about unpack-trees at all.
Let's break that dependency by reversing the constants: we'll introduce
a new MAX_TRAVERSE_TREES which belongs to the tree-walk API. And then we
can define MAX_UNPACK_TREES in terms of that (since unpack-trees cannot
possibly work with more trees than it can traverse at once via
tree-walk).
The value for both will remain at 8. This is somewhat arbitrary and
probably more than is necessary, per ca885a4fe6 (read-tree() and
unpack_trees(): use consistent limit, 2008-03-13), but there's not
really any pressing need to reduce it.
Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The intention of the special "cone mode" in the sparse-checkout
feature is to always match the same patterns that are matched by the
same sparse-checkout file as when cone mode is disabled.
When a file path is given to "git sparse-checkout set" in cone mode,
then the cone mode improperly matches the file as a recursive path.
When setting the skip-worktree bits, files were not expecting the
MATCHED_RECURSIVE response, and hence these were left out of the
matched cone.
Fix this bug by checking for MATCHED_RECURSIVE in addition to MATCHED
and add a test that prevents regression.
Reported-by: Finn Bryant <finnbryant@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing documentation does not clarify how the 'set' subcommand
changes when core.sparseCheckoutCone is enabled. Correct this by
changing some language around the "A/B/C" example. Also include a
description of the input format matching the output of 'git ls-tree
--name-only'.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout patterns allow special globs according to
fnmatch(3). When writing cone-mode patterns for paths containing
these characters, they must be escaped.
Use is_glob_special() to check which characters must be escaped
this way, and add a path to the tests that contains all glob
characters at once. Note that ']' is not special, since the
initial bracket '[' is escaped.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When in cone mode, the 'git sparse-checkout list' subcommand lists
the directories included in the sparse cone. When these directories
contain odd characters, such as a backslash, then we need to use
C-style quotes similar to 'git ls-tree'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a user somehow creates a directory with an asterisk (*) or backslash
(\), then the "git sparse-checkout set" command will struggle to provide
the correct pattern in the sparse-checkout file. When not in cone mode,
the provided pattern is written directly into the sparse-checkout file.
However, in cone mode we expect a list of paths to directories and then
we convert those into patterns.
Even more specifically, the goal is to always allow the following from
the root of a repo:
git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin
The ls-tree command provides directory names with an unescaped asterisk.
It also quotes the directories that contain an escaped backslash. We
must remove these quotes, then keep the escaped backslashes.
Use unquote_c_style() when parsing lines from stdin. Command-line
arguments will be parsed as-is, assuming the user can do the correct
level of escaping from their environment to match the exact directory
names.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a user somehow creates a directory with an asterisk (*) or backslash
(\), then the "git sparse-checkout set" command will struggle to provide
the correct pattern in the sparse-checkout file. When not in cone mode,
the provided pattern is written directly into the sparse-checkout file.
However, in cone mode we expect a list of paths to directories and then
we convert those into patterns.
However, there is some care needed for the timing of these escapes. The
in-memory pattern list is used to update the working directory before
writing the patterns to disk. Thus, we need the command to have the
unescaped names in the hashsets for the cone comparisons, then escape
the patterns later.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cone mode, the sparse-checkout feature uses hashset containment
queries to match paths. Make this algorithm respect escaped asterisk
(*) and backslash (\) characters.
Create dup_and_filter_pattern() method to convert a pattern by
removing escape characters and dropping an optional "/*" at the end.
This method is available in dir.h as we will use it in
builtin/sparse-checkout.c in a later change.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cone mode, the sparse-checkout commmand will write patterns that
allow faster pattern matching. This matching only works if the patterns
in the sparse-checkout file are those written by that command. Users
can edit the sparse-checkout file and create patterns that cause the
cone mode matching to fail.
The cone mode patterns may end in "/*" but otherwise an un-escaped
asterisk or other glob character is invalid. Add checks to disable
cone mode when seeing these values.
A later change will properly handle escaped globs.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We often skip an optional prefix in a string with a hardcoded
constant, e.g.
if (starts_with(string, "prefix"))
string += 6;
which is less error prone when written
skip_prefix(string, "prefix", &string);
Note that this changes a few error messages from "git reflog expire
--expire=nonsense.timestamp", which used to complain by saying
'--expire=nonsense.timestamp' is not a valid timestamp
but with this change, we say
'nonsense.timestamp' is not a valid timestamp
which is more technically correct (the string with --expire= as
a prefix obviously cannot be a valid timestamp, but the error is
about the part of the input without that prefix).
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
f0fd0dc5c5 (submodule foreach: document '$sm_path' instead of '$path',
2018-05-08) updated the documentation to advise callers to favor
$sm_path over the deprecated synonym $path. However, the example in
that section still uses $path. Update it to use $sm_path.
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In f237c8b6fe (commit-graph: implement git-commit-graph write,
2018-04-02) the test t5318.3 was introduced to ensure that calling 'git
commit-graph write' in a repository with no packfiles does not write any
commit-graph file(s).
To exercise more paths in 'builtin/commit-graph.c', this test passes
'--object-dir' to 'git commit-graph write', but the given argument
refers to the working copy, not the object directory.
Since the commit-graph sub-commands currently swallow these errors, this
does not result in a test failure. But, it is only lucky that the test
ends with no commit-graphs, since there were none to begin with.
In preparation for a future commit where an '--object-dir' argument that
does not match a known object directory will print out a failure, let's
fix the test to still use '--object-dir', but pass the correct location
to the object store instead of '.'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We parse diff.wsErrorHighlight in git_diff_ui_config(), meaning that it
doesn't take effect for plumbing commands, only for porcelains like
git-diff itself. This is mildly annoying as it means scripts like
add--interactive, which produce a user-visible diff with color, don't
respect the option.
We could teach that script to parse the config and pass it along as
--ws-error-highlight to the diff plumbing. But there's a simpler
solution.
It should be reasonably safe for plumbing to respect this option, as it
only kicks in when color is otherwise enabled. And anybody parsing
colorized output must already deal with the fact that color.diff.* may
change the exact output they see; those options have been part of
git_diff_basic_config() since its inception in 9a1805a872 (add a "basic"
diff config callback, 2008-01-04).
So we can just move it to the "basic" config, which fixes
add--interactive, along with any other script in the same boat, with a
very low risk of hurting any plumbing users.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some callers of check_object_signature() can work on arbitrary
repositories, but the repo does not get passed to this function.
Instead, the_repository is always used internally. To fix possible
inconsistencies, allow the function to receive a struct repository and
make those callers pass on the repo being handled.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow hash_object_file() to work on arbitrary repos by introducing a
git_hash_algo parameter. Change callers which have a struct repository
pointer in their scope to pass on the git_hash_algo from the said repo.
For all other callers, pass on the_hash_algo, which was already being
used internally at hash_object_file(). This functionality will be used
in the following patch to make check_object_signature() be able to work
on arbitrary repos (which, in turn, will be used to fix an
inconsistency at object.c:parse_object()).
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow write_object_file_prepare() to receive arbitrary 'struct
git_hash_algo's instead of always using the_hash_algo. The added
parameter will be used in the next commit to make hash_object_file() be
able to work with arbitrary git_hash_algo's, as well.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some callers of open_istream() at archive-tar.c and archive-zip.c are
capable of working on arbitrary repositories but the repo struct is not
passed down to open_istream(), which uses the_repository internally. For
now, that's not a problem since the said callers are only being called
with the_repository. But to be consistent and avoid future problems,
let's allow open_istream() to receive a struct repository and use that
instead of the_repository. This parameter addition will also be used in
a future patch to make sha1-file.c:check_object_signature() be able to
work on arbitrary repos.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At verify_packfile(), use the git_hash_algo from the provided repository
instead of the_hash_algo, for consistency. Like the previous patch, this
shouldn't bring any behavior changes, since this function is currently
only receiving the_repository.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
verify_one() takes a struct repository argument but uses the_hash_algo
internally. Replace it with the provided repo's git_hash_algo, for
consistency. For now, this is mainly a cosmetic change, as all callers
of this function currently only pass the_repository to it.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff_populate_filespec() takes a struct repository argument but it
doesn't get passed down to read_object_file().
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify documentation on committer/author identities.
* bc/author-committer-doc:
doc: provide guidance on user.name format
docs: expand on possible and recommended user config options
doc: move author and committer information to git-commit(1)
Minor bugfixes to "git add -i" that has recently been rewritten in C.
* js/builtin-add-i-cmds:
built-in add -i: accept open-ended ranges again
built-in add -i: do not try to `patch`/`diff` an empty list of files
Work around test breakages caused by custom regex engine used in
libasan, when address sanitizer is used with more recent versions
of gcc and clang.
* jk/asan-build-fix:
Makefile: use compat regex with SANITIZE=address
The command line completion (in contrib/) learned to complete
subcommands and arguments to "git worktree".
* sg/completion-worktree:
completion: list paths and refs for 'git worktree add'
completion: list existing working trees for 'git worktree' subcommands
completion: simplify completing 'git worktree' subcommands and options
completion: return the index of found word from __git_find_on_cmdline()
completion: clean up the __git_find_on_cmdline() helper function
t9902-completion: add tests for the __git_find_on_cmdline() helper
The test-lint machinery knew to check "VAR=VAL shell_function"
construct, but did not check "VAR= shell_funciton", which has been
corrected.
* jn/test-lint-one-shot-export-to-shell-function:
fetch test: mark test of "skipping" haves as v0-only
t/check-non-portable-shell: detect "FOO= shell_func", too
fetch test: avoid use of "VAR= cmd" with a shell function
gpg.minTrustLevel configuration variable has been introduced to
tell various signature verification codepaths the required minimum
trust level.
* hi/gpg-mintrustlevel:
gpg-interface: add minTrustLevel as a configuration option
Rendering by "git log --graph" of ancestry lines leading to a merge
commit were made suboptimal to waste vertical space a bit with a
recent update, which has been corrected.
* ds/graph-horizontal-edges:
graph: fix collapse of multiple edges
graph: add test to demonstrate horizontal line bug
The code recently added in this release to move to the entry beyond
the ones in the same directory in the index in the sparse-cone mode
did not count the number of entries to skip over incorrectly, which
has been corrected.
* ds/sparse-cone:
.mailmap: fix GGG authoship screwup
unpack-trees: correctly compute result count
Tell .editorconfig that in this project, *.txt files are indented
with tabs.
* hi/indent-text-with-tabs-in-editorconfig:
editorconfig: indent text files with tabs
We heap-allocate our arrays of name_entry structs, etc, with one entry
per tree we're asked to traverse. The code does a raw multiplication in
the xmalloc() call, which I find when auditing for integer overflows
during allocation.
We could "fix" this by using ALLOC_ARRAY() instead. But as it turns out,
the maximum size of these arrays is limited at compile time:
- merge_trees() always passes in 3 trees
- unpack_trees() and its brethren never pass in more than
MAX_UNPACK_TREES
So we can simplify even further by just using a stack array and bounding
it with MAX_UNPACK_TREES. There should be no concern with overflowing
the stack, since MAX_UNPACK_TREES is only 8 and the structs themselves
are small.
Note that since we're replacing xcalloc(), we have to move one of the
NULL initializations into a loop.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We compute the length of an array of object_id's with a raw
multiplication. In theory this could trigger an integer overflow which
would cause an under-allocation (and eventually an out of bounds write).
I doubt this can be triggered in practice, since you'd need to feed it
an enormous number of target objects, which would typically come from
the ref advertisement and be using proportional memory. And even on
64-bit systems, where "int" is much smaller than "size_t", that should
hold: even though "targets" is an int, the multiplication will be done
as a size_t because of the use of sizeof().
But we can easily fix it by using ALLOC_ARRAY(), which uses st_mult()
under the hood.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We take a "dst" buffer to write into, but there's no matching "len"
parameter. The hidden assumption is that normalizing always makes things
smaller, so we're OK as long as "dst" is at least as big as "src". Let's
document that explicitly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check we can talk to the remote host before starting the git-fastimport
subchild.
Otherwise we fail to connect, and then exit, leaving git-fastimport
still running since we did not wait() for it.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After an error, git-p4 calls die(). This just exits, and leaves child
processes still running.
Instead of calling die(), raise an exception and catch it where the
child process(es) (git-fastimport) are created.
This was analyzed in detail here:
https://public-inbox.org/git/20190227094926.GE19739@szeder.dev/
This change does not address the particular issue of p4CmdList()
invoking a subchild and not waiting for it on error.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes it easier to try/catch around this block of code to ensure
cleanup following p4 failures is handled properly.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pylint is incredibly useful for finding bugs, but git-p4 has never used
it, so there are a lot of warnings that while important, don't actually
result in bugs.
Let's turn those off for now, so we can get some useful output.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently when there is a P4 error, git-p4 calls die() which just exits.
This then leaves the git-fast-import process still running, and can even
leave p4 itself still running.
As a result, git-p4 fails to exit cleanly. This is a particular problem
for people running the unit tests in regression.
Use this exception to report errors upwards, cleaning up as the error
propagates.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ensure that we can safely call self.closeStreams() multiple times, and
can also call it even if there is no git fast-import stream at all.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are several error messages in get_oid() and its children that are
clearly intended for humans, but aren't marked for translation. E.g.:
$ git show :1:foo
fatal: Path 'foo' is in the index, but not at stage 1.
Did you mean ':0:foo'?
Let's mark these for translation. While we're at it, let's switch the
style to be more like our usual error messages: start with a lowercase
letter and omit a period at the end of the line.
This does mean that multi-line messages like the one above don't have
any punctuation between the two sentences. I solved that by adding a
"hint" marker like we'd see from advise(). So the result is:
$ git show :1:foo
fatal: path 'foo' is in the index, but not at stage 1
hint: Did you mean ':0:foo'?
A few tests had to be switched to test_i18ngrep and test_i18ncmp. Since
we were touching them anyway, I also simplified the ones using i18ngrep
a bit for readability.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a filter is specified, we do not need a full connectivity check on
the contents of the packfile we just fetched; we only need to check that
the objects referenced are promisor objects.
This significantly speeds up fetches into repositories that have many
promisor objects, because during the connectivity check, all promisor
objects are enumerated (to mark them UNINTERESTING), and that takes a
significant amount of time.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit dfa33a298d ("clone: do faster object check for partial clones",
2019-04-21) optimized the connectivity check done when cloning with
--filter to check only the existence of objects directly pointed to by
refs. But this is not sufficient: they also need to be promisor objects.
Make this check more robust by instead checking that these objects are
promisor objects, that is, they appear in a promisor pack.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git --git-dir <path> is a bit confusing and sometimes doesn't work as
the user would expect it to.
For example, if the user runs `git --git-dir=<path> status`, git
will skip the repository discovery algorithm and will assign the
work tree to the user's current work directory unless otherwise
specified. When this assignment is wrong, the output will not match
the user's expectations.
This patch updates the documentation to make it clearer.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 13185fd241 (l10n: zh_TW.po: update translation for v2.25.0 round 1,
2019-12-31), the author mistakenly used their GitHub username for
authorship information instead of their real name. However, a commit
with their real name exists prior to this: 9917eca794 (l10n: zh_TW: add
translation for v2.24.0, 2019-11-20).
Map their email to their real name so that these contributions can be
counted together.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since grep learned to recurse into submodules in 0281e487fd
(grep: optionally recurse into submodules, 2016-12-16),
using --recurse-submodules along with --no-index makes Git
die().
This is unfortunate because if submodule.recurse is set in a user's
~/.gitconfig, invoking `git grep --no-index` either inside or outside
a Git repository results in
fatal: option not supported with --recurse-submodules
Let's allow using these options together, so that setting submodule.recurse
globally does not prevent using `git grep --no-index`.
Using `--recurse-submodules` should not have any effect if `--no-index`
is used inside a repository, as Git will recurse into the checked out
submodule directories just like into regular directories.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for push.default mentions that it is used if no
refspec is "explicitly given". Let's drop the notion of "explicit" here,
since it's vague, and just mention that any refspec from anywhere is
sufficient to override this.
I've dropped the mention of "explicitly given" from the definition of
the "nothing" value right below, too. It's close enough to our
clarification that it should be obvious we mean the same type of "given"
here.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the previous two commits, UBSan with clang-11 complains about
computing offsets from a NULL pointer. The failures in t4013 (and
elsewhere) look like this:
kwset.c:102:23: runtime error: applying non-zero offset 107820859019600 to null pointer
...
not ok 79 - git log -SF master # magic is (not used)
That line is not enlightening:
... = obstack_alloc(&kwset->obstack, sizeof (struct trie));
because obstack is implemented almost entirely in macros, and the actual
problem is five macros deep (I temporarily converted them to inline
functions to get better compiler errors, which was tedious but worked
reasonably well).
The actual problem is in these pointer-alignment macros:
/* If B is the base of an object addressed by P, return the result of
aligning P to the next multiple of A + 1. B and P must be of type
char *. A + 1 must be a power of 2. */
#define __BPTR_ALIGN(B, P, A) ((B) + (((P) - (B) + (A)) & ~(A)))
/* Similar to _BPTR_ALIGN (B, P, A), except optimize the common case
where pointers can be converted to integers, aligned as integers,
and converted back again. If PTR_INT_TYPE is narrower than a
pointer (e.g., the AS/400), play it safe and compute the alignment
relative to B. Otherwise, use the faster strategy of computing the
alignment relative to 0. */
#define __PTR_ALIGN(B, P, A) \
__BPTR_ALIGN (sizeof (PTR_INT_TYPE) < sizeof (void *) ? (B) : (char *) 0, \
P, A)
If we have a sufficiently-large integer pointer type, then we do the
computation using a NULL pointer constant. That turns __BPTR_ALIGN()
into something like:
NULL + (P - NULL + A) & ~A
and UBSan is complaining about adding the full value of P to that
initial NULL. We can fix this by doing our math as an integer type, and
then casting the result back to a pointer. The problem case only happens
when we know that the integer type is large enough, so there should be
no issue with truncation.
Another option would be just simplify out all the 0's from
__BPTR_ALIGN() for the NULL-pointer case. That probably wouldn't work
for a platform where the NULL pointer isn't all-zeroes, but Git already
wouldn't work on such a platform (due to our use of memset to set
pointers in structs to NULL). But I tried here to keep as close to the
original as possible.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the previous commit, clang-11's UBSan complains about computing
offsets from a NULL pointer, causing some tests to fail. In this case,
though, we're actually computing a non-zero offset, which is even more
dubious. From t7810:
xdiff-interface.c:268:14: runtime error: applying non-zero offset 1 to null pointer
...
not ok 131 - grep -p with userdiff
The problem is our parsing of the funcname config. We count the number
of lines in the string, allocate an array, and then loop over our
allocated entries, parsing each line and moving our cursor to one past
the trailing newline for the next iteration.
But the final line will not generally have a trailing newline (since
it's a config value), and hence we go to one past NULL. In practice this
is OK, since our loop should terminate before we look at the value. But
even computing such an invalid pointer technically violates the
standard.
We can fix it by leaving the pointer at NULL if we're at the end, rather
than one-past. And while we're thinking about it, we can also document
the variant by asserting that our initial line-count matches the
second-pass of parsing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Undefined Behavior Sanitizer in clang-11 seems to have learned a new
trick: it complains about computing offsets from a NULL pointer, even if
that offset is 0. This causes numerous test failures. For example, from
t1090:
unpack-trees.c:1355:41: runtime error: applying zero offset to null pointer
...
not ok 6 - in partial clone, sparse checkout only fetches needed blobs
The code in question looks like this:
struct cache_entry **cache_end = cache + nr;
...
while (cache != cache_end)
and we sometimes pass in a NULL and 0 for "cache" and "nr". This is
conceptually fine, as "cache_end" would be equal to "cache" in this
case, and we wouldn't enter the loop at all. But computing even a zero
offset violates the C standard. And given the fact that UBSan is
noticing this behavior, this might be a potential problem spot if the
compiler starts making unexpected assumptions based on undefined
behavior.
So let's just avoid it, which is pretty easy. In some cases we can just
switch to iterating with a numeric index (as we do in sequencer.c here).
In other cases (like the cache_end one) the use of an end pointer is
more natural; we can keep that by just explicitly checking for the
NULL/0 case when assigning the end pointer.
Note that there are two ways you can write this latter case, checking
for the pointer:
cache_end = cache ? cache + nr : cache;
or the size:
cache_end = nr ? cache + nr : cache;
For the case of a NULL/0 ptr/len combo, they are equivalent. But writing
it the second way (as this patch does) has the property that if somebody
were to incorrectly pass a NULL pointer with a non-zero length, we'd
continue to notice and segfault, rather than silently pretending the
length was zero.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bug was that ie_match_stat() was used to compare if the stat info
for the file was compatible with the stat info in the index, rather
using ie_modified() to check if the file was in fact different from the
version in the index.
A version of this (with deinit instead of rm) was reported here:
https://lore.kernel.org/git/CAPOqYV+C-P9M2zcUBBkD2LALPm4K3sxSut+BjAkZ9T1AKLEr+A@mail.gmail.com/
It seems that in that case, the user's clone command left the index
with empty stat info. The mailing list was unable to reproduce this.
But we (Two Sigma) hit the bug while using some plumbing commands, so
I'm fixing it. I manually confirmed that the fix also repairs deinit
in this scenario.
Signed-off-by: David Turner <dturner@twosigma.com>
Reported-by: Thomas Bétous <th.betous@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some (but not all!) redirections in this file are spelled "2> error".
Let's switch them to our usual style.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the shell "test" here is inflexible, because we can't easily swap
it out for an i18n-aware version like we can with test_cmp and
test_i18ncmp. And it's not even saving us any processes, since we have
to use "cat" to get the output. So let's switch to using test_cmp, which
has the added bonus that it will produce better output if there's a
failure.
Note that not all of the changed outputs here are candidates for
translation, but I've converted all of them for consistency and to
benefit from the better output.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When set to "warn" or "error", `rebase.missingCommitsCheck' would make
`rebase -i' warn if the user removed commits from the todo list to
prevent mistakes. Unfortunately, `rebase --edit-todo' and `rebase
--continue' don't take it into account.
This adds the ability for `rebase --edit-todo' and `rebase --continue'
to check if commits were dropped by the user. As both edit_todo_list()
and complete_action() parse the todo list and check for dropped commits,
the code doing so in the latter is removed to reduce duplication.
`edit_todo_list_advice' is removed from sequencer.c as it is no longer
used there.
This changes when a backup of the todo list is made. Until now, it was
saved only once, before the initial edit. Now, it is also made if the
original todo list has no errors or no dropped commits. Thus, the
backup should be error-free. Without this, sequencer_continue()
(`rebase --continue') could only compare the current todo list against
the original, unedited list. Before this change, this file was only
used by edit_todo_list() and `rebase -p' to create the backup before
the initial edit, and check_todo_list_from_file(), only used by
`rebase -p' to check for dropped commits after its own initial edit.
If the edited list has an error, a file, `dropped', is created to
report the issue. Otherwise, it is deleted. Usually, the edited list
is compared against the list before editing, but if this file exists,
it will be compared to the backup. Also, if the file exists,
sequencer_continue() checks the list for dropped commits. If the
check was performed every time, it would fail when resuming a rebase
after resolving a conflict, as the backup will contain commits that
were picked, but they will not be in the new list. It's safe to
ignore this check if `dropped' does not exist, because that means that
no errors were found at the last edition, so any missing commits here
have already been picked.
Five tests are added to t3404. The tests for
`rebase.missingCommitsCheck = warn' and `rebase.missingCommitsCheck =
error' have a similar structure. First, we start a rebase with an
incorrect command on the first line. Then, we edit the todo list,
removing the first and the last lines. This demonstrates that
`--edit-todo' notices dropped commits, but not when the command is
incorrect. Then, we restore the original todo list, and edit it to
remove the last line. This demonstrates that if we add a commit after
the initial edit, then remove it, `--edit-todo' will notice that it
has been dropped. Then, the actual rebase takes place. In the third
test, it is also checked that `--continue' will refuse to resume the
rebase if commits were dropped. The fourth test checks that no errors
are raised when resuming a rebase after resolving a conflict, the fifth
checks that no errors are raised when editing the todo list after
pausing the rebase.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The message contained in `edit_todo_list_advice' (sequencer.c) is
printed after the initial edit of the todo list if it can't be parsed or
if commits were dropped. This is done either in complete_action() for
`rebase -i', or in check_todo_list_from_file() for `rebase -p'.
Since we want to add this check when editing the list, we also want to
use this message from edit_todo_list() (rebase-interactive.c). To this
end, check_todo_list_from_file() is moved to rebase-interactive.c, and
`edit_todo_list_advice' is copied there. In the next commit,
complete_action() will stop using it, and `edit_todo_list_advice' will
be removed from sequencer.c.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail function should only be used for git commands since
we should assume that external commands work sanely. Since apply_patch
wraps a sed and git invocation, rewrite it to accept an `!` argument
which would cause only the git command to be prefixed with
`test_must_fail`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() function should only be used for git commands since
we should assume that external commands work sanely. Replace
`test_must_fail test_path_exists` with `test_path_is_missing` since we
expect these paths to not exist.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have some test cases which are indented 7-spaces instead of a tab.
Reindent with a tab instead.
This patch should appear empty with `--ignore-all-space`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test with disabled rerere should make sure that the cherry-picked
result does not have the conflict replaced with a recorded resolution.
It attempts to do so by ensuring that the file content is _not_ equal
to some other file. That by itself is a very dubious check because just
about every random result of an incomplete cherry-pick would satisfy
the condition.
In this case, the intent was to check that the conflicting file does
_not_ contain the resolved content. But the check actually uses the
wrong reference file, but not the resolved content. Needless to say
that the non-equality is satisfied. And, on top of it, it uses a commit
that does not even touch the file that is checked.
Do check for the expected result, which is content from both sides of
the merge and merge conflicts. (The latter we check for just the
middle separator for brevity.)
As a side-effect, this also removes an incorrect use of test_must_fail.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix invocation of git command so its exit codes is not lost within
a non-assignment command substitution.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are using `test_must_fail test_auto_{fixup,squash}` which would
ensure that the function failed. However, this is a little ham-fisted as
there is no way of ensuring that the expected part of the function
actually fails.
Increase the granularity by accepting an optional `!` first argument
which will check that the rebase does not squash in any commits by
ensuring that there are still 4 commits. If `!` is not provided, the old
logic is run.
This patch may be better reviewed with `--ignore-all-space`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix invocations of git commands so their exit codes are not lost
within non-assignment command substitutions.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have many statements which are duplicated. Extract and replace these
duplicate statements with notes_merge_files_gone().
While we're at it, replace the test_might_fail(), which should only be
used on git commands.
Also, remove the redirection from stderr to /dev/null. This is because
the test scripts automatically suppress output already. Otherwise, if a
developer asks for verbose output via the `-v` flag, the stderr output
may be useful for debugging.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use `test_must_fail test -d` to ensure that the directory is removed.
However, test_must_fail() should only be used for git commands. Use
test_path_is_missing() instead to check that the directory has been
removed.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As part of the effort to become more hash-agnostic, replace all
instances of "sha" with "oid".
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix invocations of git commands so their exit codes are not lost
within non-assignment command substitutions.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are running `test_must_fail do_checkout`. However,
`test_must_fail` should only be used on git commands. Teach
do_checkout() to accept `!` as a potential first argument which will
cause the function to expect the "git checkout" to fail.
This increases the granularity of the test as, instead of blindly
checking that do_checkout() failed, we check that only the specific
expected invocation of git fails.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Functions test_dirty_unmergeable() and test_dirty_mergeable()
expect git-diff to exit with the specific code 1. However, rather
than checking for that value explicitly, they instead negate the
exit code. Unfortunately, this negation makes it impossible to
distinguish the expected code from some other unexpected non-zero
code, for instance, from a segmentation fault. Therefore, be more
discerning by checking the exit code explicitly using
test_expect_code().
Furthermore, some callers of those functions want to negate the
result again, and do so with test_must_fail(). However,
test_must_fail() should only be used with git commands. Address
this by introducing a couple new tiny helper functions which test
the exact condition expected (without the unnecessarily confusing
double-negation).
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6462d5eb9a ("fetch: remove fetch_if_missing=0", 2019-11-08)
contains a test that relies on having to lazily fetch the delta base of
a blob, but assumes that the tree being fetched (as part of the test) is
sent as a non-delta object. This assumption may not hold in the future;
for example, a change in the length of the object hash might result in
the tree being sent as a delta instead.
Make the test more robust by relying on having to lazily fetch the delta
base of the tree instead, and by making no assumptions on whether the
blobs are sent as delta or non-delta.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The flip_stage() helper uses a bit-flipping xor to switch between "2"
and "3". While clever, this relies on a property of those two numbers
that is mostly coincidence. Let's write it as a subtraction; that's more
clear and would extend to other numbers if somebody copies the logic.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The merge-recursive code uses stage number constants like this:
add = &ci->ren1->dst_entry->stages[2 ^ 1];
...
add = &ci->ren2->dst_entry->stages[3 ^ 1];
The xor has the effect of flipping the "1" bit, so that "2 ^ 1" becomes
"3" and "3 ^ 1" becomes "2", which correspond to the "ours" and "theirs"
stages respectively.
Unfortunately, clang-10 and up issue a warning for this code:
merge-recursive.c:1759:40: error: result of '2 ^ 1' is 3; did you mean '1 << 1' (2)? [-Werror,-Wxor-used-as-pow]
add = &ci->ren1->dst_entry->stages[2 ^ 1];
~~^~~
1 << 1
merge-recursive.c:1759:40: note: replace expression with '0x2 ^ 1' to silence this warning
We could silence it by using 0x2, as the compiler mentions. Or by just
using the constants "2" and "3" directly. But after digging into it, I
do think this bit-flip is telling us something. If we just wrote:
add = &ci->ren2->dst_entry->stages[2];
for the second one, you might think that "ren2" and "2" correspond. But
they don't. The logic is: ren2 is theirs, which is stage 3, but we
are interested in the opposite side's stage, so flip it to 2.
So let's keep the bit-flipping, but let's also put it behind a named
function, which will make its purpose a bit clearer. This also has the
side effect of suppressing the warning (and an optimizing compiler
should be able to easily turn it into a constant as before).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 28fcc0b71a (pathspec: avoid the need of "--" when wildcard is
used, 2015-05-02) allowed:
git rev-parse '*.c'
without the double-dash. But the rule it uses to check for wildcards
actually looks for any glob special. This is overly liberal, as it means
that a pattern that doesn't actually do any wildcard matching, like
"a\b", will be considered a pathspec.
If you do have such a file on disk, that's presumably what you wanted.
But if you don't, the results are confusing: rather than say "there's no
such path a\b", we'll quietly accept it as a pathspec which very likely
matches nothing (or at least not what you intended). Likewise, looking
for path "a\*b" doesn't expand the search at all; it would only find a
single entry, "a*b".
This commit switches the rule to trigger only when glob metacharacters
would expand the search, meaning both of those cases will now report an
error (you can still disambiguate using "--", of course; we're just
tightening the DWIM heuristic).
Note that we didn't test the original feature in 28fcc0b71a at all. So
this patch not only tests for these corner cases, but also adds a
regression test for the existing behavior.
Reported-by: David Burström <davidburstrom@spotify.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 49e268e23e (mingw: safeguard better against backslashes in file
names, 2020-01-09), the commit author is listed as
"Johannes Schindelin via GitGitGadget <gitgitgadget@gmail.com>", which
is erroneous. Fix the authorship by mapping the erroneous authorship to
his canonical authorship information.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Original bash helper for "submodule status" was doing a check for
initialized but not cloned submodules and prefixed the status with
a minus sign in case no .git file or folder was found inside the
submodule directory.
This check was missed when the original port of the functionality
from bash to C was done.
Signed-off-by: Peter Kaestle <peter.kaestle@nokia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have test coverage for "git submodule status" output in
various cases, i.e.
1) not-init, not-cloned: status should initially be "missing"
2) init, not-cloned: status should be "missing"
3) not-init, cloned:
4) init, cloned: status should be "up-to-date" after update
4.1) + modified: status should be "modified" after submodule commit
4.2) + modified, committed: status should be "up-to-date" after update
but the cases 2) and 3) are not covered.
Test that submodule status reports initialized but not cloned
submodules as missing to fill the gap in test coverage; this covers
case (2) above, but case (3) remains uncovered.
Signed-off-by: Peter Kaestle <peter.kaestle@nokia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With https://lore.kernel.org/git/20191114194708.GD60198@google.com/ we
now have a mentoring mailing list, to which we should direct new
contributors who have questions.
Mention #git-devel, which is targeted for Git contributors; asking for
help with getting a first contribution together is on-topic for that
channel. Also mention some of the conventions in case folks are
unfamiliar with IRC.
Because the mentoring list and #git-devel are both a subset of Git
contributors, finally list the main Git list and mention some of the
posting conventions.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cone mode, the shortest pattern the sparse-checkout command will
write into the sparse-checkout file is "/*". This is handled carefully
in add_pattern_to_hashsets(), so warn if any other pattern is this
short. This will assist future pattern checks by allowing us to assume
there are at least three characters in the pattern.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When core.sparseCheckoutCone is enabled, the 'git sparse-checkout set'
command creates a restricted set of possible patterns that are used
by a custom algorithm to quickly match those patterns.
If a user manually edits the sparse-checkout file, then they could
create patterns that do not match these expectations. The cone-mode
matching algorithm can return incorrect results. The solution is to
detect these incorrect patterns, warn that we do not recognize them,
and revert to the standard algorithm.
Check each pattern for the "**" substring, and revert to the old
logic if seen. While technically a "/<dir>/**" pattern matches
the meaning of "/<dir>/", it is not one that would be written by
the sparse-checkout builtin in cone mode. Attempting to accept that
pattern change complicates the logic and instead we punt and do
not accept any instance of "**".
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --sparse option was added to the clone builtin in d89f09c (clone:
add --sparse mode, 2019-11-21) and was tested with a local path clone
in t1091-sparse-checkout-builtin.sh. However, due to a difference in
how local paths are handled versus URLs, this mechanism does not work
with URLs.
Modify the test to use a "file://" URL, which would output this error
before the code change:
Cloning into 'clone'...
fatal: cannot change to 'file://.../repo': No such file or directory
error: failed to initialize sparse-checkout
These errors are due to using a "-C <path>" option to call 'git -C
<path> sparse-checkout init' but the URL is being given instead of
the target directory.
Update that target directory to evaluate this correctly. I have also
manually tested that https:// URLs are handled correctly as well.
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git init' command creates the ".git/info" directory and fills it
with some default files. However, 'git worktree add' does not create
the info directory for that worktree. This causes a problem when running
"git sparse-checkout init" inside a worktree. While care was taken to
allow the sparse-checkout config to be specific to a worktree, this
initialization was untested.
Safely create the leading directories for the sparse-checkout file. This
is the safest thing to do even without worktrees, as a user could delete
their ".git/info" directory and expect Git to recover safely.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t1091-sparse-checkout-builtin.sh uses here-docs to populate the
expected contents of the sparse-checkout file. These do not use
shell interpolation, so use "-\EOF" instead of "-EOF". Also use
proper tabbing.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When testing the sparse-checkout feature, we need to compare the
contents of the working-directory against some expected output.
Using here-docs was useful in the beginning, but became repetetive
as the test script grew.
Create a check_files helper to make the tests simpler and easier
to extend. It also reduces instances of bad here-doc whitespace.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests that required a custom configuration file to be created previously
used a file with non-alphanumeric characters including escaped double
quotes. This is not really necessary for the majority of tests
involving custom config files, and decreases test coverage on systems
that dissallow such filenames (Windows, etc.).
Create two files, one appropriate for testing quoting and one
appropriate for general use.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare for the following patches by removing extraneous indents from
HERE-DOCs used in config tests.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git config use of the end_null variable to determine if we should be
null terminating our output. While it is correct to say a string is
"null terminated" the character is actually the "nul" character, so this
malapropism is being fixed.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the first things done when using a sequencer-based
rebase (ie. `rebase -i', `rebase -r', or `rebase -m') is to make a todo
list. This requires knowledge of the commit range to rebase. To get
the oid of the last commit of the range, the tip of the branch to rebase
is checked out with prepare_branch_to_be_rebased(), then the oid of the
head is read. After this, the tip of the branch is not even modified.
The `am' backend, on the other hand, does not check out the branch.
On big repositories, it's a performance penalty: with `rebase -i', the
user may have to wait before editing the todo list while git is
extracting the branch silently, and "quiet" rebases will be slower than
`am'.
Since we already have the oid of the tip of the branch in
`opts->orig_head', it's useless to switch to this commit.
This removes the call to prepare_branch_to_be_rebased() in
do_interactive_rebase(), and adds a `orig_head' parameter to
get_revision_ranges(). prepare_branch_to_be_rebased() is removed as it
is no longer used.
This introduces a visible change: as we do not switch on the tip of the
branch to rebase, no reflog entry is created at the beginning of the
rebase for it.
Unscientific performance measurements, performed on linux.git, are as
follow:
Before this patch:
$ time git rebase -m --onto v4.18 463fa44eec2fef50~ 463fa44eec2fef50
real 0m8,940s
user 0m6,830s
sys 0m2,121s
After this patch:
$ time git rebase -m --onto v4.18 463fa44eec2fef50~ 463fa44eec2fef50
real 0m1,834s
user 0m0,916s
sys 0m0,206s
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new config value for core.fsmonitorHookVersion was added to be able
to force the version of the fsmonitor hook. Possible values are 1 or 2.
When this is not set the code will use a value of -1 and attempt to use
version 2 of the hook first and if that fails will attempt version 1.
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Version 2 of the fsmonitor hooks is passed the version and an update
token and must pass back a last update token to use for subsequent calls
to the hook.
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In af09ce2 ("sparse-checkout: init and set in cone mode", 2019-11-21),
the '--cone' option was added to 'git sparse-checkout init'.
Document it.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `rebase.missingCommitsCheck` is in effect, we use the backup of the
todo list that was copied just before the user was allowed to edit it.
That backup is, of course, just as susceptible to the hash collision as
the todo list itself: a reworded commit could make a previously
unambiguous short commit ID ambiguous all of a sudden.
So let's not just copy the todo list, but let's instead write out the
backup with expanded commit IDs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 66ae9a57b8 (t3404: rebase -i: demonstrate short SHA-1 collision,
2013-08-23), we added a test case that demonstrated how it is possible
that a previously unambiguous short commit ID could become ambiguous
*during* a rebase.
In 75c6976655 (rebase -i: fix short SHA-1 collision, 2013-08-23), we
fixed that problem simply by writing out the todo list with expanded
commit IDs (except *right* before letting the user edit the todo list,
in which case we shorten them, but we expand them right after the file
was edited).
However, the bug resurfaced as a side effect of 393adf7a6f (sequencer:
directly call pick_commits() from complete_action(), 2019-11-24): as of
this commit, the sequencer no longer re-reads the todo list after
writing it out with expanded commit IDs.
The only redeeming factor is that the todo list is already parsed at
that stage, including all the commits corresponding to the commands,
therefore the sequencer can continue even if the internal todo list has
short commit IDs.
That does not prevent problems, though: the sequencer writes out the
`done` and `git-rebase-todo` files incrementally (i.e. overwriting the
todo list with a version that has _short_ commit IDs), and if a merge
conflict happens, or if an `edit` or a `break` command is encountered, a
subsequent `git rebase --continue` _will_ re-read the todo list, opening
an opportunity for the "short SHA-1 collision" bug again.
To avoid that, let's make sure that we do expand the commit IDs in the
todo list as soon as we have parsed it after letting the user edit it.
Additionally, we improve the 'short SHA-1 collide' test case in t3404 to
test specifically for the case where the rebase is resumed. We also
hard-code the expected colliding short SHA-1s, to document the
expectation (and to make it easier on future readers).
Note that we specifically test that the short commit ID is used in the
`git-rebase-todo.tmp` file: this file is created by the fake editor in
the test script and reflects the state that would have been presented to
the user to edit.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the case that a `get_oid()` call failed, we showed some rather bogus
part of the line instead of the precise string we sent to said function.
That makes it rather hard for users to understand what is going wrong,
so let's fix that.
While at it, return a negative value from `parse_insn_line()` in case of
an error, as per our convention. This function's only caller,
`todo_list_parse_insn_buffer()`, cares only whether that return value is
non-zero or not, i.e. does not need to be changed.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We no longer compute bitmap_git->reuse_objects, so we
cannot rely on it anymore to terminate the loop early;
we have to iterate to the end.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Additional checks are added in have_duplicate_entry() and
obj_is_packed() to avoid duplicate objects in the reuse
bitmap. It was probably buggy to not have such a check
before.
Git as a client would never both asks for a tag by sha1 and
specify "include-tag", but libgit2 will, so a libgit2 client
cloning from a Git server would trigger the bug.
If a client both asks for a tag by sha1 and specifies
"include-tag", we may end up including the tag in the reuse
bitmap (due to the first thing), and then later adding it to
the packlist (due to the second). This results in duplicate
objects in the pack, which git chokes on. We should notice
that we are already including it when doing the include-tag
portion, and avoid adding it to the packlist.
The simplest place to fix this is right in add_ref_tag(),
where we could avoid peeling the tag at all if we know that
we are already including it. However, this pushes the check
instead into have_duplicate_entry(). This fixes not only
this case, but also means that we cannot have any similar
problems lurking in other code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The old code to reuse deltas from an existing packfile
just tried to dump a whole segment of the pack verbatim.
That's faster than the traditional way of actually adding
objects to the packing list, but it didn't kick in very
often. This new code is really going for a middle ground:
do _some_ per-object work, but way less than we'd
traditionally do.
The general strategy of the new code is to make a bitmap
of objects from the packfile we'll include, and then
iterate over it, writing out each object exactly as it is
in our on-disk pack, but _not_ adding it to our packlist
(which costs memory, and increases the search space for
deltas).
One complication is that if we're omitting some objects,
we can't set a delta against a base that we're not
sending. So we have to check each object in
try_partial_reuse() to make sure we have its delta.
About performance, in the worst case we might have
interleaved objects that we are sending or not sending,
and we'd have as many chunks as objects. But in practice
we send big chunks.
For instance, packing torvalds/linux on GitHub servers
now reused 6.5M objects, but only needed ~50k chunks.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's refactor the way we check if an object is packed by
introducing obj_is_packed(). This function is now a simple
wrapper around packlist_find(), but it will evolve in a
following commit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's make it possible to configure if we want pack reuse or not.
The main reason it might not be wanted is probably debugging and
performance testing, though pack reuse _might_ cause larger packs,
because we wouldn't consider the reused objects as bases for
finding new deltas.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We will need this helper function in a following commit
to give us total number of bytes fed to the hashfile so far.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's refactor bitmap_has_oid_in_uninteresting() using
bitmap_walk_contains().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
bitmap_has_oid_in_uninteresting() only used bitmap_position_packfile(),
not bitmap_position(). So it wouldn't find objects which weren't in the
bitmapped packfile (i.e., ones where we extended the bitmap to handle
loose objects, or objects in other packs).
As we could reuse a delta against such an object it is suboptimal not
to use bitmap_position(), so let's use it instead of
bitmap_position_packfile().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We will use this helper function in a following commit to
tell us if an object is packed.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a following commit we will need to allocate a variable
number of bitmap words, instead of always 32, so let's add
bitmap_word_alloc() for this purpose.
Note that we have to adjust the block growth in bitmap_set(),
since a caller could now use an initial size of "0" (we don't
plan to do that, but it doesn't hurt to be defensive).
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git restore --staged" did not correctly update the cache-tree
structure, resulting in bogus trees to be written afterwards, which
has been corrected.
* nd/switch-and-restore:
restore: invalidate cache-tree when removing entries with --staged
Reduce unnecessary round-trip when running "ls-remote" over the
stateless RPC mechanism.
* jk/no-flush-upon-disconnecting-slrpc-transport:
transport: don't flush when disconnecting stateless-rpc helper
Complete an update to tutorial that encourages "git switch" over
"git checkout" that was done only half-way.
* hw/tutorial-favor-switch-over-checkout:
doc/gitcore-tutorial: fix prose to match example command
The code that tries to skip over the entries for the paths in a
single directory using the cache-tree was not careful enough
against corrupt index file.
* es/unpack-trees-oob-fix:
unpack-trees: watch for out-of-range index position
has_object_file() said "no" given an object registered to the
system via pretend_object_file(), making it inconsistent with
read_object_file(), causing lazy fetch to attempt fetching an
empty tree from promisor remotes.
* jt/sha1-file-remove-oi-skip-cached:
sha1-file: remove OBJECT_INFO_SKIP_CACHED
"git commit" gives output similar to "git status" when there is
nothing to commit, but without honoring the advise.statusHints
configuration variable, which has been corrected.
* hw/commit-advise-while-rejecting:
commit: honor advice.statusHints when rejecting an empty commit
Sample credential helper for using .netrc has been updated to work
out of the box.
* dl/credential-netrc:
contrib/credential/netrc: work outside a repo
contrib/credential/netrc: make PERL_PATH configurable
Ever since df56607dff (git-common-dir: make "modules/"
per-working-directory directory, 2014-11-30), submodules in linked worktrees
are cloned to $GIT_DIR/modules, i.e. $GIT_COMMON_DIR/worktrees/<name>/modules.
However, this convention was not followed when the worktree updater commands
checkout, reset and read-tree learned to recurse into submodules. Specifically,
submodule.c::submodule_move_head, introduced in 6e3c1595c6 (update submodules:
add submodule_move_head, 2017-03-14) and submodule.c::submodule_unset_core_worktree,
(re)introduced in 898c2e65b7 (submodule: unset core.worktree if no working tree
is present, 2018-12-14) use get_git_common_dir() instead of get_git_dir()
to get the path of the submodule repository.
This means that, for example, 'git checkout --recurse-submodules <branch>'
in a linked worktree will correctly checkout <branch>, detach the submodule's HEAD
at the commit recorded in <branch> and update the submodule working tree, but the
submodule HEAD that will be moved is the one in $GIT_COMMON_DIR/modules/<name>/,
i.e. the submodule repository of the main superproject working tree.
It will also rewrite the gitfile in the submodule working tree of the linked worktree
to point to $GIT_COMMON_DIR/modules/<name>/.
This leads to an incorrect (and confusing!) state in the submodule working tree
of the main superproject worktree.
Additionally, if switching to a commit where the submodule is not present,
submodule_unset_core_worktree will be called and will incorrectly remove
'core.wortree' from the config file of the submodule in the main superproject worktree,
$GIT_COMMON_DIR/modules/<name>/config.
Fix this by constructing the path to the submodule repository using get_git_dir()
in both submodule_move_head and submodule_unset_core_worktree.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'checkout --to' functionality was moved to 'worktree add', tests were adapted
in f194b1ef6e (tests: worktree: retrofit "checkout --to" tests for "worktree add",
2015-07-06).
The calls were changed to 'worktree add' in this test (then t7410), but the test
descriptions were not updated, keeping 'checkout' instead of using the new
terminology (linked worktrees).
Also, in the test each worktree is created in
$TRASH_DIRECTORY/<leading-directory>/main, where the name of <leading-directory>
carries some information about what behavior each test verifies. This directory
structure is not mandatory for the tests; the worktrees can live next to one
another in the trash directory.
Clarify the tests by using the right terminology, and remove the unnecessary
leading directories such that all superproject worktrees are directly next to one
another in the trash directory.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The subshells used in the setup phase of this test are unnecessary.
Remove them by using 'git -C' and 'test_commit -C'.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test was added in df56607dff (git-common-dir: make "modules/"
per-working-directory directory, 2014-11-30), back when the 'git worktree' command
did not exist and 'git checkout --to' was used to create supplementary worktrees.
Since this file contains tests for the interaction of 'git worktree' with
submodules, rename it to t2405-worktree-submodule.sh, following the naming scheme for
tests checking the behavior of various commands with submodules.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users in a wide variety of situations find themselves with HTTP push
problems. Oftentimes these issues are due to antivirus software,
filtering proxies, or other man-in-the-middle situations; other times,
they are due to simple unreliability of the network.
However, a common solution to HTTP push problems found online is to
increase http.postBuffer. This works for none of the aforementioned
situations and is only useful in a small, highly restricted number of
cases: essentially, when the connection does not properly support
HTTP/1.1.
Document when raising this value is appropriate and what it actually
does, and discourage people from using it as a general solution for push
problems, since it is not effective there.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is quite common for users to want to ignore the changes to a file
that Git tracks. Common scenarios for this case are IDE settings and
configuration files, which should generally not be tracked and possibly
generated from tracked files using a templating mechanism.
However, users learn about the assume-unchanged and skip-worktree bits
and try to use them to do this anyway. This is problematic, because
when these bits are set, many operations behave as the user expects, but
they usually do not help when git checkout needs to replace a file.
There is no sensible behavior in this case, because sometimes the data
is precious, such as certain configuration files, and sometimes it is
irrelevant data that the user would be happy to discard.
Since this is not a supported configuration and users are prone to
misuse the existing features for unintended purposes, causing general
sadness and confusion, let's document the existing behavior and the
pitfalls in the documentation for git update-index so that users know
they should explore alternate solutions.
In addition, let's provide a recommended solution to dealing with the
common case of configuration files, since there are well-known
approaches used successfully in many environments.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's a frequent misconception that the user.name variable controls
authentication in some way, and as a result, beginning users frequently
attempt to change it when they're having authentication troubles.
Document that the convention is that this variable represents some form
of a human's personal name, although that is not required. In addition,
address concerns about whether Unicode is supported.
Use the term "personal name" as this is likely to draw the intended
contrast, be applicable across cultures which may have different naming
conventions, and be easily understandable to people who do not speak
English as their first language. Indicate that "some form" is
conventionally used, as people may use a nickname or preferred name
instead of a full legal name.
Point users who may be confused about authentication to an appropriate
configuration option instead. Provide a shortened form of this
information in the configuration option description.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the section on setting author and committer information, we omit the
author.* and committer.* variables, so mention them for completeness.
In addition, guide users to the typical case: simply setting user.name
and user.email, which are recommended if one does not need complex
configuration.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While at one time it made perfect sense to store information about
configuring author and committer information in the documentation for
git commit-tree, in modern Git that operation is seldom used. Most
users will use git commit and expect to find comprehensive documentation
about its use in the manual page for that command.
Considering that there is significant confusion about how one is to use
the user.name and user.email variables, let's put as much documentation
as possible into an obvious place where users will be more likely to
find it.
In addition, expand the environment variables section to describe their
use more fully. Even though we now describe all of the options there
and in the configuration settings documentation, preserve the existing
text in git-commit.txt so that people can easily reason about the
ordering of the various options they can use. Explain the use of the
author.* and committer.* options as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--bool` option to `git-config` is marked as historical, and users are
recommended to use `--type=bool` instead. This commit replaces all occurrences
of `--bool` in the templates.
Also note that, no other deprecated type options are found, including `--int`,
`--bool-or-int`, `--path`, or `--expiry-date`.
Signed-off-by: Lucius Hu <orctarorga@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Take advantage of helper function 'test_path_is_file()' to
replace 'test -f' since the function makes the code more
readable and gives better error messages.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tests in `t6025-merge-symlinks.sh` were written a long time ago, and
has a lot of style violations, including the mixed-use of tabs and spaces,
missing indentations, and other shell script style violations. Update it to
match the CodingGuidelines.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch continues the effort that is already applied to
`git commit`, `git reset`, `git checkout` etc.
1) Changed outdated descriptions to mention pathspec instead.
2) Added reference to 'linkgit:gitglossary[7]'.
3) Removed content that merely repeated gitglossary.
4) Merged the remainder of "discussion" into `<patchspec>`.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In many languages, the adverb with the root "actual" means "at the
present time." However, this usage is considered dated or even archaic
in English, and for referring to events occurring at the present time,
we usually prefer "currently" or "presently". "Actually" is commonly
used in modern English only for the meaning of "in fact" or to express a
contrast with what is expected.
Since the documentation refers to the available options at the present
time (that is, at the time of writing) instead of drawing a contrast,
let's switch to "currently," which both is commonly used and sounds less
formal than "presently."
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To prevent long blocking time during a 'git fetch' call, a user
may want to set up a schedule for background 'git fetch' processes.
However, these runs will update the refs/remotes branches due to
the default refspec set in the config when Git adds a remote.
Hence the user will not notice when remote refs are updated during
their foreground fetches. In fact, they may _want_ those refs to
stay put so they can work with the refs from their last foreground
fetch call.
This can be accomplished by overriding the configured refspec using
'--refmap=' along with a custom refspec:
git fetch --refmap='' <remote> +refs/heads/*:refs/hidden/<remote>/*
to populate a custom ref space and download a pack of the new
reachable objects. This kind of call allows a few things to happen:
1. We download a new pack if refs have updated.
2. Since the refs/hidden branches exist, GC will not remove the
newly-downloaded data.
3. With fetch.writeCommitGraph enabled, the refs/hidden refs are
used to update the commit-graph file.
To avoid the refs/hidden directory from filling without bound, the
--prune option can be included. When providing a refspec like this,
the --prune option does not delete remote refs and instead only
deletes refs in the target refspace.
Update the documentation to clarify how '--refmap=""' works and
create tests to guarantee this behavior remains in the future.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t3404.3 is a simple test added by commit d078c39106 ("t3404: todo list
with commented-out commands only aborts", 2018-08-10) which was designed
to test a todo list that only contained commented-out commands. There
were two problems with this test: (1) its title did not reflect the
purpose of the test, and (2) it tested the desired behavior through a
side-effect of other functionality instead of directly testing the
desired behavior discussed in the commit message.
Modify the test to directly test the desired behavior and update the
test title.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b00bf1c9a8 ("git-rebase: make --allow-empty-message the
default", 2018-06-27) made --allow-empty-message the default and thus
turned --allow-empty-message into a no-op but did not update the
documentation to reflect this. Update the documentation now, and hide
the option from the normal -h output since it is not useful.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In builtin/grep.c:add_work() we pre-load the userdiff drivers before
adding the grep_source in the todo list. This operation is currently
being performed after acquiring the grep_mutex, but as it's already
thread-safe, we don't need to protect it here. So let's move it out of
the critical section which should avoid thread contention and improve
performance.
Running[1] `git grep --threads=8 abcd[02] HEAD` on chromium's
repository[2], I got the following mean times for 30 executions after 2
warmups:
Original | 6.2886s
-------------------------|-----------
Out of critical section | 5.7852s
[1]: Tests performed on an i7-7700HQ with 16GB of RAM and SSD, running
Manjaro Linux.
[2]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
04-06-2019), after a 'git gc' execution.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
They were disabled at 53b8d93 ("grep: disable threading in non-worktree
case", 12-12-2011), due to observable performance drops (to the point
that using a single thread would be faster than multiple threads). But
now that zlib inflation can be performed in parallel we can regain the
speedup, so let's re-enable threads in non-worktree grep.
Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") at chromium's repository[1] I got:
Threads | Regex 1 | Regex 2
---------|------------|-----------
1 | 17.2920s | 20.9624s
2 | 9.6512s | 11.3184s
4 | 6.7723s | 7.6268s
8** | 6.2886s | 6.9843s
These are all means of 30 executions after 2 warmup runs. All tests were
executed on an i7-7700HQ (quad-core w/ hyper-threading), 16GB of RAM and
SSD, running Manjaro Linux. But to make sure the optimization also
performs well on HDD, the tests were repeated on another machine with an
i5-4210U (dual-core w/ hyper-threading), 8GB of RAM and HDD (SATA III,
5400 rpm), also running Manjaro Linux:
Threads | Regex 1 | Regex 2
---------|------------|-----------
1 | 18.4035s | 22.5368s
2 | 12.5063s | 14.6409s
4** | 10.9136s | 12.7106s
** Note that in these cases we relied on hyper-threading, and that's
probably why we don't see a big difference in time.
Unfortunately, multithreaded git-grep might be slow in the non-worktree
case when --textconv is used and there're too many text conversions.
Probably the reason for this is that the object read lock is used to
protect fill_textconv() and therefore there is a mutual exclusion
between textconv execution and object reading. Because both are
time-consuming operations, not being able to perform them in parallel
can cause performance drops. To inform the users about this (and other
threading details), let's also add a "NOTES ON THREADS" section to
Documentation/git-grep.txt.
[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
04-06-2019), after a 'git gc' execution.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some fields in struct raw_object_store are lazy initialized by the
thread-unsafe packfile.c:prepare_packed_git(). Although this function is
present in the call stack of git-grep threads, all paths to it are
currently protected by obj_read_lock() (and the main thread usually
indirectly calls it before firing the worker threads, anyway). However,
it's possible that future modifications add new unprotected paths to it,
introducing a race condition. Because errors derived from it wouldn't
happen often, it could be hard to detect. So to prevent future
headaches, let's force eager initialization of packed_git when setting
git-grep up. There'll be a small overhead in the cases where we didn't
really need to prepare packed_git during execution but this shouldn't be
very noticeable.
Also, packed_git may be re-initialized by
packfile.c:reprepare_packed_git(). Again, all paths to it in git-grep
are already protected by obj_read_lock() but it may suffer from the same
problem in the future. So let's also internally protect it with
obj_read_lock() (which is a recursive mutex).
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that object reading operations are internally protected, the
submodule initialization functions at builtin/grep.c:grep_submodule()
are very close to being thread-safe. Let's take a look at each call and
remove from the critical section what we can, for better performance:
- submodule_from_path() and is_submodule_active() cannot be called in
parallel yet only because they call repo_read_gitmodules() which
contains, in its call stack, operations that would otherwise be in
race condition with object reading (for example parse_object() and
is_promisor_remote()). However, they only call repo_read_gitmodules()
if it wasn't read before. So let's pre-read it before firing the
threads and allow these two functions to safely be called in
parallel.
- repo_submodule_init() is already thread-safe, so remove it from the
critical section without other necessary changes.
- The repo_read_gitmodules(&subrepo) call at grep_submodule() is safe as
no other thread is performing object reading operations in the subrepo
yet. However, threads might be working in the superproject, and this
function calls add_to_alternates_memory() internally, which is racy
with object readings in the superproject. So it must be kept
protected for now. Let's add a "NEEDSWORK" to it, informing why it
cannot be removed from the critical section yet.
- Finally, add_to_alternates_memory() must be kept protected for the
same reason as the item above.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, submodule-config.c doesn't have an externally accessible
function to read gitmodules only if it wasn't already read. But this
exact behavior is internally implemented by gitmodules_read_check(), to
perform a lazy load. Let's merge this function with
repo_read_gitmodules() adding a 'skip_if_read' which allows both
internal and external callers to access this functionality. This
simplifies a little the code. The added option will also be used in
the following patch.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-grep uses 'grep_read_mutex' to protect its calls to object reading
operations. But these have their own internal lock now, which ensures a
better performance (allowing parallel access to more regions). So, let's
remove the former and, instead, activate the latter with
enable_obj_read_lock().
Sections that are currently protected by 'grep_read_mutex' but are not
internally protected by the object reading lock should be surrounded by
obj_read_lock() and obj_read_unlock(). These guarantee mutual exclusion
with object reading operations, keeping the current behavior and
avoiding race conditions. Namely, these places are:
In grep.c:
- fill_textconv() at fill_textconv_grep().
- userdiff_get_textconv() at grep_source_1().
In builtin/grep.c:
- parse_object_or_die() and the submodule functions at
grep_submodule().
- deref_tag() and gitmodules_config_oid() at grep_objects().
If these functions become thread-safe, in the future, we might remove
the locking and probably get some speedup.
Note that some of the submodule functions will already be thread-safe
(or close to being thread-safe) with the internal object reading lock.
However, as some of them will require additional modifications to be
removed from the critical section, this will be done in its own patch.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow object reading to be performed by multiple threads protecting it
with an internal lock, the obj_read_mutex. The lock usage can be toggled
with enable_obj_read_lock() and disable_obj_read_lock(). Currently, the
functions which can be safely called in parallel are:
read_object_file_extended(), repo_read_object_file(),
read_object_file(), read_object_with_reference(), read_object(),
oid_object_info() and oid_object_info_extended(). It's also possible
to use obj_read_lock() and obj_read_unlock() to protect other sections
that cannot execute in parallel with object reading.
Probably there are many spots in the functions listed above that could
be executed unlocked (and thus, in parallel). But, for now, we are most
interested in allowing parallel access to zlib inflation. This is one of
the sections where object reading spends most of the time in (e.g. up to
one-third of git-grep's execution time in the chromium repo corresponds
to inflation) and it's already thread-safe. So, to take advantage of
that, the obj_read_mutex is released when calling git_inflate() and
re-acquired right after, for every calling spot in
oid_object_info_extended()'s call chain. We may refine this lock to also
exploit other possible parallel spots in the future, but for now,
threaded zlib inflation should already give great speedups for threaded
object reading callers.
Note that add_delta_base_cache() was also modified to skip adding
already present entries to the cache. This wasn't possible before, but
it would be now, with the parallel inflation. Take for example the
following situation, where two threads - A and B - are executing the
code at unpack_entry():
1. Thread A is performing the decompression of a base O (which is not
yet in the cache) at PHASE II. Thread B is simultaneously trying to
unpack O, but just starting at PHASE I.
2. Since O is not yet in the cache, B will go to PHASE II to also
perform the decompression.
3. When they finish decompressing, one of them will get the object
reading mutex and go to PHASE III while the other waits for the
mutex. Let’s say A got the mutex first.
4. Thread A will add O to the cache, go throughout the rest of PHASE III
and return.
5. Thread B gets the mutex, also add O to the cache (if the check wasn't
there) and returns.
Finally, it is also important to highlight that the object reading lock
can only ensure thread-safety in the mentioned functions thanks to two
complementary mechanisms: the use of 'struct raw_object_store's
replace_mutex, which guards sections in the object reading machinery
that would otherwise be thread-unsafe; and the 'struct pack_window's
inuse_cnt, which protects window reading operations (such as the one
performed during the inflation of a packed object), allowing them to
execute without the acquisition of the obj_read_mutex.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
replace-object functions are very close to being thread-safe: the only
current racy section is the lazy initialization at
prepare_replace_object(). The following patches will protect some object
reading operations to be called threaded, but before that, replace
functions must be protected. To do so, add a mutex to struct
raw_object_store and acquire it before lazy initializing the
replace_map. This won't cause any noticeable performance drop as the
mutex will no longer be used after the replace_map is initialized.
Later, when the replace functions are called in parallel, thread
debuggers might point our use of the added replace_map_initialized flag
as a data race. However, as this boolean variable is initialized as
false and it's only updated once, there's no real harm. It's perfectly
fine if the value is updated right after a thread read it in
replace-map.h:lookup_replace_object() (there'll only be a performance
penalty for the affected threads at that moment). We could cease the
debugger warning protecting the variable reading at the said function.
However, this would negatively affect performance for all threads
calling it, at any time, so it's not really worthy since the warning
doesn't represent a real problem. Instead, to make sure we don't get
false positives (at ThreadSanitizer, at least) an entry for the
respective function is added to .tsan-suppressions.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
deref_tag() calls is_promisor_object() and parse_object(), both of which
perform lazy initializations and other thread-unsafe operations. If it
was only called by grep_objects() this wouldn't be a problem as the
latter is only executed by the main thread. However, deref_tag() is also
present in read_object_file()'s call stack. So calling deref_tag() in
grep_objects() without acquiring the grep_read_mutex may incur in a race
condition with object reading operations (such as the ones internally
performed by fill_textconv(), called at fill_textconv_grep()). The same
problem happens with the call to gitmodules_config_oid() which also has
parse_object() in its call stack. Fix that protecting both calls with
the said grep_read_mutex.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There're currently two function calls in builtin/grep.c:grep_submodule()
which might result in race conditions:
- submodule_from_path(): it has config_with_options() in its call stack
which, in turn, may have read_object_file() in its own. Therefore,
calling the first function without acquiring grep_read_mutex may end
up causing a race condition with other object read operations
performed by worker threads (for example, at the fill_textconv()
call in grep.c:fill_textconv_grep()).
- parse_object_or_die(): it falls into the same problem, having
repo_has_object_file(the_repository, ...) in its call stack. Besides
that, parse_object(), which is also called by parse_object_or_die(),
is thread-unsafe and also called by object reading functions.
It's unlikely to really fall into a data race with these operations as
the volume of calls to them is usually very low. But we better protect
ourselves against this possibility, anyway. So, to solve these issues,
move both of these function calls into the critical section of
grep_read_mutex.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-grep uses an internal grep_read_mutex to protect object reading
operations. Similarly, there's a grep_attr_mutex to protect calls to the
gitattributes machinery. However, two of the three functions protected
by the last mutex may also perform object reading, as seen below:
- userdiff_get_textconv() > notes_cache_init() >
notes_cache_match_validity() > lookup_commit_reference_gently() >
parse_object() > repo_has_object_file() >
repo_has_object_file_with_flags() > oid_object_info_extended()
- userdiff_find_by_path() > git_check_attr() > collect_some_attrs() >
prepare_attr_stack() > read_attr() > read_attr_from_index() >
read_blob_data_from_index() > read_object_file()
As these calls are not protected by grep_read_mutex, there might be race
conditions with other threads performing object reading (e.g. threads
calling fill_textconv() at grep.c:fill_textconv_grep()). To prevent
that, let's make sure to acquire the lock before both of these calls.
Note: this patch might slow down the threaded grep in worktree, for the
sake of thread-safeness. However, in the following patches, we should
regain performance by replacing grep_read_mutex for an internal object
reading lock and allowing parallel inflation during object reading.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cases when a submodule fetch fails when there are many submodules, the error
from the lone failing submodule fetch is buried under activity on the other
submodules if more than one fetch fell back on fetch-by-oid. Call out a failure
late so the user is aware that something went wrong, and where.
Because fetch_finish() is only called synchronously by
run_processes_parallel, mutexing is not required around
submodules_with_errors.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A test in t7800 tries to make sure that when git-difftool runs an
external tool that fails, it stops looking at files. Our fake failing
tool prints the file name it was asked to diff before exiting non-zero,
and then we confirm the output contains only that file.
However, this subtly relies on our internal reuse_worktree_file().
Because we're diffing between branches, the command run by difftool
might see:
- the git-stored filename (e.g., "file"), if we decided that the
working tree contents were up-to-date with the object in the index
and HEAD, and we could reuse them
- a temporary filename (e.g. "/tmp/abc123_file") if we had to dump the
contents from the object database
If the latter case happens, then the test fails, because it's expecting
the string "file". I discovered this when debugging something unrelated
with reuse_worktree_file(). I _thought_ it should be able to be
triggered by a racy-git situation, but running:
./t7800-difftool.sh --stress --run=2,13
never seems to fail. However, by my reading of reuse_worktree_file(),
this would probably always fail under Cygwin, because it sets
NO_FAST_WORKING_DIRECTORY. At any rate, since reuse_worktree_file()
is meant to be an optimization that may or may not trigger, our test
should be robust either way.
Instead of checking the filename, let's just make sure we got a single
line of output (which would not be true if we continued after the first
failure).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We run a series of hunk-header tests in a loop, and each one does this:
test_when_finished 'cat actual' && # for debugging only
This is pretty pointless. When the test succeeds, we waste time running
a useless cat process. If you're debugging a failure with "-i", then we
won't run the when-finished part at all. So it helps only if you're
running with something like "--verbose-log".
Since we expect the tests to succeed most of the time, a better way to
do this would be a helper that checks the output and dumps "actual" only
when it fails. But it's probably not even worth the effort, as anyone
debugging a failure could just run with "-i" and investigate the
"actual" file themselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent versions of the gcc and clang Address Sanitizer produce test
failures related to regexec(). This triggers with gcc-10 and clang-8
(but not gcc-9 nor clang-7). Running:
make CC=gcc-10 SANITIZE=address test
results in failures in t4018, t3206, and t4062.
The cause seems to be that when built with ASan, we use a different
version of regexec() than normal. And this version doesn't understand
the REG_STARTEND flag. Here's my evidence supporting that.
The failure in t4062 is an ASan warning:
expecting success of 4062.2 '-G matches':
git diff --name-only -G "^(0{64}){64}$" HEAD^ >out &&
test 4096-zeroes.txt = "$(cat out)"
=================================================================
==672994==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7fa76f672000 at pc 0x7fa7726f75b6 bp 0x7ffe41bdda70 sp 0x7ffe41bdd220
READ of size 4097 at 0x7fa76f672000 thread T0
#0 0x7fa7726f75b5 (/lib/x86_64-linux-gnu/libasan.so.6+0x4f5b5)
#1 0x562ae0c9c40e in regexec_buf /home/peff/compile/git/git-compat-util.h:1117
#2 0x562ae0c9c40e in diff_grep /home/peff/compile/git/diffcore-pickaxe.c:52
#3 0x562ae0c9cc28 in pickaxe_match /home/peff/compile/git/diffcore-pickaxe.c:166
[...]
In this case we're looking in a buffer which was mmap'd via
reuse_worktree_file(), and whose size is 4096 bytes. But libasan's
regex tries to look at byte 4097 anyway! If we tweak Git like this:
diff --git a/diff.c b/diff.c
index 8e2914c031..cfae60c120 100644
--- a/diff.c
+++ b/diff.c
@@ -3880,7 +3880,7 @@ static int reuse_worktree_file(struct index_state *istate,
*/
if (ce_uptodate(ce) ||
(!lstat(name, &st) && !ie_match_stat(istate, ce, &st, 0)))
- return 1;
+ return 0;
return 0;
}
to use a regular buffer (with a trailing NUL) instead of an mmap, then
the complaint goes away.
The other failures are actually diff output with an incorrect funcname
header. If I instrument xdiff to show the funcname matching like so:
diff --git a/xdiff-interface.c b/xdiff-interface.c
index 8509f9ea22..f6c3dc1986 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -197,6 +197,7 @@ struct ff_regs {
struct ff_reg {
regex_t re;
int negate;
+ char *printable;
} *array;
};
@@ -218,7 +219,12 @@ static long ff_regexp(const char *line, long len,
for (i = 0; i < regs->nr; i++) {
struct ff_reg *reg = regs->array + i;
- if (!regexec_buf(®->re, line, len, 2, pmatch, 0)) {
+ int ret = regexec_buf(®->re, line, len, 2, pmatch, 0);
+ warning("regexec %s:\n regex: %s\n buf: %.*s",
+ ret == 0 ? "matched" : "did not match",
+ reg->printable,
+ (int)len, line);
+ if (!ret) {
if (reg->negate)
return -1;
break;
@@ -264,6 +270,7 @@ void xdiff_set_find_func(xdemitconf_t *xecfg, const char *value, int cflags)
expression = value;
if (regcomp(®->re, expression, cflags))
die("Invalid regexp to look for hunk header: %s", expression);
+ reg->printable = xstrdup(expression);
free(buffer);
value = ep + 1;
}
then when compiling with ASan and gcc-10, running the diff from t4018.66
produces this:
$ git diff -U1 cpp-skip-access-specifiers
warning: regexec did not match:
regex: ^[ ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*])
buf: private:
warning: regexec matched:
regex: ^((::[[:space:]]*)?[A-Za-z_].*)$
buf: private:
diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers
index 4d4a9db..ebd6f42 100644
--- a/cpp-skip-access-specifiers
+++ b/cpp-skip-access-specifiers
@@ -6,3 +6,3 @@ private:
void DoSomething();
int ChangeMe;
};
void DoSomething();
- int ChangeMe;
+ int IWasChanged;
};
That first regex should match (and is negated, so it should be telling
us _not_ to match "private:"). But it wouldn't if regexec() is looking
at the whole buffer, and not just the length-limited line we've fed to
regexec_buf(). So this is consistent again with REG_STARTEND being
ignored.
The correct output (compiling without ASan, or gcc-9 with Asan) looks
like this:
warning: regexec matched:
regex: ^[ ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*])
buf: private:
[...more lines that we end up not using...]
warning: regexec matched:
regex: ^((::[[:space:]]*)?[A-Za-z_].*)$
buf: class RIGHT : public Baseclass
diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers
index 4d4a9db..ebd6f42 100644
--- a/cpp-skip-access-specifiers
+++ b/cpp-skip-access-specifiers
@@ -6,3 +6,3 @@ class RIGHT : public Baseclass
void DoSomething();
- int ChangeMe;
+ int IWasChanged;
};
So it really does seem like libasan's regex engine is ignoring
REG_STARTEND. We should be able to work around it by compiling with
NO_REGEX, which would use our local regexec(). But to make matters even
more interesting, this isn't enough by itself.
Because ASan has support from the compiler, it doesn't seem to intercept
our call to regexec() at the dynamic library level. It actually
recognizes when we are compiling a call to regexec() and replaces it
with ASan-specific code at that point. And unlike most of our other
compat code, where we might have git_mmap() or similar, the actual
symbol name in the compiled compat/regex code is regexec(). So just
compiling with NO_REGEX isn't enough; we still end up in libasan!
We can work around that by having the preprocessor replace regexec with
git_regexec (both in the callers and in the actual implementation), and
we truly end up with a call to our custom regex code, even when
compiling with ASan. That's probably a good thing to do anyway, as it
means anybody looking at the symbols later (e.g., in a debugger) would
have a better indication of which function is which. So we'll do the
same for the other common regex functions (even though just regexec() is
enough to fix this ASan problem).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interactive `add` command allows selecting multiple files for some
of its sub-commands, via unique prefixes, indices or index ranges.
When re-implementing `git add -i` in C, we even added a code comment
talking about ranges with a missing end index, such as `2-`, but the
code did not actually accept those, as pointed out in
https://github.com/git-for-windows/git/issues/2466#issuecomment-574142760.
Let's fix this, and add a test case to verify that this stays fixed
forever.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the user does not select any files to `patch` or `diff`, there is
no need to call `run_add_p()` on them.
Even worse: we _have_ to avoid calling `parse_pathspec()` with an empty
list because that would trigger this error:
BUG: pathspec.c:557: PATHSPEC_PREFER_CWD requires arguments
So let's avoid doing any work on a list of files that is empty anyway.
This fixes https://github.com/git-for-windows/git/issues/2466.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 777b420347 (dir: synchronize treat_leading_path() and
read_directory_recursive(), 2019-12-19) tried to add two warning
comments in those functions, pointing at each other. But the one in
treat_leading_path() just points at itself.
Let's fix that. Since the comment also redirects the reader for more
details to "the commit that added this warning", and since we're now
modifying the warning (creating a new commit without those details),
let's mention the actual commit id.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Restructure the code slightly to avoid passing around a struct dirent
anywhere, which also enables us to avoid trying to manufacture one.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I was going to title this "dir: more synchronizing of
treat_leading_path() and read_directory_recursive()", a nod to commit
777b420347 ("dir: synchronize treat_leading_path() and
read_directory_recursive()", 2019-12-19), but the title was too long.
Anyway, first the backstory...
fill_directory() has always had a slightly error-prone interface: it
returns a subset of paths which *might* match the specified pathspec; it
was intended to prune away some paths which didn't match the specified
pathspec and keep at least all the ones that did match it. Given this
interface, callers were responsible to post-process the results and
check whether each actually matched the pathspec.
builtin/clean.c did this. It would first prune out duplicates (e.g. if
"dir" was returned as well as all files under "dir/", then it would
simplify this to just "dir"), and after pruning duplicates it would
compare the remaining paths to the specified pathspec(s). This
post-processing itself could run into problems, though, as noted in
commit 404ebceda0 ("dir: also check directories for matching
pathspecs", 2019-09-17):
For the case of git-clean and a set of pathspecs of "dir/file" and
"more", this caused a problem because we'd end up with dir entries
for both of
"dir"
"dir/file"
Then correct_untracked_entries() would try to helpfully prune
duplicates for us by removing "dir/file" since it's under "dir",
leaving us with
"dir"
Since the original pathspec only had "dir/file", the only entry left
doesn't match and leaves nothing to be removed. (Note that if only
one pathspec was specified, e.g. only "dir/file", then the
common_prefix_len optimizations in fill_directory would cause us to
bypass this problem, making it appear in simple tests that we could
correctly remove manually specified pathspecs.)
That commit fixed the issue -- when multiple pathspecs were specified --
by making sure fill_directory() wouldn't return both "dir" and
"dir/file" outside the common_prefix_len optimization path. This is
where it starts to get fun.
In commit b9670c1f5e ("dir: fix checks on common prefix directory",
2019-12-19), we noticed that the common_prefix_len wasn't doing
appropriate checks and letting all kinds of stuff through, resulting in
recursing into .git/ directories and other craziness. So it started
locking down and doing checks on pathnames within that code path. That
continued with commit 777b420347 ("dir: synchronize
treat_leading_path() and read_directory_recursive()", 2019-12-19), which
noted the following:
Our optimization to avoid calling into read_directory_recursive()
when all pathspecs have a common leading directory mean that we need
to match the logic that read_directory_recursive() would use if we
had just called it from the root. Since it does more than call
treat_path() we need to copy that same logic.
...and then it more forcefully addressed the issue with this wonderfully
ironic statement:
Needing to duplicate logic like this means it is guaranteed someone
will eventually need to make further changes and forget to update
both locations. It is tempting to just nuke the leading_directory
special casing to avoid such bugs and simplify the code, but
unpack_trees' verify_clean_subdirectory() also calls
read_directory() and does so with a non-empty leading path, so I'm
hesitant to try to restructure further. Add obnoxious warnings to
treat_leading_path() and read_directory_recursive() to try to warn
people of such problems.
You would think that with such a strongly worded description, that its
author would have actually ensured that the logic in
treat_leading_path() and read_directory_recursive() did actually match
and that *everything* that was needed had at least been copied over at
the time that this paragraph was written. But you'd be wrong, I messed
it up by missing part of the logic.
Copy the missing bits to fix the new final test in t7300.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
b9670c1f5e (dir: fix checks on common prefix directory, 2019-12-19)
modified the way pathspecs are handled when handling a directory
during "git clean -f <path>". While this improved the behavior for
known test breakages, it also regressed in how the clean command
handles cleaning a specified file.
Add a test case that demonstrates this behavior. This test passes
before b9670c1f5e then fails after.
Helped-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the upgrade, the library names changed from libeay32/ssleay32 to
libcrypto/libssl.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To make our values hash independent, we turn the directory of the object
into "Y" and the file name into "Z" after having sorted items by their
name. However, when using SHA-256, one of our file names begins with an
"a" character, which means it sorts into the wrong place in the list,
causing the test to fail.
Since we don't care about the order of these items, just sort them after
stripping actual hash contents, which means they'll work with any hash
algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test performs a clone from outside any repository. Consequently,
the hash algorithm used defaults to SHA-1. When the test is running with
SHA-256, this results in an object ID that is not usable by the rest of
the test. In order to ensure that we provide a usable value, switch into
the source repository before hashing.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test uses $_z40 to express an all-zeros object ID, which doesn't
work for SHA-256. Use $ZERO_OID instead, which is the right size for
all hash values.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use regex values based on $OID_REGEX instead of hard-coding them based
on expected object ID lengths.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test modifies a pkt-line stream with sed to change a line with
"shallow" to "unshallow". However, this modification is dependent on
the size of the hash in use; with SHA-256, the pkt-line length is
different, leading to the sed command having no effect.
Use test_oid_cache to specify the correct values for each hash so that
the test continues to work.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compute the various pkt-line values based on the length of the object
IDs in use.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use $OID_REGEX instead of hard-coding 40-based regular expressions.
Change invocations of cut with a hard-coded constant to split using a
delimiter instead.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding invalid object IDs in this test, use test_oid to
look up ones of the appropriate length.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are some offsets in the commit graph files used to corrupt data.
Compute these offsets for both SHA-1 and SHA-256 so that the test works
with either.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test corrupts various locations in a multi-pack index to test
various error responses. However, these offsets differ between SHA-1
indexes and SHA-256 indexes due to differences in object length. Use
test_oid to look up the correct offsets based on the algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using SHA-1, the existing value of the byte we use is 0x13, so
writing the byte 0x07 serves to corrupt the test and verify that we
detect corruption. However, when we use SHA-256, the value at that
offset is already 0x07, so our "corruption" doesn't work and the test
fails to detect it.
To provide a value that is truly out of range, let's use 0xff, which is
not likely to be a valid value as the high byte of a two-byte offset in
a multi-pack index this small.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running with SHA-256 as the hash algorithm, the hash version octet
is 2 instead of 1. Pick the right value depending on the hash algorithm
and use it where we look for the existing value. To ensure the test
checking for invalid data passes, use 3 as the test value for an invalid
hash version.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes values for object IDs instead of
using hard-coded hashes. Move the heredocs later in the tests so we can
take advantage of computed values.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes values for object IDs instead of
using hard-coded hashes. Additionally, update the sanitize_output
function to sanitize the index lines in diff output, since it's clear
from the assertions in question that we are not interested in the
specific object IDs.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding a fixed length example object ID in the test,
look one up using the translation tables.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a specific invalid hard-coded object ID, generate one
of the appropriate length by looking one up in the translation tables.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the object ID used in the index line will differ between different
algorithms, compute these values instead of hard-coding them.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding the length of an object ID when creating a tree,
generate it based on $ZERO_OID.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Complete paths after 'git worktree add <TAB>' and refs after 'git
worktree add -b <TAB>' and 'git worktree add some/dir <TAB>'.
Uncharacteristically for a Git command, 'git worktree add' takes a
mandatory path parameter before a commit-ish as its optional last
parameter. In addition, it has both standalone --options and options
with a mandatory unstuck parameter ('-b <new-branch>'). Consequently,
trying to complete refs for that last optional commit-ish parameter
resulted in a more convoluted than usual completion function, but
hopefully all the included comments will make it not too hard to
digest.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Complete the paths of existing working trees for 'git worktree's
'move', 'remove', 'lock', and 'unlock' subcommands.
Note that 'git worktree list --porcelain' shows absolute paths, so for
simplicity's sake we'll complete full absolute paths as well (as
opposed to turning them into relative paths by finding common leading
directories between $PWD and the working tree's path and removing
them, risking trouble with symbolic links or Windows drive letters; or
completing them one path component at a time).
Never list the path of the main working tree, as it cannot be moved,
removed, locked, or unlocked.
Ideally we would only list unlocked working trees for the 'move',
'remove', and 'lock' subcommands, and only locked ones for 'unlock'.
Alas, 'git worktree list --porcelain' doesn't indicate which working
trees are locked, so for now we'll complete the paths of all existing
working trees.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The completion function for 'git worktree' uses separate but very
similar case arms to complete --options for each subcommand.
Combine these into a single case arm to avoid repetition.
Note that after this change we won't complete 'git worktree remove's
'--force' option, but that is consistent with our general stance on
not offering '--force', as it should be used with care.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the __git_find_on_cmdline() helper function so far we've
only been interested in which one of a set of words appear on the
command line. To complete options for some of 'git worktree's
subcommands in the following patches we'll need not only that, but the
index of that word on the command line as well.
Extend __git_find_on_cmdline() to optionally show the index of the
found word on the command line (IOW in the $words array) when the
'--show-idx' option is given.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The __git_find_on_cmdline() helper function started its life as
__git_find_subcommand() [1], but it served a more general purpose than
looking for subcommands, so later it was renamed accordingly [2].
However, that rename didn't touch the body of the function, and left
the $subcommand local variable behind, still reminiscent of the
function's original purpose.
Let's clean up the names of __git_find_on_cmdline()'s local variables
and get rid of that $subcommand variable name.
While at it, add a short comment describing the function's purpose.
[1] 3ff1320d4b (bash: refactor searching for subcommands on the
command line, 2008-03-10),
[2] 918c03c2a7 (bash: rename __git_find_subcommand() to
__git_find_on_cmdline(), 2009-09-15)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following two patches will refactor and extend the
__git_find_on_cmdline() helper function, so let's add a few tests
first to make sure that its basic behavior doesn't change.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, signature verification for merge and pull operations checked
if the key had a trust-level of either TRUST_NEVER or TRUST_UNDEFINED in
verify_merge_signature(). If that was the case, the process die()d.
The other code paths that did signature verification relied entirely on
the return code from check_commit_signature(). And signatures made with
a good key, irregardless of its trust level, was considered valid by
check_commit_signature().
This difference in behavior might induce users to erroneously assume
that the trust level of a key in their keyring is always considered by
Git, even for operations where it is not (e.g. during a verify-commit or
verify-tag).
The way it worked was by gpg-interface.c storing the result from the
key/signature status *and* the lowest-two trust levels in the `result`
member of the signature_check structure (the last of these status lines
that were encountered got written to `result`). These are documented in
GPG under the subsection `General status codes` and `Key related`,
respectively [1].
The GPG documentation says the following on the TRUST_ status codes [1]:
"""
These are several similar status codes:
- TRUST_UNDEFINED <error_token>
- TRUST_NEVER <error_token>
- TRUST_MARGINAL [0 [<validation_model>]]
- TRUST_FULLY [0 [<validation_model>]]
- TRUST_ULTIMATE [0 [<validation_model>]]
For good signatures one of these status lines are emitted to
indicate the validity of the key used to create the signature.
The error token values are currently only emitted by gpgsm.
"""
My interpretation is that the trust level is conceptionally different
from the validity of the key and/or signature. That seems to also have
been the assumption of the old code in check_signature() where a result
of 'G' (as in GOODSIG) and 'U' (as in TRUST_NEVER or TRUST_UNDEFINED)
were both considered a success.
The two cases where a result of 'U' had special meaning were in
verify_merge_signature() (where this caused git to die()) and in
format_commit_one() (where it affected the output of the %G? format
specifier).
I think it makes sense to refactor the processing of TRUST_ status lines
such that users can configure a minimum trust level that is enforced
globally, rather than have individual parts of git (e.g. merge) do it
themselves (except for a grace period with backward compatibility).
I also think it makes sense to not store the trust level in the same
struct member as the key/signature status. While the presence of a
TRUST_ status code does imply that the signature is good (see the first
paragraph in the included snippet above), as far as I can tell, the
order of the status lines from GPG isn't well-defined; thus it would
seem plausible that the trust level could be overwritten with the
key/signature status if they were stored in the same member of the
signature_check structure.
This patch introduces a new configuration option: gpg.minTrustLevel. It
consolidates trust-level verification to gpg-interface.c and adds a new
`trust_level` member to the signature_check structure.
Backward-compatibility is maintained by introducing a special case in
verify_merge_signature() such that if no user-configurable
gpg.minTrustLevel is set, then the old behavior of rejecting
TRUST_UNDEFINED and TRUST_NEVER is enforced. If, on the other hand,
gpg.minTrustLevel is set, then that value overrides the old behavior.
Similarly, the %G? format specifier will continue show 'U' for
signatures made with a key that has a trust level of TRUST_UNDEFINED or
TRUST_NEVER, even though the 'U' character no longer exist in the
`result` member of the signature_check structure. A new format
specifier, %GT, is also introduced for users that want to show all
possible trust levels for a signature.
Another approach would have been to simply drop the trust-level
requirement in verify_merge_signature(). This would also have made the
behavior consistent with other parts of git that perform signature
verification. However, requiring a minimum trust level for signing keys
does seem to have a real-world use-case. For example, the build system
used by the Qubes OS project currently parses the raw output from
verify-tag in order to assert a minimum trust level for keys used to
sign git tags [2].
[1] https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=blob;f=doc/doc/DETAILS;h=bd00006e933ac56719b1edd2478ecd79273eae72;hb=refs/heads/master
[2] 9674c1991d/scripts/verify-git-tag (L43)
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Git users at $DAYJOB have been using protocol v2 as a default for
~1.5 years now and others have been also reporting good experiences
with it, so it seems like a good time to bump the default version. It
produces a significant performance improvement when fetching from
repositories with many refs, such as
https://chromium.googlesource.com/chromium/src.
This only affects the client, not the server. (The server already
defaults to supporting protocol v2.) The protocol change is backward
compatible, so this should produce no significant effect when
contacting servers that only speak protocol v0.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_TEST_PROTOCOL_VERSION environment variable can be used to
upgrade the version of Git protocol used in tests. If both
GIT_TEST_PROTOCOL_VERSION and 'protocol.version' are set, the higher
value wins.
For usage within tests, these semantics are too complex. Instead,
always use the value from protocol.version configuration when it is
set, falling back to GIT_TEST_PROTOCOL_VERSION. This way, the envvar
provides a reliable preview of what will happen if the default
protocol version is changed.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 8cbeba0632 (tests: define GIT_TEST_PROTOCOL_VERSION,
2019-02-25), it has been possible to run tests with a newer protocol
version by setting the GIT_TEST_PROTOCOL_VERSION envvar to a version
number. Tests that assume protocol v0 handle this by explicitly
setting
GIT_TEST_PROTOCOL_VERSION=
or similar constructs like 'test -z "$GIT_TEST_PROTOCOL_VERSION" ||
return 0' to declare that they only handle the default (v0) protocol.
The emphasis there is a bit off: it would be clearer to specify
GIT_TEST_PROTOCOL_VERSION=0 to inform the reader that these tests are
specifically testing and relying on details of protocol v0. Do so.
This way, a reader does not need to know what the default protocol
version is, and the tests can continue to work when the default
protocol version used by Git advances past v0.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's protocol version 2 has been working well in production for over
a year. Simplify documentation by no longer referring to it as
experimental.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git cat-file -e" uses has_object_file, which can fetch from promisor
remotes when an object is missing. These tests end up checking that
that fetch fails instead of for the object being missing.
By luck, the tests pass anyway:
- in one of these tests ("filtering by size"), the fetch fails because
(in protocol v0) the server does not support fetches by SHA-1
- in the second, the object is present but the test could pass even if
it weren't if the fetch succeeds
- in the third, the test sets extensions.partialClone to "arbitrary
string" so that when it tries to fetch, it looks up the "arbitrary
string" remote which does not exist
Use "git rev-list --objects --missing=allow-any", so that the tests
pass for the right reason.
Noticed while testing with protocol v2, which allows fetching by sha1
by default, causing the first fetch to succeed and the test to fail.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 633a53179e (fetch test: avoid use of "VAR= cmd" with a shell
function, 2019-12-26), t5552.5 (do not send "have" with ancestors of
commits that server ACKed) fails when run with
GIT_TEST_PROTOCOL_VERSION=2.
The cause:
The progression of "have"s sent in negotiation depends on whether we
are using a stateless RPC based transport or a stateful bidirectional
one (see for example 44d8dc54e7, "Fix potential local deadlock during
fetch-pack", 2011-03-29). In protocol v2, all transports are
stateless transports, while in protocol v0, transports such as local
access and ssh are stateful.
In stateful transports, the number of "have"s to send multiplies by
two each round until we reach PIPESAFE_FLUSH (that is, 32), and then
it increases by PIPESAFE_FLUSH each round. In stateless transport,
the count multiplies by two each round until we reach LARGE_FLUSH
(which is 16384) and then multiplies by 1.1 each round after that.
Moreover, in stateful transports, as fetch-pack.c explains:
We keep one window "ahead" of the other side, and will wait
for an ACK only on the next one.
This affects t5552.5 because it looks for "have"s from the negotiator
that appear in that second window. With protocol version 2, the
second window never arrives, and the test fails.
Until 633a53179e (2019-12-26), a previous test in the same file
contained
GIT_TEST_PROTOCOL_VERSION= trace_fetch client origin to_fetch
In many common shells (e.g. bash when run as "sh"), the setting of
GIT_TEST_PROTOCOL_VERSION to the empty string lasts beyond the
intended duration of the trace_fetch invocation. This causes it to
override the GIT_TEST_PROTOCOL_VERSION setting that was passed in to
the test during the remainder of the test script, so t5552.5 never got
run using protocol v2 on those shells, regardless of the
GIT_TEST_PROTOCOL_VERSION setting from the environment. 633a53179e
fixed that, revealing the failing test.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like assigning a nonempty value, assigning an empty value to a
shell variable when calling a function produces non-portable behavior:
in some shells, the assignment lasts for the duration of the function
invocation, and in others, it persists after the function returns.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like assigning a nonempty value, assigning an empty value to a
shell variable when calling a function produces non-portable behavior:
in some shells, the assignment lasts for the duration of the function
invocation, and in others, it persists after the function returns.
Use an explicit subshell with the envvar exported to make the behavior
consistent across shells and crystal clear.
All previous instances of this pattern used "VAR=value" (with nonempty
`value`), which is already diagnosed automatically by "make test-lint"
since a0a630192d (t/check-non-portable-shell: detect "FOO=bar
shell_func", 2018-07-13).
Noticed using an improved "make test-lint".
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python3 deprecates raw_input() from 2.7 and replaced it with input().
Since we do not need 2.7's input() semantics, `raw_input()` is aliased
to `input()` for easy forward compatability.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not clear why a generator was used to create the regex used to
parse git-diff-tree output; I assume an early implementation required
it, but is not part of the mainline change.
Simply use a lazily initialized global instead.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python3 uses dict.items() instead of .iteritems() to provide
iteratoration over dict. Although items() is technically less efficient
for python2.7 (allocates a new list instead of simply iterating), the
amount of data involved is very small and the penalty negligible.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For python3, reduce() has been moved to functools.reduce(). This is
also available in python2.7.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As part of its importing process, git-p4 sends a `checkpoint` followed
immediately by `progress` to fast-import to force synchronization.
Due to buffering, it is possible for the `progress` command to not be
flushed before git-p4 begins to wait for the corresponding response.
This causes the script to freeze completely, and is consistently
observable at least on python-3.6.9.
Make sure this command sequence is completely flushed before waiting.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
p4 does not appear to understand marshal format version 3 and above.
Version 2 was the latest supported by python-2.7.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Opening .gitp4-usercache.txt in text mode makes python 3 happy without
explicitly adding encoding and decoding.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
P4 allows essentially arbitrary encoding for path data while we would
perfer to be dealing only with unicode strings. Since path data need to
survive round-trip back to p4, this patch implements the general policy
that we store path data as-is, but decode them to unicode before doing
any non-trivial processing.
A new `decode_path()` method is provided that generally does the correct
conversion, taking into account `git-p4.pathEncoding` configuration.
For python2.7, path strings will be left as-is if it only contains ASCII
characters.
For python3, decoding is always done so that we have str objects.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Under python3, calls to write() on the stream to `git fast-import` must
be encoded. This patch wraps the IO object such that this encoding is
done transparently.
Conversely, any text data read from subprocesses must also be decoded
before running through the rest of the pipeline.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Reviewed-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The marshalled dict in the response given on STDOUT by p4 uses `str` for
keys and string values. When run using python3, these values are
deserialized as `bytes`, leading to a whole host of problems as the rest
of the code assumes `str` is used throughout.
This patch changes the deserialization behaviour such that, as much as
possible, text output from p4 is decoded to native unicode strings.
Exceptions are made for the field `data` as it is usually arbitrary
binary data. `depotFile[0-9]*`, `path`, and `clientFile` are also exempt
as they contain path strings not encoded with UTF-8, and must survive
round-trip back to p4.
Conversely, text data being piped to p4 must always be encoded when
running under python3.
encode_text_stream() and decode_text_stream() were added to make these
transformations more convenient.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that python2.7 is the minimum required version and we no longer use
the basestring type, it is not necessary to use type aliasing to ensure
python3 compatibility.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python 3 handles strings differently than Python 2.7. Since Python 2
is reaching it's end of life, a series of changes are being submitted to
enable python 3.5 and following support. The current code fails basic
tests under python 3.5.
Some codepaths can represent a command line the program
internally prepares to execute either as a single string
(i.e. each token properly quoted, concatenated with $IFS) or
as a list of argv[] elements, and there are 9 places where
we say "if X is isinstance(_, basestring), then do this
thing to handle X as a command line in a single string; if
not, X is a command line in a list form".
This does not work well with Python 3, as there is no
basestring (everything is Unicode now), and even with Python
2, it was not an ideal way to tell the two cases apart,
because an internally formed command line could have been in
a single Unicode string.
Flip the check to say "if X is not a list, then handle X as
a command line in a single string; otherwise treat it as a
command line in a list form".
This will get rid of references to 'basestring', to migrate
the code ready for Python 3.
Thanks-to: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python2.6 and earlier have been end-of-life'd for many years now, and
we actually already use 2.7-only features in the code. Make the version
check reflect current realities.
This also removes the need to explicitly define CalledProcessError if
it's not available.
Signed-off-by: Yang Zhao <yang.zhao@skyboxlabs.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the advise function in advice.c to display hints to the users, as
it provides a neat and a standard format for hint messages, i.e: the
text is colored in yellow and the line starts by the word "hint:".
Also this will enable us to control the messages using advice.*
configuration variables.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fix resolves the previously-added test_expect_failure in
t4215-log-skewed-merges.sh.
The issue lies in the "else" condition while updating the mapping
inside graph_output_collapsing_line(). In 0f0f389f (graph: tidy up
display of left-skewed merges, 2019-10-15), the output of left-
skewed merges was changed to allow an immediate horizontal edge in
the first parent, output by graph_output_post_merge_line() instead
of by graph_output_collapsing_line(). This condensed the first line
behavior as follows:
Before 0f0f389f:
| | | | | | *-.
| | | | | | |\ \
| |_|_|_|_|/ | |
|/| | | | | / /
After 0f0f389f:
| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | | | |/| /
However, a very subtle issue arose when the second and third parent
edges are collapsed in later steps. The second parent edge is now
immediately adjacent to a vertical edge. This means that the
condition
} else if (graph->mapping[i - 1] < 0) {
in graph_output_collapsing_line() evaluates as false. The block for
this condition was the only place where we connected the target
column with the current position with horizontal edge markers.
In this case, the final "else" block is run, and the edge is marked
as horizontal, but did not back-fill the blank columns between the
target and the current edge. Since the second parent edge is marked
as horizontal, the third parent edge is not marked as horizontal.
This causes the output to continue as follows:
Before this change:
| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | | | |/| /
| | | |/| |/
| | |/| |/|
| |/| |/| |
| | |/| | |
By adding the logic for "filling" a horizontal edge between the
target column and the current column, we are able to resolve the
issue.
After this change:
| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | |_|_|/| /
| |/| | | |/
| | | |_|/|
| | |/| | |
This output properly matches the expected blend of the edge
behavior before 0f0f389f and the merge commit rendering from
0f0f389f.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A previous test in t4215-log-skewed-merges.sh was added to demonstrate
exactly the topology of a reported failure in "git log --graph". While
investigating the fix, we realized that multiple edges that could
collapse with horizontal lines were not doing so.
Specifically, examine this section of the graph:
| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | | | |/| /
| | | |/| |/
| | |/| |/|
| |/| |/| |
| | |/| | |
| | * | | |
Document this behavior with a test. This behavior is new, as the
behavior in v2.24.1 has the following output:
| | | | | | *-.
| | | | | | |\ \
| |_|_|_|_|/ / /
|/| | | | | / /
| | |_|_|_|/ /
| |/| | | | /
| | | |_|_|/
| | |/| | |
| | * | | |
The behavior changed logically in 479db18b ("graph: smooth appearance
of collapsing edges on commit lines", 2019-10-15), but was actually
broken due to an assert() bug in 458152cc ("graph: example of graph
output that can be simplified", 2019-10-15). A future change could
modify this behavior to do the following instead:
| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | |_|_|/| /
| |/| | | |/
| | | |_|/|
| | |/| | |
| | * | | |
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, `parse_pathspec_file()` was tested indirectly by invoking
git commands with properly crafted inputs. As demonstrated by the
previous bugfix, testing complicated black boxes indirectly can lead to
tests that silently test the wrong thing.
Introduce direct tests for `parse_pathspec_file()`.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While working on the next patch, I also noticed that quotes testing via
`"\"file\\101.t\""` was somewhat incorrect: I escaped `\` one time while
I had to escape it two times! Tests still worked due to `"` being
preserved which in turn prevented pathspec from matching files.
Fix this by using here-doc instead.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also move some old tests into the new tests: it doesn't seem reasonable
to have individual error condition tests.
Old test for `git commit` was corrected, previously it was instructed
to use stdin but wasn't provided with any stdin. While this works at
the moment, it's not exactly perfect.
Old tests for `git reset` were improved to test for a specific error
message.
Suggested-By: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This job runs the test suite twice, once in regular mode, and once with
a whole slew of `GIT_TEST_*` variables set.
Now that the built-in version of `git add --interactive` is
feature-complete, let's also throw `GIT_TEST_ADD_I_USE_BUILTIN` into
that fray.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `interactive.singlekey = true`, we react immediately to keystrokes,
even to Escape sequences (e.g. when pressing a cursor key).
The problem with Escape sequences is that we do not really know when
they are done, and as a heuristic we poll standard input for half a
second to make sure that we got all of it.
While waiting half a second is not asking for a whole lot, it can become
quite annoying over time, therefore with this patch, we read the
terminal capabilities (if available) and extract known Escape sequences
from there, then stop polling immediately when we detected that the user
pressed a key that generated such a known sequence.
This recapitulates the remaining part of b5cc003253 (add -i: ignore
terminal escape sequences, 2011-05-17).
Note: We do *not* query the terminal capabilities directly. That would
either require a lot of platform-specific code, or it would require
linking to a library such as ncurses.
Linking to a library in the built-ins is something we try very hard to
avoid (we even kicked the libcurl dependency to a non-built-in remote
helper, just to shave off a tiny fraction of a second from Git's startup
time). And the platform-specific code would be a maintenance nightmare.
Even worse: in Git for Windows' case, we would need to query MSYS2
pseudo terminals, which `git.exe` simply cannot do (because it is
intentionally *not* an MSYS2 program).
To address this, we simply spawn `infocmp -L -1` and parse its output
(which works even in Git for Windows, because that helper is included in
the end-user facing installations).
This is done only once, as in the Perl version, but it is done only when
the first Escape sequence is encountered, not upon startup of `git add
-i`; This saves on startup time, yet makes reacting to the first Escape
sequence slightly more sluggish. But it allows us to keep the
terminal-related code encapsulated in the `compat/terminal.c` file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This recapitulates part of b5cc003253 (add -i: ignore terminal escape
sequences, 2011-05-17):
add -i: ignore terminal escape sequences
On the author's terminal, the up-arrow input sequence is ^[[A, and
thus fat-fingering an up-arrow into 'git checkout -p' is quite
dangerous: git-add--interactive.perl will ignore the ^[ and [
characters and happily treat A as "discard everything".
As a band-aid fix, use Term::Cap to get all terminal capabilities.
Then use the heuristic that any capability value that starts with ^[
(i.e., \e in perl) must be a key input sequence. Finally, given an
input that starts with ^[, read more characters until we have read a
full escape sequence, then return that to the caller. We use a
timeout of 0.5 seconds on the subsequent reads to avoid getting stuck
if the user actually input a lone ^[.
Since none of the currently recognized keys start with ^[, the net
result is that the sequence as a whole will be ignored and the help
displayed.
Note that we leave part for later which uses "Term::Cap to get all
terminal capabilities", for several reasons:
1. it is actually not really necessary, as the timeout of 0.5 seconds
should be plenty sufficient to catch Escape sequences,
2. it is cleaner to keep the change to special-case Escape sequences
separate from the change that reads all terminal capabilities to
speed things up, and
3. in practice, relying on the terminal capabilities is a bit overrated,
as the information could be incomplete, or plain wrong. For example,
in this developer's tmux sessions, the terminal capabilities claim
that the "cursor up" sequence is ^[M, but the actual sequence
produced by the "cursor up" key is ^[[A.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Perl version of `git add -p` supports this config setting to allow
users to input commands via single characters (as opposed to having to
press the <Enter> key afterwards).
This is an opt-in feature because it requires Perl packages
(Term::ReadKey and Term::Cap, where it tries to handle an absence of the
latter package gracefully) to work. Note that at least on Ubuntu, that
Perl package is not installed by default (it needs to be installed via
`sudo apt-get install libterm-readkey-perl`), so this feature is
probably not used a whole lot.
In C, we obviously do not have these packages available, but we just
introduced `read_single_keystroke()` that is similar to what
Term::ReadKey provides, and we use that here.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Typically, input on the command-line is line-based. It is actually not
really easy to get single characters (or better put: keystrokes).
We provide two implementations here:
- One that handles `/dev/tty` based systems as well as native Windows.
The former uses the `tcsetattr()` function to put the terminal into
"raw mode", which allows us to read individual keystrokes, one by one.
The latter uses `stty.exe` to do the same, falling back to direct
Win32 Console access.
Thanks to the refactoring leading up to this commit, this is a single
function, with the platform-specific details hidden away in
conditionally-compiled code blocks.
- A fall-back which simply punts and reads back an entire line.
Note that the function writes the keystroke into an `strbuf` rather than
a `char`, in preparation for reading Escape sequences (e.g. when the
user hit an arrow key). This is also required for UTF-8 sequences in
case the keystroke corresponds to a non-ASCII letter.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows' Git Bash runs in MinTTY by default, which does not have
a Win32 Console instance, but uses MSYS2 pseudo terminals instead.
This is a problem, as Git for Windows does not want to use the MSYS2
emulation layer for Git itself, and therefore has no direct way to
interact with that pseudo terminal.
As a workaround, use the `stty` utility (which is included in Git for
Windows, and which *is* an MSYS2 program, so it knows how to deal with
the pseudo terminal).
Note: If Git runs in a regular CMD or PowerShell window, there *is* a
regular Win32 Console to work with. This is not a problem for the MSYS2
`stty`: it copes with this scenario just fine.
Also note that we introduce support for more bits than would be
necessary for a mere `disable_echo()` here, in preparation for the
upcoming `enable_non_canonical()` function.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are about to introduce the function `enable_non_canonical()`, which
shares almost the complete code with `disable_echo()`.
Let's prepare for that, by refactoring out that shared code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Perl version of `git add -p` reads the config setting
`diff.algorithm` and if set, uses it to generate the diff using the
specified algorithm.
This patch ports that functionality to the C version.
Note: just like `git-add--interactive.perl`, we do _not_ respect this
config setting in `git add -i`'s `diff` command, but _only_ in the
`patch` command.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Perl version supports post-processing the colored diff (that is
generated in addition to the uncolored diff, intended to offer a
prettier user experience) by a command configured via that config
setting, and now the built-in version does that, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 42f7d45428 (add--interactive: detect bogus diffFilter output,
2018-03-03), we added a test case that verifies that the diffFilter
feature complains appropriately when the output is too short.
In preparation for the upcoming change where the built-in `add -p` is
taught to respect that setting, let's adjust that test a little. The
problem is that `echo too-short` is configured as diffFilter, and it
does not read the `stdin`. When calling it through `pipe_command()`, it
is therefore possible that we try to feed the `diff` to it while it is
no longer listening, and we receive a `SIGPIPE`.
The Perl code apparently handles this in a way similar to an
end-of-file, but taking a step back, we realize that a diffFilter that
does not even _look_ at its standard input is very unrealistic. The
entire point of this feature is to transform the diff, not to ignore it
altogether.
So let's modify the test case to reflect that insight: instead of
printing some bogus text, let's use a diffFilter that deletes the first
line of the diff instead.
This still tests for the same thing, but it does not confuse the
built-in `add -p` with that `SIGPIPE`.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unless --force is specified, 'submodule add' checks if the destination
path is ignored by calling 'git add --dry-run --ignore-missing', and,
if that call fails, aborts with a custom "path is ignored" message (a
slight variant of what 'git add' shows). Aborting early rather than
letting the downstream 'git add' call fail is done so that the command
exits before cloning into the destination path. However, in rare
cases where the dry-run call fails for a reason other than the path
being ignored---for example, due to a preexisting index.lock
file---displaying the "ignored path" error message hides the real
source of the failure.
Instead of displaying the tailored "ignored path" message, let's
report the standard error from the dry run to give the caller more
accurate information about failures that are not due to an ignored
path.
For the ignored path case, this leads to the following change in the
error message:
The following [-path is-]{+paths are+} ignored by one of your .gitignore files:
<destination path>
Use -f if you really want to add [-it.-]{+them.+}
The new phrasing is a bit awkward, because 'submodule add' is only
dealing with one destination path. Alternatively, we could continue
to use the tailored message when the exit code is 1 (the expected
status for a failure due to an ignored path) and relay the standard
error for all other non-zero exits. That, however, risks hiding the
message of unrelated failures that share an exit code of 1, so it
doesn't seem worth doing just to avoid a clunkier, but still clear,
error message.
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some file monitors like watchman will use something other than a timestamp
to keep better track of what changes happen in between calls to query
the fsmonitor. The clockid in watchman is a string. Now that the index
is storing an opaque token for the last update the code needs to be
updated to pass that opaque token to a verion 2 of the fsmonitor hook.
Because there are repos that already have version 1 of the hook and we
want them to continue to work when git is updated, we need to handle
both version 1 and version 2 of the hook. In order to do that a
config value is being added core.fsmonitorHookVersion to force what
version of the hook should be used. When this is not set it will default
to -1 and then the code will attempt to call version 2 of the hook first.
If that fails it will fallback to trying version 1.
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some file system monitors might not use or take a timestamp for processing
and in the case of watchman could have race conditions with using a
timestamp. Watchman uses something called a clockid that is used for race
free queries to it. The clockid for watchman is simply a string.
Change the fsmonitor_last_update from being a uint64_t to a char pointer
so that any arbitrary data can be stored in it and passed back to the
fsmonitor.
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Further tweak to a "no backslash in indexed paths" for Windows port
we applied earlier.
* js/mingw-loosen-overstrict-tree-entry-checks:
mingw: safeguard better against backslashes in file names
In 224c7d70fa (mingw: only test index entries for backslashes, not tree
entries, 2019-12-31), we relaxed the check for backslashes in tree
entries to check only index entries.
However, the code change was incorrect: it was added to
`add_index_entry_with_check()`, not to `add_index_entry()`, so under
certain circumstances it was possible to side-step the protection.
Besides, the description of that commit purported that all index entries
would be checked when in fact they were only checked when being added to
the index (there are code paths that do not do that, constructing
"transient" index entries).
In any case, it was pointed out in one insightful review at
https://github.com/git-for-windows/git/pull/2437#issuecomment-566771835
that it would be a much better idea to teach `verify_path()` to perform
the check for a backslash. This is safer, even if it comes with two
notable drawbacks:
- `verify_path()` cannot say _what_ is wrong with the path, therefore
the user will no longer be told that there was a backslash in the
path, only that the path was invalid.
- The `git apply` command also calls the `verify_path()` function, and
might have been able to handle Windows-style paths (i.e. with
backslashes instead of forward slashes). This will no longer be
possible unless the user (temporarily) sets `core.protectNTFS=false`.
Note that `git add <windows-path>` will _still_ work because
`normalize_path_copy_len()` will convert the backslashes to forward
slashes before hitting the code path that creates an index entry.
The clear advantage is that `verify_path()`'s purpose is to check the
validity of the file name, therefore we naturally tap into all the code
paths that need safeguarding, also implicitly into future code paths.
The benefits of that approach outweigh the downsides, so let's move the
check from `add_index_entry_with_check()` to `verify_path()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The clear_ce_flags_dir() method processes the cache entries within
a common directory. The returned int is the number of cache entries
processed by that directory. When using the sparse-checkout feature
in cone mode, we can skip the pattern matching for entries in the
directories that are entirely included or entirely excluded.
eb42feca (unpack-trees: hash less in cone mode, 2019-11-21)
introduced this performance feature. The old mechanism relied on
the counts returned by calling clear_ce_flags_1(), but the new
mechanism calculated the number of rows by subtracting "cache_end"
from "cache" to find the size of the range. However, the equation
is wrong because it divides by sizeof(struct cache_entry *). This
is not how pointer arithmetic works!
A coverity build of Git for Windows in preparation for the 2.25.0
release found this issue with the warning, "Pointer differences,
such as cache_end - cache, are automatically scaled down by the
size (8 bytes) of the pointed-to type (struct cache_entry *).
Most likely, the division by sizeof(struct cache_entry *) is
extraneous and should be eliminated." This warning is correct.
This leaves us with the question "how did this even work?" The
problem that occurs with this incorrect pointer arithmetic is
a performance-only bug, and a very slight one at that. Since
the entry count returned by clear_ce_flags_dir() is reduced by
a factor of 8, the loop in clear_ce_flags_1() will re-process
entries from those directories.
By inserting global counters into unpack-tree.c and tracing
them with trace2_data_intmax() (in a private change, for
testing), I was able to see count how many times the loop inside
clear_ce_flags_1() processed an entry and how many times
clear_ce_flags_dir() was called. Each of these are reduced by at
least a factor of 8 with the current change. A factor larger
than 8 happens when multiple levels of directories are repeated.
Specifically, in the Linux kernel repo, the command
git sparse-checkout set LICENSES
restricts the working directory to only the files at root and
in the LICENSES directory. Here are the measured counts:
clear_ce_flags_1 loop blocks:
Before: 11,520
After: 1,621
clear_ce_flags_dir calls:
Before: 7,048
After: 606
While these are dramatic counts, the time spent in
clear_ce_flags_1() is under one millisecond in each case, so
the improvement is not measurable as an end-to-end time.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The english term generation is here not used in the sense of "to
generate" but in the sense of "generations of beings".
This corrects the initial translation from cf4c0c25 (l10n: update German
translation, 2018-12-06).
Fixed-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
The whole submoduleAlternateErrorStrategyDie item is interpreted as
being part of the supporting content of the preceding item. This is
because we don't give a double-colon "::" for the separator, but just a
single colon, ":". Let's fix that.
There are a few other matches for [^:]:\s*$ in Documentation/config, but
I didn't spot any similar bugs among them.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since recent updates to the log graph rendering code, drawing
certain merges started triggering an assert on a condition that
would no longer hold true, which has been corrected.
* ds/graph-assert-fix:
graph: fix lack of color in horizontal lines
graph: drop assert() for merge with two collapsing parents
* https://github.com/prati0100/git-gui:
git-gui: allow opening currently selected file in default app
git-gui: allow closing console window with Escape
git gui: fix branch name encoding error
git-gui: revert untracked files by deleting them
git-gui: update status bar to track operations
git-gui: consolidate naming conventions
In some cases, horizontal lines in rendered graphs can lose their
coloring. This is due to a use of graph_line_addch() instead of
graph_line_write_column(). Using a ternary operator to pick the
character is nice for compact code, but we actually need a column to
provide the color.
Add a test to t4215-log-skewed-merges.sh to prevent regression.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git log --graph" shows a merge commit that has two collapsing
lines, like:
| | | | *
| |_|_|/|
|/| | |/
| | |/|
| |/| |
| * | |
* | | |
we trigger an assert():
graph.c:1228: graph_output_collapsing_line: Assertion
`graph->mapping[i - 3] == target' failed.
The assert was introduced by eaf158f8 ("graph API: Use horizontal
lines for more compact graphs", 2009-04-21), which is quite old.
This assert is trying to say that when we complete a horizontal
line with a single slash, it is because we have reached our target.
It is actually the _second_ collapsing line that hits this assert.
The reason we are in this code path is because we are collapsing
the first line, and in that case we are hitting our target now
that the horizontal line is complete. However, the second line
cannot be a horizontal line, so it will collapse without horizontal
lines. In this case, it is inappropriate to assert that we have
reached our target, as we need to continue for another column
before reaching the target. Dropping the assert is safe here.
The new behavior in 0f0f389f12 (graph: tidy up display of
left-skewed merges, 2019-10-15) caused the behavior change that
made this assertion failure possible. In addition to making the
assert possible, it also changed how multiple edges collapse.
In a larger example, the current code will output a collapse
as follows:
| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | | | |/| /
| | | |/| |/
| | |/| |/|
| |/| |/| |
| | |/| | |
| | * | | |
However, the intended collapse should allow multiple horizontal lines
as follows:
| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | |_|_|/| /
| |/| | | |/
| | | |_|/|
| | |/| | |
| | * | | |
This behavior is not corrected by this change, but is noted for a later
update.
Helped-by: Jeff King <peff@peff.net>
Reported-by: Bradley Smith <brad@brad-smith.co.uk>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since ba227857d2 (Reduce the number of connects when fetching,
2008-02-04), when we disconnect a git transport, we send a final flush
packet. This cleanly tells the other side that we're done, and avoids
the other side complaining "the remote end hung up unexpectedly" (though
we'd only see that for transports that pass along the server stderr,
like ssh or local-host).
But when we've initiated a v2 stateless-connect session over a transport
helper, there's no point in sending this flush packet. Each operation
we've performed is self-contained, and the other side is fine with us
hanging up between operations.
But much worse, by sending the flush packet we may cause the helper to
issue an entirely new request _just_ to send the flush packet. So we can
incur an extra network request just to say "by the way, we have nothing
more to send".
Let's drop this extra flush packet. As the test shows, this reduces the
number of POSTs required for a v2 ls-remote over http from 2 to 1.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's possible in a case where the index file contains a tree extension
but no blobs within that tree exist for index_pos_by_traverse_info() to
segfault. If the name_entry passed into index_pos_by_traverse_info() has
no blobs inside, AND is alphabetically later than all blobs currently in
the index file, index_pos_by_traverse_info() will segfault. For example,
an index file which looks something like this:
aaa#0
bbb/aaa#0
[Extensions]
TREE: zzz
In this example, 'index_name_pos(..., "zzz/", ...)' will return '-4',
indicating that "zzz/" could be inserted at position 3. However, when
the checks which ensure that the insertion position of "zzz/" look for a
blob at that position beginning with "zzz/", the index cache is accessed
out of range, causing a segfault.
This kind of index state is not typically generated during user
operations, and is in fact an edge case of the state being checked for
in the conditional where it was added. However, since the entry for the
BUG() line is ambiguous, tell some additional context to help Git
developers debug the failure later. When we know the name of the dir we
were trying to look up, it becomes possible to examine the index file
in a hex util to determine what went wrong; the position gives a hint
about where to start looking.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git restore --staged <path>" removes a path that's in the index,
it marks the entry with CE_REMOVE, but we don't do anything to
invalidate the cache-tree. In the non-staged case, we end up in
checkout_worktree(), which calls remove_marked_cache_entries(). That
actually drops the entries from the index, as well as invalidating the
cache-tree and untracked-cache.
But with --staged, we never call checkout_worktree(), and the CE_REMOVE
entries remain. Interestingly, they are dropped when we write out the
index, but that means the resulting index is inconsistent: its
cache-tree will not match the actual entries, and running "git commit"
immediately after will create the wrong tree.
We can solve this by calling remove_marked_cache_entries() ourselves
before writing out the index. Note that we can't just hoist it out of
checkout_worktree(); that function needs to iterate over the CE_REMOVE
entries (to drop their matching worktree files) before removing them.
One curiosity about the test: without this patch, it actually triggers a
BUG() when running git-restore:
BUG: cache-tree.c:810: new1 with flags 0x4420000 should not be in cache-tree
But in the original problem report, which used a similar recipe,
git-restore actually creates the bogus index (and the commit is created
with the wrong tree). I'm not sure why the test here behaves differently
than my out-of-suite reproduction, but what's here should catch either
symptom (and the fix corrects both cases).
Reported-by: Torsten Krah <krah.tm@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 328c6cb853 (doc: promote "git switch", 2019-03-29), an example
was changed to use "git switch" rather than "git checkout" but an
instance of "git checkout" in the explanatory text preceding the
example was overlooked. Fix this oversight.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For easier understanding, here are the existing good scenarios:
1) Have *no* file 'foo', *no* local branch 'foo' and a *single*
remote branch 'foo'
2) `git checkout foo` will create local branch foo, see [1]
and
1) Have *a* file 'foo', *no* local branch 'foo' and a *single*
remote branch 'foo'
2) `git checkout foo` will complain, see [3]
This patch prevents the following scenario:
1) Have *a* file 'foo', *no* local branch 'foo' and *multiple*
remote branches 'foo'
2) `git checkout foo` will successfully... revert contents of
file `foo`!
That is, adding another remote suddenly changes behavior significantly,
which is a surprise at best and could go unnoticed by user at worst.
Please see [3] which gives some real world complaints.
To my understanding, fix in [3] overlooked the case of multiple remotes,
and the whole behavior of falling back to reverting file was never
intended:
[1] introduces the unexpected behavior. Before, there was fallback
from not-a-ref to pathspec. This is reasonable fallback. After, there
is another fallback from ambiguous-remote to pathspec. I understand
that it was a copy&paste oversight.
[2] noticed the unexpected behavior but chose to semi-document it
instead of forbidding, because the goal of the patch series was
focused on something else.
[3] adds `die()` when there is ambiguity between branch and file. The
case of multiple tracking branches is seemingly overlooked.
The new behavior: if there is no local branch and multiple remote
candidates, just die() and don't try reverting file whether it
exists (prevents surprise) or not (improves error message).
[1] Commit 70c9ac2f ("DWIM "git checkout frotz" to "git checkout -b frotz origin/frotz"" 2009-10-18)
https://public-inbox.org/git/7vaazpxha4.fsf_-_@alter.siamese.dyndns.org/
[2] Commit ad8d5104 ("checkout: add advice for ambiguous "checkout <branch>"", 2018-06-05)
https://public-inbox.org/git/20180502105452.17583-1-avarab@gmail.com/
[3] Commit be4908f1 ("checkout: disambiguate dwim tracking branches and local files", 2018-11-13)
https://public-inbox.org/git/20181110120707.25846-1-pclouds@gmail.com/
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert `[]` to `test` and break if-then into separate lines, both of
which bring the style in line with Git's coding guidelines.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a space between the function name and () which brings the style in
line with Git's coding guidelines.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this function, we free the pointer we get from locate_in_PATH and
then check whether it's NULL. However, this is undefined behavior if
the pointer is non-NULL, since the C standard no longer permits us to
use a valid pointer after freeing it.
The only case in which the C standard would permit this to be defined
behavior is if r were NULL, since it states that in such a case "no
action occurs" as a result of calling free.
It's easy to suggest that this is not likely to be a problem, but we
know that GCC does aggressively exploit the fact that undefined
behavior can never occur to optimize and rewrite code, even when that's
contrary to the expectations of the programmer. It is, in fact, very
common for it to omit NULL pointer checks, just as we have here.
Since it's easy to fix, let's do so, and avoid a potential headache in
the future.
Noticed-by: Miriam R. <mirucam@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 1959bf6430 (string_list API: document what "sorted" means,
2012-09-17), Documentation/technical/api-string-list.txt was updated to
specify that strcmp() was used for sorting. In commit 8dd5afc926
(string-list: allow case-insensitive string list, 2013-01-07), a cmp
member was added to struct string_list to allow callers to specify an
alternative comparison function, but api-string-list.txt was not
updated. In commit 4f665f2cf3 (string-list.h: move documentation from
Documentation/api/ into header, 2017-09-26), the now out-dated
api-string-list.txt documentation was moved into string-list.h. Update
the docs to reflect the configurability of sorting.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
check_updates() has a lot of code that repeatedly checks whether
o->update or o->dry_run are set. (Note that o->dry_run is a
near-synonym for !o->update, but not quite as per commit 2c9078d05b
("unpack-trees: add the dry_run flag to unpack_trees_options",
2011-05-25).) In fact, this function almost turns into a no-op whenever
the condition
!o->update || o->dry_run
is met. Simplify the code by checking this condition at the beginning
of the function, and when it is true, do the few things that are
relevant and return early.
There are a few things that make the conversion not quite obvious:
* The fact that check_updates() does not actually turn into a no-op
when updates are not wanted may be slightly surprising. However,
commit 33ecf7eb61 (Discard "deleted" cache entries after using them
to update the working tree, 2008-02-07) put the discarding of
unused cache entries in check_updates() so we still need to keep
the call to remove_marked_cache_entries(). It's possible this
call belongs in another function, but it is certainly needed as
tests will fail if it is removed.
* The original called remove_scheduled_dirs() unconditionally.
Technically, commit 7847892716 (unlink_entry(): introduce
schedule_dir_for_removal(), 2009-02-09) should have made that call
conditional, but it didn't matter in practice because
remove_scheduled_dirs() becomes a no-op when all the calls to
unlink_entry() are skipped. As such, we do not need to call it.
* When (o->dry_run && o->update), the original would have two calls
to git_attr_set_direction() surrounding a bunch of skipped updates.
These two calls to git_attr_set_direction() cancel each other out
and thus can be omitted when o->dry_run is true just as they
already are when !o->update.
* The code would previously call setup_collided_checkout_detection()
and report_collided_checkout() even when o->dry_run. However, this
was just an expensive no-op because
setup_collided_checkout_detection() merely cleared the CE_MATCHED
flag for each cache entry, and report_collided_checkout() reported
which ones had it set. Since a dry-run would skip all the
checkout_entry() calls, CE_MATCHED would never get set and thus
no collisions would be reported. Since we can't detect the
collisions anyway without doing updates, skipping the collisions
detection setup and reporting is an optimization.
* The code previously would call get_progress() and
display_progress() even when (!o->update || o->dry_run). This
served to show how long it took to skip all the updates, which is
somewhat useless. Since we are skipping the updates, we can skip
showing how long it takes to skip them.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to write split commit-graph file(s) upon fetching computed
bogus value for the parameter used in splitting the resulting
files, which has been corrected.
* ds/commit-graph-set-size-mult:
commit-graph: prefer default size_mult when given zero
"git sparse-checkout list" subcommand learned to give its output in
a more concise form when the "cone" mode is in effect.
* ds/sparse-list-in-cone-mode:
sparse-checkout: document interactions with submodules
sparse-checkout: list directories in cone mode
An earlier update to Git for Windows declared that a tree object is
invalid if it has a path component with backslash in it, which was
overly strict, which has been corrected. The only protection the
Windows users need is to prevent such path (or any path that their
filesystem cannot check out) from entering the index.
* js/mingw-loosen-overstrict-tree-entry-checks:
mingw: only test index entries for backslashes, not tree entries
The sentence wants to talk about the superproject's possesive, not plural form.
Signed-off-by: Thomas Menzel <dev@tomsit.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, the .editorconfig did not specify an indentation style for
text files. However, a quick look for indentation-like spacing suggest
that tabs are more common for documentation:
$ git grep -Pe '^ {4}' -- '*.txt' |wc -l
2683
$ git grep -Pe '^\t' -- '*.txt' |wc -l
14011
Note that there are a lot of files that indent list continuations (and
other things) with a single space -- if the first search was made
without the fixed quantifier the result would look very different.
However, the result does correspond with my anecdotal experience when
editing git documentation.
This commit adds *.txt to .editorconfig as an extension that should be
indented with tabs.
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Like in-memory alternates, pretend_object_file contains a trap for the
unwary: careless callers can use it to create references to an object
that does not exist in the on-disk object store.
Add a comment documenting how to use the function without risking such
problems.
The only current caller is blame, which uses pretend_object_file to
create an in-memory commit representing the working tree state.
Noticed during a discussion of how to safely use this function in
operations like "git merge" which, unlike blame, are not read-only.
Inspired-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to "From:" and "Subject:" already mentioned in the
documentation, "Date:" can also appear as an in-body header
to override the value in the e-mail headers. Document it.
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This typo was introduced in 94c0956b60 (sparse-checkout: create builtin
with 'list' subcommand, 2019-11-21).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow opening the currently selected file in its default app by clicking
on its name.
* zs/open-current-file:
git-gui: allow opening currently selected file in default app
In 50f26bd ("fetch: add fetch.writeCommitGraph config setting",
2019-09-02), the fetch builtin added the capability to write a
commit-graph using the "--split" feature. This feature creates
multiple commit-graph files, and those can merge based on a set
of "split options" including a size multiple. The default size
multiple is 2, which intends to provide a log_2 N depth of the
commit-graph chain where N is the number of commits.
However, I noticed during dogfooding that my commit-graph chains
were becoming quite large when left only to builds by 'git fetch'.
It turns out that in split_graph_merge_strategy(), we default the
size_mult variable to 2 except we override it with the context's
split_opts if they exist. In builtin/fetch.c, we create such a
split_opts, but do not populate it with values.
This problem is due to two failures:
1. It is unclear that we can add the flag COMMIT_GRAPH_WRITE_SPLIT
with a NULL split_opts.
2. If we have a non-NULL split_opts, then we override the default
values even if a zero value is given.
Correct both of these issues. First, do not override size_mult when
the options provide a zero value. Second, stop creating a split_opts
in the fetch builtin.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During a clone of a repository that contained a file with a backslash in
its name in the past, as of v2.24.1(2), Git for Windows prints errors
like this:
error: filename in tree entry contains backslash: '\'
The idea is to prevent Git from even trying to write files with
backslashes in their file names: while these characters are valid in
file names on other platforms, on Windows it is interpreted as directory
separator (which would obviously lead to ambiguities, e.g. when there is
a file `a\b` and there is also a file `a/b`).
Arguably, this is the wrong layer for that error: As long as the user
never checks out the files whose names contain backslashes, there should
not be any problem in the first place.
So let's loosen the requirements: we now leave tree entries with
backslashes in their file names alone, but we do require any entries
that are added to the Git index to contain no backslashes on Windows.
Note: just as before, the check is guarded by `core.protectNTFS` (to
allow overriding the check by toggling that config setting), and it
is _only_ performed on Windows, as the backslash is not a directory
separator elsewhere, even when writing to NTFS-formatted volumes.
An alternative approach would be to try to prevent creating files with
backslashes in their file names. However, that comes with its own set of
problems. For example, `git config -f C:\ProgramData\Git\config ...` is
a very valid way to specify a custom config location, and we obviously
do _not_ want to prevent that. Therefore, the approach chosen in this
patch would appear to be better.
This addresses https://github.com/git-for-windows/git/issues/2435
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a partial clone, if a user provides the hash of the empty tree ("git
mktree </dev/null" - for SHA-1, this is 4b825d...) to a command which
requires that that object be parsed, for example:
git diff-tree 4b825d <a non-empty tree>
then Git will lazily fetch the empty tree, unnecessarily, because
parsing of that object invokes repo_has_object_file(), which does not
special-case the empty tree.
Instead, teach repo_has_object_file() to consult find_cached_object()
(which handles the empty tree), thus bringing it in line with the rest
of the object-store-accessing functions. A cost is that
repo_has_object_file() will now need to oideq upon each invocation, but
that is trivial compared to the filesystem lookup or the pack index
search required anyway. (And if find_cached_object() needs to do more
because of previous invocations to pretend_object_file(), all the more
reason to be consistent in whether we present cached objects.)
As a historical note, the function now known as repo_read_object_file()
was taught the empty tree in 346245a1bb ("hard-code the empty tree
object", 2008-02-13), and the function now known as oid_object_info()
was taught the empty tree in c4d9986f5f ("sha1_object_info: examine
cached_object store too", 2011-02-07). repo_has_object_file() was never
updated, perhaps due to oversight. The flag OBJECT_INFO_SKIP_CACHED,
introduced later in dfdd4afcf9 ("sha1_file: teach
sha1_object_info_extended more flags", 2017-06-26) and used in
e83e71c5e1 ("sha1_file: refactor has_sha1_file_with_flags", 2017-06-26),
was introduced to preserve this difference in empty-tree handling, but
now it can be removed.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Forbid pathnames that the platform's filesystem cannot represent on
MinGW.
* js/mingw-reserved-filenames:
mingw: refuse paths containing reserved names
mingw: short-circuit the conversion of `/dev/null` to UTF-16
"git rebase --signoff" stopped working when the command was written
in C, which has been corrected.
* en/rebase-signoff-fix:
rebase: fix saving of --signoff state for am-based rebases
Miscellaneous small UX improvements on "git-p4".
* bk/p4-misc-usability:
git-p4: show detailed help when parsing options fail
git-p4: yes/no prompts should sanitize user text
Back when merge-recursive was first introduced in commit 6d297f8137
(Status update on merge-recursive in C, 2006-07-08), it created a
sha_eq() function. This function pre-dated the introduction of
hashcmp() to cache.h by about a month, but was switched over to using
hashcmp() as part of commit 9047ebbc22 (Split out merge_recursive() to
merge-recursive.c, 2008-08-12). In commit b4da9d62f9 (merge-recursive:
convert leaf functions to use struct object_id, 2016-06-24), sha_eq() was
renamed to oid_eq() and its hashcmp() call was switched to oideq().
oid_eq() is basically just a wrapper around oideq() that has some extra
checks to protect against NULL arguments or to allow short-circuiting if
one of the arguments is NULL. I don't know if any caller ever tried to
call with NULL arguments, but certainly none do now which means the
extra checks serve no purpose. (Also, if these checks were genuinely
useful, then they probably should be added to the main oideq() so all
callers could benefit from them.)
Reduce the cognitive overhead of having both oid_eq() and oideq(), by
getting rid of merge-recursive's special oid_eq() wrapper.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the core.sparseCheckoutCone config setting was added in
879321eb0b ("sparse-checkout: add 'cone' mode" 2019-11-21), the
variables storing the config values for core.sparseCheckout and
core.sparseCheckoutCone were rearranged in cache.h, but in doing
so the "extern" keyword was dropped.
While we are tending to drop the "extern" keyword for function
declarations, it is still necessary for global variables used
across multiple *.c files. The impact of not having the extern
keyword may be unpredictable.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many times there's the need to quickly open a source file (the one you're
looking at in Git GUI) in the predefined text editor / IDE. Of course,
the file can be searched for in your preferred file manager or directly
in the text editor, but having the option to directly open the current
file from Git GUI would be just faster. This change enables just that by:
- clicking the diff header path (which is now highlighted as a hyperlink)
- or diff header path context menu -> Open
Note: executable files will be run and not opened for editing.
Signed-off-by: Zoli Szabó <zoli.szabo@gmail.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Using 'git submodule (init|deinit)' a user can select a subset of
submodules to populate. This behaves very similar to the sparse-checkout
feature, but those directories contain their own .git directory
including an object database and ref space. To have the sparse-checkout
file also determine if those files should exist would easily cause
problems. Therefore, keeping these features independent in this way
is the best way forward.
Also create a test that demonstrates this behavior to make sure
it doesn't change as the sparse-checkout feature evolves.
Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When core.sparseCheckoutCone is enabled, the 'git sparse-checkout set'
command takes a list of directories as input, then creates an ordered
list of sparse-checkout patterns such that those directories are
recursively included and all sibling entries along the parent directories
are also included. Listing the patterns is less user-friendly than the
directories themselves.
In cone mode, and as long as the patterns match the expected cone-mode
pattern types, change the output of 'git sparse-checkout list' to only
show the directories that created the patterns.
With this change, the following piped commands would not change the
working directory:
git sparse-checkout list | git sparse-checkout set --stdin
The only time this would not work is if core.sparseCheckoutCone is
true, but the sparse-checkout file contains patterns that do not
match the expected pattern types for cone mode.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git 2.25-rc0
* tag 'v2.25.0-rc0': (531 commits)
Git 2.25-rc0
sparse-checkout: improve OS ls compatibility
dir.c: use st_add3() for allocation size
dir: consolidate similar code in treat_directory()
dir: synchronize treat_leading_path() and read_directory_recursive()
dir: fix checks on common prefix directory
t4015: improve coverage of function context test
commit: forbid --pathspec-from-file --all
t3434: mark successful test as such
notes.h: fix typos in comment
t6030: don't create unused file
t5580: don't create unused file
t3501: don't create unused file
bisect--helper: convert `*_warning` char pointers to char arrays.
The sixth batch
fix-typo: consecutive-word duplications
Makefile: drop GEN_HDRS
built-in add -p: show helpful hint when nothing can be staged
built-in add -p: only show the applicable parts of the help text
built-in add -p: implement the 'q' ("quit") command
...
Do not use $GIT_BUILD_DIR without quotes; it may contain spaces and be
split into fields. But it is not necessary to access test-tool with an
absolute path in the first place as it can be found via path lookup.
Remove the explicit path.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have pread emulation for portability, so there's
there's no reason to make two syscalls where one suffices.
Furthermore, readers of the packfile will be using mmap
(or pread to emulate mmap), anyways, so the file description
offset does not matter in this case.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The line number, regex or offset parameters <start> and <end> in
`git log -L <start>,<end>:<file>`, or the function name regex in
`git log -L :<funcname>:<file>` must exist in the starting
revision, or else the command exits with a fatal error.
This is not obvious in the documentation, so add a note to that
effect.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently the line-log functionality (git log -L) only supports
displaying patch output (`-p | --patch`, its default behavior) and suppressing it
(`-s | --no-patch`). A check was added in the code to that effect in 5314efaea
(line-log: detect unsupported formats, 2019-03-10) but the documentation was not
updated.
Explicitly mention that `-L` implies `-p`, that patch output can be
suppressed using `-s`, and that all other diff formats are not allowed.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_open sets close-on-exec since cd66ada065
("sha1_file: open window into packfiles with O_CLOEXEC").
There's no reason to keep using fcntl to set the close-on-exec
flag, anymore.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Assorted fixes to the directory traversal API.
* en/fill-directory-fixes:
dir.c: use st_add3() for allocation size
dir: consolidate similar code in treat_directory()
dir: synchronize treat_leading_path() and read_directory_recursive()
dir: fix checks on common prefix directory
dir: break part of read_directory_recursive() out for reuse
dir: exit before wildcard fall-through if there is no wildcard
dir: remove stray quote character in comment
Revert "dir.c: make 'git-status --ignored' work within leading directories"
t3011: demonstrate directory traversal failures
The effort to move "git-add--interactive" to C continues.
* js/add-p-in-c:
built-in add -p: show helpful hint when nothing can be staged
built-in add -p: only show the applicable parts of the help text
built-in add -p: implement the 'q' ("quit") command
built-in add -p: implement the '/' ("search regex") command
built-in add -p: implement the 'g' ("goto") command
built-in add -p: implement hunk editing
strbuf: add a helper function to call the editor "on an strbuf"
built-in add -p: coalesce hunks after splitting them
built-in add -p: implement the hunk splitting feature
built-in add -p: show different prompts for mode changes and deletions
built-in app -p: allow selecting a mode change as a "hunk"
built-in add -p: handle deleted empty files
built-in add -p: support multi-file diffs
built-in add -p: offer a helpful error message when hunk navigation failed
built-in add -p: color the prompt and the help text
built-in add -p: adjust hunk headers as needed
built-in add -p: show colored hunks by default
built-in add -i: wire up the new C code for the `patch` command
built-in add -i: start implementing the `patch` functionality in C
Code cleanup.
* rs/ref-read-cleanup:
remote: pass NULL to read_ref_full() because object ID is not needed
refs: pass NULL to refs_read_ref_full() because object ID is not needed
Management of sparsely checked-out working tree has gained a
dedicated "sparse-checkout" command.
* ds/sparse-cone: (21 commits)
sparse-checkout: improve OS ls compatibility
sparse-checkout: respect core.ignoreCase in cone mode
sparse-checkout: check for dirty status
sparse-checkout: update working directory in-process for 'init'
sparse-checkout: cone mode should not interact with .gitignore
sparse-checkout: write using lockfile
sparse-checkout: use in-process update for disable subcommand
sparse-checkout: update working directory in-process
sparse-checkout: sanitize for nested folders
unpack-trees: add progress to clear_ce_flags()
unpack-trees: hash less in cone mode
sparse-checkout: init and set in cone mode
sparse-checkout: use hashmaps for cone patterns
sparse-checkout: add 'cone' mode
trace2: add region in clear_ce_flags
sparse-checkout: create 'disable' subcommand
sparse-checkout: add '--stdin' option to set subcommand
sparse-checkout: 'set' subcommand
clone: add --sparse mode
sparse-checkout: create 'init' subcommand
...
Redo "git name-rev" to avoid recursive calls.
* sg/name-rev-wo-recursion:
name-rev: cleanup name_ref()
name-rev: eliminate recursion in name_rev()
name-rev: use 'name->tip_name' instead of 'tip_name'
name-rev: drop name_rev()'s 'generation' and 'distance' parameters
name-rev: restructure creating/updating 'struct rev_name' instances
name-rev: restructure parsing commits and applying date cutoff
name-rev: pull out deref handling from the recursion
name-rev: extract creating/updating a 'struct name_rev' into a helper
t6120: add a test to cover inner conditions in 'git name-rev's name_rev()
name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation
name-rev: avoid unnecessary cast in name_ref()
name-rev: use strbuf_strip_suffix() in get_rev_name()
t6120-describe: modernize the 'check_describe' helper
t6120-describe: correct test repo history graph in comment
Some Porcelain commands are written in Perl, and tests on them are
expected not to work when the platform lacks a working perl.
* ra/t5150-depends-on-perl:
t5150: skip request-pull test if Perl is disabled
"git format-patch" can take a set of configured format.notes values
to specify which notes refs to use in the log message part of the
output. The behaviour of this was not consistent with multiple
--notes command line options, which has been corrected.
* dl/format-patch-notes-config-fixup:
notes.h: fix typos in comment
notes: break set_display_notes() into smaller functions
config/format.txt: clarify behavior of multiple format.notes
format-patch: move git_config() before repo_init_revisions()
format-patch: use --notes behavior for format.notes
notes: extract logic into set_display_notes()
notes: create init_display_notes() helper
notes: rename to load_display_notes()
A few more commands learned the "--pathspec-from-file" command line
option.
* am/pathspec-f-f-checkout:
checkout, restore: support the --pathspec-from-file option
doc: restore: synchronize <pathspec> description
doc: checkout: synchronize <pathspec> description
doc: checkout: fix broken text reference
doc: checkout: remove duplicate synopsis
add: support the --pathspec-from-file option
cmd_add: prepare for next patch
An earlier series to teach "--pathspec-from-file" to "git commit"
forgot to make the option incompatible with "--all", which has been
corrected.
* am/pathspec-from-file:
commit: forbid --pathspec-from-file --all
There are a couple of reserved names that cannot be file names on
Windows, such as `AUX`, `NUL`, etc. For an almost complete list, see
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file
If one would try to create a directory named `NUL`, it would actually
"succeed", i.e. the call would return success, but nothing would be
created.
Worse, even adding a file extension to the reserved name does not make
it a valid file name. To understand the rationale behind that behavior,
see https://devblogs.microsoft.com/oldnewthing/20031022-00/?p=42073
Let's just disallow them all.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next commit, we want to disallow accessing any path that contains
any segment that is equivalent to `NUL`. In particular, we want to
disallow accessing `NUL` (e.g. to prevent any repository from being
checked out that contains a file called `NUL`, as that is not a valid
file name on Windows).
However, there are legitimate use cases within Git itself to write to
the Null device. As Git is really a Linux project, it does not abstract
that idea, though, but instead uses `/dev/null` to describe this
intention.
So let's side-step the validation _specifically_ in the case that we
want to write to (or read from) `/dev/null`, via a dedicated short-cut
in the code that skips the call to `validate_win32_path()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The built-in `git add -i` machinery obviously has its `the_repository`
structure initialized at the point where `cmd_commit()` calls it, and
therefore does not look at the environment variable `GIT_INDEX_FILE`.
But when being called from `commit --interactive`, it has to, because
the index was already locked in that case, and we want to ask the
interactive add machinery to work on the `index.lock` file instead of
the `index` file.
Technically, we could teach `run_add_i()`, or for that matter
`run_add_p()`, to look specifically at that environment variable, but
the entire idea of passing in a parameter of type `struct repository *`
is to allow working on multiple repositories (and their index files)
independently.
So let's instead override the `index_file` field of that structure
temporarily.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a straight-forward port of 2f0896ec3a (restore: support
--patch, 2019-04-25) which added support for `git restore -p`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch teaches the built-in `git add -p` machinery all the tricks it
needs to know in order to act as the work horse for `git checkout -p`.
Apart from the minor changes (slightly reworded messages, different
`diff` and `apply --check` invocations), it requires a new function to
actually apply the changes, as `git checkout -p` is a bit special in
that respect: when the desired changes do not apply to the index, but
apply to the work tree, Git does not fail straight away, but asks the
user whether to apply the changes to the worktree at least.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The scripted version of `git stash` called directly into the Perl script
`git-add--interactive.perl`, and this was faithfully converted to C.
However, we have a much better way to do this now: call the internal API
directly, which will now incidentally also respect the
`add.interactive.useBuiltin` setting. Let's just do this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As `git add` traditionally did not expose the `--patch=<mode>` modes via
command-line options, the scripted version of `git stash` had to call
`git add--interactive` directly.
But this prevents the built-in `add -p` from kicking in, as
`add--interactive` is the scripted version (which does not have a
"fall-back" to the built-in version).
So let's introduce support for internal switch for `git add` that the
scripted `git stash` can use to call the appropriate backend (scripted
or built-in, depending on `add.interactive.useBuiltin`).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `git stash` and `git reset` commands support a `--patch` option, and
both simply hand off to `git add -p` to perform that work. Let's teach
the built-in version of that command to be able to perform that work, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Perl script backing `git add -p` is used not only for that command,
but also for `git stash -p`, `git reset -p` and `git checkout -p`.
In preparation for teaching the C version of `git add -p` to support
also the latter commands, let's abstract away what is "stage" specific
into a dedicated data structure describing the differences between the
patch modes.
Finally, please note that the Perl version tries to make sure that the
diffs are only generated for the modified files. This is not actually
necessary, as the calls to Git's diff machinery already perform that
work, and perform it well. This makes it unnecessary to port the
`FILTER` field of the `%patch_modes` struct, as well as the
`get_diff_reference()` function.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On FreeBSD, when executed by root ls enables the '-A' option:
-A Include directory entries whose names begin with a dot (`.')
except for . and ... Automatically set for the super-user unless
-I is specified.
As a result the .git directory appeared in the output when run as root.
Simulate no-dotfile ls behaviour using a shell glob.
Helped-by: Eric Wong <e@80x24.org>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Ed Maste <emaste@FreeBSD.org>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, git-credential-netrc does not work outside of a git
repository. It fails with the following error:
fatal: Not a git repository: . at /usr/share/perl5/Git.pm line 214.
There is no real reason why need to be within a repository, though.
Credential helpers should be able to work just fine outside the
repository as well.
Call the non-self version of config() so that git-credential-netrc no
longer needs to be run within a repository.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The shebang path for the Perl interpreter in git-credential-netrc was
hardcoded. However, some users may have it located at a different
location and thus, would have had to manually edit the script.
Add a .perl prefix to the script to denote it as a template and ignore
the generated version. Augment the Makefile so that it generates
git-credential-netrc from git-credential-netrc.perl, just like other
Perl scripts.
The Makefile recipes were shamelessly stolen from
contrib/mw-to-git/Makefile.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, we were running `test_must_fail full_name`. However,
`test_must_fail` should only be used on git commands. Inline full_name()
so that we can use test_must_fail on the git command directly.
When full_name() was introduced in 28fb84382b (Introduce
<branch>@{upstream} notation, 2009-09-10), the `git -C` option wasn't
available yet (since it was introduced in 44e1e4d67d (git: run in a
directory given with -C option, 2013-09-09)). As a result, the helper
function removed the need to manually cd each time. However, since
`git -C` is available now, we can just use that instead and inline
full_name().
An alternate approach was taken where we taught full_name() to accept an
optional `!` arg to trigger test_must_fail behavior. However, this added
more unnecessary complexity than inlining so we inline instead.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The expected test style is to have all commands tested within a
test_expect_success block. Move the generation of the 'expect' text into
their corresponding blocks. While we're at it, insert a second
`commit=$(git rev-parse HEAD)` into the next test case so that it's
clear where $commit is coming from.
The biggest advantage of doing this is that we now check the return code
of `git rev-parse HEAD` so we can catch it in case it fails.
This patch is best viewed with `--color-moved --ignore-all-space`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return code of git commands are lost when a command is in a
non-assignment command substitution in favour of the surrounding
command's. Rewrite instances of this so that git commands run
on their own.
In commit_subject(), use a `tformat` instead of `format` since,
previously, we were testing the output of a command substitution which
didn't care if there was a trailing newline since it was automatically
stripped. Since we use test_cmp() now, the trailing newline matters so
use `tformat` to always output it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() family of functions (including test_might_fail())
should only be used on git commands. Replace test_might_fail() with
a compound command wrapping the old cp invocation that always returns 0.
The `test_might_fail cp` line was introduced in 466e8d5d66 (t1501: fix
test with split index, 2015-03-24). It is necessary because there might
exist some index files in `repo.git/sharedindex.*` and, if they exist,
we want to copy them over. However, if they don't exist, we don't want
to error out because we expect that possibility. As a result, we want to
keep the "might fail" semantics so we always return 0, even if the
underlying cp errors out.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() function should only be used for git commands since
we should assume that external commands work sanely. Replace
`test_must_fail test -f` with `test_path_is_missing` since we expect
these paths to not exist.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In one case, we were using a redirection operator to feed input into
sed. However, since sed is capable of opening its own input file, make
sed do that instead of redirecting input into it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we plan on only allowing `test_must_fail` to work on a
restricted subset of commands, including `git`. Reorder the commands so
that `nongit` comes before `test_must_fail`. This way, `test_must_fail`
operates on a git command.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() family of functions (including test_might_fail())
should only be used on git commands. Replace `test_might_fail rm` with
`rm -f` so that we don't use `test_might_fail` on a non-git command.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail function should only be used for git commands since
we should assume that external commands work sanely. Since
check_packed_refs_marked() just wraps a grep invocation, replace
`test_must_fail check_packed_refs_marked` with
`! check_packed_refs_marked`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail function should only be used for git commands since
we should assume that external commands work sanely. Since has_cr() just
wraps a tr and grep pipeline, replace `test_must_fail has_cr` with
`! has_cr`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In an effort to remove test_must_fail for all invocations not related to
git or test-tool, replace invocations of `test_must_fail attr_check`
with a plain attr_check call with the $expect argument set to the
actual value output by git.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In several places, we used `test_line_count = 0` to check for an empty
file. Although this is correct, it's overkill. Use test_must_be_empty()
instead because it's more suited for this purpose.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We had named the parameters in attr_check() but $2 was being used
instead of $expect. Make all variable accesses in attr_check() use named
variables instead of numbered arguments for clarity.
While we're at it, add variable assignments to the &&-chain. These
aren't ever expected to fail but if a future developer ever adds some
code above the assignments and they could fail in some way, the intact
&&-chain will ensure that the failure is caught.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail function should only be used for git commands since
we should assume that external commands work sanely. We use
test_must_fail to test run_sub_test_lib_test() but that function does
not invoke any git commands internally. Even better, we have a function
that's exactly meant to be used when we expect to have a failing test
suite: run_sub_test_lib_test_err()!
Replace `test_must_fail run_sub_test_lib_test` with
`run_sub_test_lib_test_err`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, cleanup_git() would use `test_must_fail test -d` to ensure
that the directory is removed. However, test_must_fail should only be
used for git commands. Use test_path_is_missing() instead to check that
the directory has been removed.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was an error introduced in the conversion from shell in commit
21853626ea ("built-in rebase: call `git am` directly", 2019-01-18),
which was noticed by a random browsing of the code.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When preparing a manufactured dirent instance, we add a length of
path to the size of struct to decide how many bytes to allocate.
Make sure this addition does not wrap-around to cause us
underallocate.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both the DIR_SKIP_NESTED_GIT and DIR_NO_GITLINKS cases were checking for
whether a path was actually a nonbare repository. That code could be
shared, with just the result of how to act differing between the two
cases.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our optimization to avoid calling into read_directory_recursive() when
all pathspecs have a common leading directory mean that we need to match
the logic that read_directory_recursive() would use if we had just
called it from the root. Since it does more than call treat_path() we
need to copy that same logic.
Alternatively, we could try to change treat_path to return path_recurse
for an untracked directory under the given special circumstances that
this logic checks for, but a simple switch results in many test failures
such as 'git clean -d' not wiping out untracked but empty directories.
To work around that, we'd need the caller of treat_path to check for
path_recurse and sometimes special case it into path_untracked. In
other words, we'd still have extra logic in both places.
Needing to duplicate logic like this means it is guaranteed someone will
eventually need to make further changes and forget to update both
locations. It is tempting to just nuke the leading_directory special
casing to avoid such bugs and simplify the code, but unpack_trees'
verify_clean_subdirectory() also calls read_directory() and does so with
a non-empty leading path, so I'm hesitant to try to restructure further.
Add obnoxious warnings to treat_leading_path() and
read_directory_recursive() to try to warn people of such problems.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many years ago, the directory traversing logic had an optimization that
would always recurse into any directory that was a common prefix of all
the pathspecs without walking the leading directories to get down to
the desired directory. Thus,
git ls-files -o .git/ # case A
would notice that .git/ was a common prefix of all pathspecs (since
it is the only pathspec listed), and then traverse into it and start
showing unknown files under that directory. Unfortunately, .git/ is not
a directory we should be traversing into, which made this optimization
problematic. This also affected cases like
git ls-files -o --exclude-standard t/ # case B
where t/ was in the .gitignore file and thus isn't interesting and
shouldn't be recursed into. It also affected cases like
git ls-files -o --directory untracked_dir/ # case C
where untracked_dir/ is indeed untracked and thus interesting, but the
--directory flag means we only want to show the directory itself, not
recurse into it and start listing untracked files below it.
The case B class of bugs were noted and fixed in commits 16e2cfa909
("read_directory(): further split treat_path()", 2010-01-08) and
48ffef966c ("ls-files: fix overeager pathspec optimization",
2010-01-08), with the idea being that we first wanted to check whether
the common prefix was interesting. The former patch noted that
treat_path() couldn't be used when checking the common prefix because
treat_path() requires a dir_entry() and we haven't read any directories
at the point we are checking the common prefix. So, that patch split
treat_one_path() out of treat_path(). The latter patch then created a
new treat_leading_path() which duplicated by hand the bits of
treat_path() that couldn't be broken out and then called
treat_one_path() for the remainder. There were three problems with this
approach:
* The duplicated logic in treat_leading_path() accidentally missed the
check for special paths (such as is_dot_or_dotdot and matching
".git"), causing case A types of bugs to continue to be an issue.
* The treat_leading_path() logic assumed we should traverse into
anything where path_treatment was not path_none, i.e. it perpetuated
class C types of bugs.
* It meant we had split logic that needed to kept in sync, running the
risk that people introduced new inconsistencies (such as in commit
be8a84c526, which we reverted earlier in this series, or in commit
df5bcdf83a which we'll fix in a subsequent commit)
Fix most these problems by making treat_leading_path() not only loop
over each leading path component, but calling treat_path() directly on
each. To do so, we have to create a synthetic dir_entry, but that only
takes a few lines. Then, pay attention to the path_treatment result we
get from treat_path() and don't treat path_excluded, path_untracked, and
path_recurse all the same as path_recurse.
This leaves one remaining problem, the new inconsistency from commit
df5bcdf83a. That will be addressed in a subsequent commit.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In ea9882bfc4 (commit: disable status hints when writing to
COMMIT_EDITMSG, 2013-09-12) the intent was to disable status hints
when writing to COMMIT_EDITMSG, because giving the hints in the "git
status" like output in the commit message template are too late to
be useful (they say things like "'git add' to stage", but that is
only possible after aborting the current "git commit" session).
But there is one case that the hints can be useful: When the current
attempt to commit is rejected because no change is recorded in the
index. The message is given and "git commit" errors out, so the
hints can immediately be followed by the user. Teach the codepath
to honor the configuration variable.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test that includes an actual function line in the test file to
check if context is expanded to include the whole function, and add an
ignored change before function context to check if that one stays hidden
while the originally ignored change within function context is shown.
This differs from the existing test, which is concerned with the case
where there is no function line at all in the file (and we might look
past the beginning of the file).
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I forgot this in my previous patch `--pathspec-from-file` for
`git commit` [1]. When both `--pathspec-from-file` and `--all` were
specified, `--all` took precedence and `--pathspec-from-file` was
ignored. Before `--pathspec-from-file` was implemented, this case was
prevented by this check in `parse_and_validate_options()` :
die(_("paths '%s ...' with -a does not make sense"), argv[0]);
It is unfortunate that these two cases are disconnected. This came as
result of how the code was laid out before my patches, where `pathspec`
is parsed outside of `parse_and_validate_options()`. This branch is
already full of refactoring patches and I did not dare to go for another
one.
Fix by mirroring `die()` for `--pathspec-from-file` as well.
[1] Commit e440fc58 ("commit: support the --pathspec-from-file option" 2019-11-19)
Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t3434.3 was fixed by commit 917d0d6234 ("Merge branch
'js/rebase-r-safer-label'", 2019-12-05). t3434 did not exist in
js/rebase-r-safer-label, so could not have marked the test as fixed, and
it was probably not noticed that the merge fixed this test. Mark it as
fixed now.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 1d7297513d (notes: break set_display_notes() into smaller functions,
2019-12-11), we introduced a comment which had a couple of typos. In the
first typo, we referenced 'enable_default_display_notes' instead of
'enable_ref_display_notes'. In the second typo, we wrote "is a points to"
instead of "is a pointer to". Correct both of these typos.
Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
my_bisect_log3.txt was added by c9c4e2d5a2 (bisect: only check merge
bases when needed, 2008-08-22), but hasn't been used then and since.
Get rid of it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The file "out" was introduced by 13b57da833 (mingw: verify that paths
are not mistaken for remote nicknames, 2017-05-29), but has not actually
been used then and since. Get rid of it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The file "out" became unused with fd53b7ffd1 (merge-recursive: improve
add_cacheinfo error handling, 2018-04-19); get rid of it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This gives users a quick shortcut to close the window. But since the
window can also show commands in progress, closing the window on Escape
can give the perception that the command has been cancelled even though
it hasn't been. So, only enable this binding when the command is done.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Instead of using a pointer that points at a constant string,
just give name directly to the constant string; this way, we
do not have to allocate a pointer variable in addition to
the string we want to use.
Let's convert `need_bad_and_good_revision_warning` and
`need_bisect_start_warning` char pointers to char arrays.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test on "fast-import" used to get stuck when "fast-import" died
in the middle.
* sg/t9300-robustify:
t9300-fast-import: don't hang if background fast-import exits too early
t9300-fast-import: store the PID in a variable instead of pidfile
Test coverage update in preparation for further work on "git add -i".
* js/add-i-a-bit-more-tests:
apply --allow-overlap: fix a corner case
git add -p: use non-zero exit code when the diff generation failed
t3701: verify that the diff.algorithm config setting is handled
t3701: verify the shown messages when nothing can be added
t3701: add a test for the different `add -p` prompts
t3701: avoid depending on the TTY prerequisite
t3701: add a test for advanced split-hunk editing
Code clean-up.
* dl/range-diff-with-notes:
range-diff: clear `other_arg` at end of function
range-diff: mark pointers as const
t3206: fix incorrect test name
* hw/doc-in-header:
trace2: move doc to trace2.h
submodule-config: move doc to submodule-config.h
tree-walk: move doc to tree-walk.h
trace: move doc to trace.h
run-command: move doc to run-command.h
parse-options: add link to doc file in parse-options.h
credential: move doc to credential.h
argv-array: move doc to argv-array.h
cache: move doc to cache.h
sigchain: move doc to sigchain.h
pathspec: move doc to pathspec.h
revision: move doc to revision.h
attr: move doc to attr.h
refs: move doc to refs.h
remote: move doc to remote.h and refspec.h
sha1-array: move doc to sha1-array.h
merge: move doc to ll-merge.h
graph: move doc to graph.h and graph.c
dir: move doc to dir.h
diff: move doc to diff.h and diffcore.h
The "diff" machinery learned not to lose added/removed blank lines
in the context when --ignore-blank-lines and --function-context are
used at the same time.
* rs/xdiff-ignore-ws-w-func-context:
xdiff: unignore changes in function context
"git rebase" did not work well when format.useAutoBase
configuration variable is set, which has been corrected.
* dl/rebase-with-autobase:
rebase: fix format.useAutoBase breakage
format-patch: teach --no-base
t4014: use test_config()
format-patch: fix indentation
t3400: demonstrate failure with format.useAutoBase
In a repository with many packfiles, the cost of the procedure that
avoids registering the same packfile twice was unnecessarily high
by using an inefficient search algorithm, which has been corrected.
* cs/store-packfiles-in-hashmap:
packfile.c: speed up loading lots of packfiles
"git add -i" that is getting rewritten in C has been extended to
cover subcommands other than the "patch".
* js/builtin-add-i-cmds:
built-in add -i: offer the `quit` command
built-in add -i: re-implement the `diff` command
built-in add -i: implement the `patch` command
built-in add -i: re-implement `add-untracked` in C
built-in add -i: re-implement `revert` in C
built-in add -i: implement the `update` command
built-in add -i: prepare for multi-selection commands
built-in add -i: allow filtering the modified files list
add-interactive: make sure to release `rev.prune_data`
Avoid gmtime() and localtime() and prefer their reentrant
counterparts.
* dd/time-reentrancy:
mingw: use {gm,local}time_s as backend for {gm,local}time_r
archive-zip.c: switch to reentrant localtime_r
date.c: switch to reentrant {gm,local}time_r
Reduce unnecessary reading of state variables back from the disk
during sequencer operation.
* ag/sequencer-todo-updates:
sequencer: directly call pick_commits() from complete_action()
rebase: fill `squash_onto' in get_replay_opts()
sequencer: move the code writing total_nr on the disk to a new function
sequencer: update `done_nr' when skipping commands in a todo list
sequencer: update `total_nr' when adding an item to a todo list
When a user provides invalid parameters to git-p4, the program
reports the failure but does not provide the correct command syntax.
Add an exception handler to the command-line argument parser to display
the command's specific command line parameter syntax when an exception
is thrown. Rethrow the exception so the current behavior is retained.
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When prompting the user interactively for direction, the tests are
not forgiving of user input format.
For example, the first query asks for a yes/no response. If the user
enters the full word "yes" or "no" or enters a capital "Y" the test
will fail.
Create a new function, prompt(prompt_text) where
* prompt_text is the text prompt for the user
* returns a single character where valid return values are
found by inspecting prompt_text for single characters
surrounded by square brackets
This new function must prompt the user for input and sanitize it by
converting the response to a lower case string, trimming leading and
trailing spaces, and checking if the first character is in the list
of choices. If it is, return the first letter.
Change the current references to raw_input() to use this new function.
Since the method requires the returned text to be one of the available
choices, remove the loop from the calling code that handles response
verification.
Thanks-to: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Ben Keene <seraphire@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct unintentional duplication(s) of words, such as "the the",
and "can can" etc.
The changes are only applied to cases where it's fixing what is clearly
wrong or prone to misunderstanding, as suggested by the reviewers.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Denton Liu <liu.denton@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ryenus <ryenus@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When ebb7baf0 ("Makefile: add a hdr-check target", 2018-09-19)
implemented hdr-check target, it wanted to leave some header files
exempt from the stricter check the target implements, and added
GEN_HDRS macro.
This however is probably a bad move for two reasons:
- If we value the header cleanliness check, we eventually want to
teach our header generating scripts to produce clean headers.
Keeping the blanket "generated headers can be left as dirty as we
want" exception does not nudge us in the right direction.
- There is a list of generated header files, GENERATED_H, which is
used to keep track of dependencies. Presence of GEN_HDRS that is
too similarly named would confuse developers who are adding new
generated header files which list to add theirs.
- Even though unicode-width.h could be generated using a contrib/
script, as far as our build infrastructure is concerned, it is a
source file that is tracked in the source control system. Its
presence in GEN_HDRS list is doubly misleading.
Get rid of GEN_HDRS, which is used only once to list the headers we
do not run hdr-check test on, and instead explicitly list that the
ones, either tracked or generated, that we exempt from the test.
This allows GENERATED_H to be the sole "here are build artifact
header files that are expendable" list, so use it in the clean
target to $(RM) them.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch will make `git add -p` show "No changes." or "Only binary
files changed." in that case.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When displaying the only hunk in a file's diff, the prompt already
excludes the commands to navigate to the previous/next hunk.
Let's also let the `?` command show only the help lines corresponding to
the commands that are displayed in the prompt.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This command is actually very similar to the 'd' ("do not stage this
hunk or any of the later hunks in the file") command: it just does
something on top, namely leave the loop and return a value indicating
that we're quittin'.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch implements the hunk searching feature in the C version of
`git add -p`.
A test is added to verify that this behavior matches the one of the Perl
version of `git add -p`.
Note that this involves a change of behavior: the Perl version uses (of
course) the Perl flavor of regular expressions, while this patch uses
the regcomp()/regexec(), i.e. POSIX extended regular expressions. In
practice, this behavior change is unlikely to matter.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With this patch, it is now possible to see a summary of the available
hunks and to navigate between them (by number).
A test is added to verify that this behavior matches the one of the Perl
version of `git add -p`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like `git add --edit` allows the user to edit the diff before it is
being applied to the index, this feature allows the user to edit the
diff *hunk*.
Naturally, it gets a bit more complicated here because the result has
to play well with the remaining hunks of the overall diff. Therefore,
we have to do a loop in which we let the user edit the hunk, then test
whether the result would work, and if not, drop the edits and let the
user decide whether to try editing the hunk again.
Note: in contrast to the Perl version, we use the same diff
"coalescing" (i.e. merging overlapping hunks into a single one) also for
the check after editing, and we introduce a new flag for that purpose
that asks the `reassemble_patch()` function to pretend that all hunks
were selected for use.
This allows us to continue to run `git apply` *without* the
`--allow-overlap` option (unlike the Perl version), and it also fixes
two known breakages in `t3701-add-interactive.sh` (which we cannot mark
as resolved so far because the Perl script version is still the default
and continues to have those breakages).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helper supports the scenario where Git has a populated `strbuf` and
wants to let the user edit it interactively.
In `git add -p`, we will use this to allow interactive hunk editing: the
diff hunks are already in memory, but we need to write them out to a
file so that an editor can be launched, then read everything back once
the user is done editing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is considered "the right thing to do", according to 933e44d3a0
("add -p": work-around an old laziness that does not coalesce hunks,
2011-04-06).
Note: we cannot simply modify the hunks while merging them; Once we
implement hunk editing, we will call `reassemble_patch()` whenever a
hunk is edited, therefore we must not modify the hunks (because the user
might e.g. hit `K` and change their mind whether to stage the previous
hunk).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If this developer's workflow is any indication, then this is *the* most
useful feature of Git's interactive `add `command.
Note: once again, this is not a verbatim conversion from the Perl code
to C: the `hunk_splittable()` function, for example, essentially did all
the work of splitting the hunk, just to find out whether more than one
hunk would have been the result (and then tossed that result into the
trash). In C we instead count the number of resulting hunks (without
actually doing the work of splitting, but just counting the transitions
from non-context lines to context lines), and store that information
with the hunk, and we do that *while* parsing the diff in the first
place.
Another deviation: the built-in `git add -p` was designed with a single
strbuf holding the diff (and another one holding the colored diff, if
that one was asked for) in mind, and hunks essentially store just the
start and end offsets pointing into that strbuf. As a consequence, when
we split hunks, we now use a special mode where the hunk header is
generated dynamically, and only the rest of the hunk is stored using
such start/end offsets. This way, we also avoid the frequent
formatting/re-parsing of the hunk header of the Perl version.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like the Perl version, we now helpfully ask the user whether they
want to stage a mode change, or a deletion.
Note that we define the prompts in an array, in preparation for a later
patch that changes those prompts to yet different versions for `git
reset -p`, `git stash -p` and `git checkout -p` (which all call the `git
add -p` machinery to do the actual work).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This imitates the way the Perl version treats mode changes: it offers
the mode change up for the user to decide, as if it was a diff hunk.
In contrast to the Perl version, we make use of the fact that the mode
line is the first hunk, and explicitly strip out that line from the diff
header if that "hunk" was not selected to be applied, and skipping that
hunk while coalescing the diff. The Perl version plays some kind of diff
line lego instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This addresses the same problem as 24ab81ae4d (add-interactive: handle
deletion of empty files, 2009-10-27), although in a different way: we
not only stick the "deleted file" line into its own pseudo hunk, but
also the entire remainder (if any) of the same diff.
That way, we do not have to play any funny games with regards to
coalescing the diff after the user selected what (possibly pseudo-)hunks
to stage.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For simplicity, the initial implementation in C handled only a single
modified file. Now it handles an arbitrary number of files.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When skipping a hunk that adds a different number of lines than it
removes, we need to adjust the subsequent hunk headers of non-skipped
hunks: in pathological cases, the context is not enough to determine
precisely where the patch should be applied.
This problem was identified in 23fea4c240 (t3701: add failing test for
pathological context lines, 2018-03-01) and fixed in the Perl version in
fecc6f3a68 (add -p: adjust offsets of subsequent hunks when one is
skipped, 2018-03-01).
And this patch fixes it in the C version of `git add -p`.
In contrast to the Perl version, we try to keep the extra text on the
hunk header (which typically contains the signature of the function
whose code is changed in the hunk) intact.
Note: while the C version does not support staging mode changes at this
stage, we already prepare for this by simply skipping the hunk header if
both old and new offset is 0 (this cannot happen for regular hunks, and
we will use this as an indicator that we are looking at a special hunk).
Likewise, we already prepare for hunk splitting by handling the absence
of extra text in the hunk header gracefully: only the first split hunk
will have that text, the others will not (indicated by an empty extra
text start/end range). Preparing for hunk splitting already at this
stage avoids an indentation change of the entire hunk header-printing
block later, and is almost as easy to review as without that handling.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like the Perl version, we now generate two diffs if `color.diff` is
set: one with and one without color. Then we parse them in parallel and
record which hunks start at which offsets in both.
Note that this is a (slight) deviation from the way the Perl version did
it: we are no longer reading the output of `diff-files` line by line
(which is more natural for Perl than for C), but in one go, and parse
everything later, so we might just as well do it in synchrony.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code in `git-add--interactive.perl` that takes care of the `patch`
command can look quite intimidating. There are so many modes in which it
can be called, for example.
But for the `patch` command in `git add -i`, only one mode is relevant:
the `stage` mode. And we just implemented the beginnings of that mode in
C so far. So let's use it when `add.interactive.useBuiltin=true`.
Now, while the code in `add-patch.c` is far from reaching feature parity
with the code in `git-add--interactive.perl` (color is not implemented,
the diff algorithm cannot be configured, the colored diff cannot be
post-processed via `interactive.diffFilter`, many commands are
unimplemented yet, etc), hooking it all up with the part of `git add -i`
that is already converted to C makes it easier to test and develop it.
Note: at this stage, both the `add.interactive.useBuiltin` config
setting is still safely opt-in, and will probably be fore quite some
time, to allow for thorough testing "in the wild" without adversely
affecting existing users.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous steps, we re-implemented the main loop of `git add -i`
in C, and most of the commands.
Notably, we left out the actual functionality of `patch`, as the
relevant code makes up more than half of `git-add--interactive.perl`,
and is actually pretty independent of the rest of the commands.
With this commit, we start to tackle that `patch` part. For better
separation of concerns, we keep the code in a separate file,
`add-patch.c`. The new code is still guarded behind the
`add.interactive.useBuiltin` config setting, and for the moment,
it can only be called via `git add -p`.
The actual functionality follows the original implementation of
5cde71d64a (git-add --interactive, 2006-12-10), but not too closely
(for example, we use string offsets rather than copying strings around,
and after seeing whether the `k` and `j` commands are applicable, in the
C version we remember which previous/next hunk was undecided, and use it
rather than looking again when the user asked to jump).
As a further deviation from that commit, We also use a comma instead of
a slash to separate the available commands in the prompt, as the current
version of the Perl script does this, and we also add a line about the
question mark ("print help") to the help text.
While it is tempting to use this conversion of `git add -p` as an excuse
to work on `apply_all_patches()` so that it does _not_ want to read a
file from `stdin` or from a file, but accepts, say, an `strbuf` instead,
we will refrain from this particular rabbit hole at this stage.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The regex failed to compile on FreeBSD.
Also add /* -- */ mark to separate the two regex entries given to
the PATTERNS() macro, to make it consistent with patterns for other
content types.
Signed-off-by: Ed Maste <emaste@FreeBSD.org>
Reviewed-by: Jeff King <peff@peff.net>
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although Asciidoc allows to not indent following lines in a list item,
it is clearer and safer to follow the recommended rule.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Non ASCII characters may be handled by publishing chains, but right
now, nothing indicates the encoding of files. Moreover, non ASCII
source strings upset the localization toolchain.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a user uses the sparse-checkout feature in cone mode, they
add patterns using "git sparse-checkout set <dir1> <dir2> ..."
or by using "--stdin" to provide the directories line-by-line over
stdin. This behaviour naturally looks a lot like the way a user
would type "git add <dir1> <dir2> ..."
If core.ignoreCase is enabled, then "git add" will match the input
using a case-insensitive match. Do the same for the sparse-checkout
feature.
Perform case-insensitive checks while updating the skip-worktree
bits during unpack_trees(). This is done by changing the hash
algorithm and hashmap comparison methods to optionally use case-
insensitive methods.
When this is enabled, there is a small performance cost in the
hashing algorithm. To tease out the worst possible case, the
following was run on a repo with a deep directory structure:
git ls-tree -d -r --name-only HEAD |
git sparse-checkout set --stdin
The 'set' command was timed with core.ignoreCase disabled or
enabled. For the repo with a deep history, the numbers were
core.ignoreCase=false: 62s
core.ignoreCase=true: 74s (+19.3%)
For reproducibility, the equivalent test on the Linux kernel
repository had these numbers:
core.ignoreCase=false: 3.1s
core.ignoreCase=true: 3.6s (+16%)
Now, this is not an entirely fair comparison, as most users
will define their sparse cone using more shallow directories,
and the performance improvement from eb42feca97 ("unpack-trees:
hash less in cone mode" 2019-11-21) can remove most of the
hash cost. For a more realistic test, drop the "-r" from the
ls-tree command to store only the first-level directories.
In that case, the Linux kernel repository takes 0.2-0.25s in
each case, and the deep repository takes one second, plus or
minus 0.05s, in each case.
Thus, we _can_ demonstrate a cost to this change, but it is
unlikely to matter to any reasonable sparse-checkout cone.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 8164c961e1 (format-patch: use --notes behavior for format.notes,
2019-12-09), we introduced set_display_notes() which was a monolithic
function with three mutually exclusive branches. Break the function up
into three small and simple functions that each are only responsible for
one task.
This family of functions accepts an `int *show_notes` instead of
returning a value suitable for assignment to `show_notes`. This is for
two reasons. First of all, this guarantees that the external
`show_notes` variable changes in lockstep with the
`struct display_notes_opt`. Second, this prompts future developers to be
careful about doing something meaningful with this value. In fact, a
NULL check is intentionally omitted because causing a segfault here
would tell the future developer that they are meant to use the value for
something meaningful.
One alternative was making the family of functions accept a
`struct rev_info *` instead of the `struct display_notes_opt *`, since
the former contains the `show_notes` field as well. This does not work
because we have to call git_config() before repo_init_revisions().
However, if we had a `struct rev_info`, we'd need to initialize it before
it gets assigned values from git_config(). As a result, we break the
circular dependency by having standalone `int show_notes` and
`struct display_notes_opt notes_opt` variables which temporarily hold
values from git_config() until the values are copied over to `rev`.
To implement this change, we need to get a pointer to
`rev_info::show_notes`. Unfortunately, this is not possible with
bitfields and only direct-assignment is possible. Change
`rev_info::show_notes` to a non-bitfield int so that we can get its
address.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 8164c961e1 (format-patch: use --notes behavior for format.notes,
2019-12-09), we slightly tweaked the behavior of having multiple
`format.notes` configuration variables. We did not update the
documentation to reflect this, however.
Explictly state the behavior of having multiple `format.notes`
configuration variables so users are clear on what to expect.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Presently in the manpages git-submodule[1] links to gitsubmodules[7]
and gitmodules[5], gitsubmodules[7] links to git-submodule[1] and gitmodules[5],
but gitmodules[5] only link to git-submodule[1] (and git-config[1]).
Add a link to gitsubmodules[7] in gitmodules[5], so that a person
stumbling upon gitmodules[5] can quickly access gitsubmodules[7],
which has a more high-level overview of submodule usage.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_ref_full() wraps refs_read_ref_full(), which in turn wraps
refs_resolve_ref_unsafe(), which handles a NULL oid pointer of callers
not interested in the resolved object ID. Make use of that feature to
document that mv() is such a caller.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
refs_read_ref_full() wraps refs_resolve_ref_unsafe(), which handles a
NULL oid pointer of callers not interested in the resolved object ID.
Pass NULL from files_copy_or_rename_ref() to clarify that it is one
such caller.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
msgfile2 became unused with 3968658599 (Make builtin-tag.c use
parse_options., 2007-11-09), get rid of it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The file "stdout" has been created by the test script since its initial
(and so far only) version added by 3aa4d81f88 (mailinfo: support
format=flowed, 2018-08-25), but has never been used. Get rid of it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create an add_path_to_appropriate_result_list() function from the code
at the end of read_directory_recursive() so we can use it elsewhere.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The DO_MATCH_LEADING_PATHSPEC had a fall-through case for if there was a
wildcard, noting that we don't yet have enough information to determine
if a further paths under the current directory might match due to the
presence of wildcards. But if we have no wildcards in our pathspec,
then we shouldn't get to that fall-through case.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit be8a84c526 ("dir.c: make 'git-status --ignored' work within
leading directories", 2013-04-15) noted that
git status --ignored <SOMEPATH>
would not list ignored files and directories within <SOMEPATH> if
<SOMEPATH> was untracked, and modified the behavior to make it show
them. However, it did so via a hack that broke consistency; it would
show paths under <SOMEPATH> differently than a simple
git status --ignored | grep <SOMEPATH>
would show them. A correct fix is slightly more involved, and
complicated slightly by this hack, so we revert this commit (but keep
corrected versions of the testcases) and will later fix the original
bug with a subsequent patch.
Some history may be helpful:
A very, very similar case to the commit we are reverting was raised in
commit 48ffef966c ("ls-files: fix overeager pathspec optimization",
2010-01-08); but it actually went in somewhat the opposite direction. In
that commit, it mentioned how
git ls-files -o --exclude-standard t/
used to show untracked files under t/ even when t/ was ignored, and then
changed the behavior to stop showing untracked files under an ignored
directory. More importantly, this commit considered keeping this
behavior but noted that it would be inconsistent with the behavior when
multiple pathspecs were specified and thus rejected it.
The reason for this whole inconsistency when one pathspec is specified
versus zero or two is because common prefixes of pathspecs are sent
through a different set of checks (in treat_leading_path()) than normal
file/directory traversal (those go through read_directory_recursive()
and treat_path()). As such, for consistency, one needs to check that
both codepaths produce the same result.
Revert commit be8a84c526, except instead
of removing the testcase it added, modify it to check for correct and
consistent behavior. A subsequent patch in this series will fix the
testcase.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add several tests demonstrating directory traversal failures of various
sorts in dir.c (and one similar looking test that turns out to be a
git_fnmatch bug). A lot of these tests look like near duplicates of
each other, but an optimization path in dir.c to pre-descend into a
common prefix and the specialized treatment of trailing slashes in dir.c
mean the tiny differences are sometimes important and potentially cause
different codepaths to be explored.
Of the 7 failing tests, 2 are new to git-2.24.0 (tweaked by side effects
of the en/clean-nested-with-ignored-topic); the other 5 also failed
under git-2.23.0 and earlier.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git lfs" allows users to specify the custom storage location with
the configuration variable `lfs.storage`, but when interacting with
GitLFS pointers, "git p4" always uses the hardcoded default that is
the `.git/lfs/` directory, without paying attention to the
configuration.
Use the value configured in `lfs.storage`, if exists, as all the
"git" operations do, for consistency.
Signed-off-by: r.burenkov <panzercheg@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 5e82c3dd22 (bisect--helper: `bisect_reset` shell function in C,
2019-01-02), the `git bisect reset` subcommand was ported to C. When the
call to `git checkout` failed, an error message was reported to the
user.
However, this error message used the `strbuf` that had just been
released already. Let's switch that around: first use it, then release
it.
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix UTF-8 refnames not displaying properly because the encoding was not
set to UTF-8.
* kk/branch-name-encoding:
git gui: fix branch name encoding error
Hide lower-level verify_signed-buffer() API as a pure helper to
implement the public check_signature() function, in order to
encourage new callers to use the correct and more strict
validation.
* hi/gpg-use-check-signature:
gpg-interface: prefer check_signature() for GPG verification
One kind of progress messages were always given during commit-graph
generation, instead of following the "if it takes more than two
seconds, show progress" pattern, which has been corrected.
* ds/commit-graph-delay-gen-progress:
commit-graph: use start_delayed_progress()
progress: create GIT_PROGRESS_DELAY
The interaction between "git clone --recurse-submodules" and
alternate object store was ill-designed. The documentation and
code have been taught to make more clear recommendations when the
users see failures.
* jt/clone-recursesub-ref-advise:
submodule--helper: advise on fatal alternate error
Doc: explain submodule.alternateErrorStrategy
"git log" family learned "--pretty=reference" that gives the name
of a commit in the format that is often used to refer to it in log
messages.
* dl/pretty-reference:
SubmittingPatches: use `--pretty=reference`
pretty: implement 'reference' format
pretty: add struct cmt_fmt_map::default_date_mode_type
pretty: provide short date format
t4205: cover `git log --reflog -z` blindspot
pretty.c: inline initalize format_context
revision: make get_revision_mark() return const pointer
completion: complete `tformat:` pretty format
SubmittingPatches: remove dq from commit reference
pretty-formats.txt: use generic terms for hash
SubmittingPatches: use generic terms for hash
Work around a issue where a FD that is left open when spawning a
child process and is kept open in the child can interfere with the
operation in the parent process on Windows.
* js/mingw-inherit-only-std-handles:
mingw: forbid translating ERROR_SUCCESS to an errno value
mingw: do set `errno` correctly when trying to restrict handle inheritance
mingw: restrict file handle inheritance only on Windows 7 and later
mingw: spawned processes need to inherit only standard handles
mingw: work around incorrect standard handles
mingw: demonstrate that all file handles are inherited by child processes
A few commands learned to take the pathspec from the
standard input or a named file, instead of taking it as the command
line arguments.
* am/pathspec-from-file:
commit: support the --pathspec-from-file option
doc: commit: synchronize <pathspec> description
reset: support the `--pathspec-from-file` option
doc: reset: synchronize <pathspec> description
pathspec: add new function to parse file
parse-options.h: add new options `--pathspec-from-file`, `--pathspec-file-nul`
"git rebase -i" learned a few options that are known by "git
rebase" proper.
* ra/rebase-i-more-options:
rebase -i: finishing touches to --reset-author-date
rebase: add --reset-author-date
rebase -i: support --ignore-date
sequencer: rename amend_author to author_to_rename
rebase -i: support --committer-date-is-author-date
sequencer: allow callers of read_author_script() to ignore fields
rebase -i: add --ignore-whitespace flag
In 13cdf78094 (format-patch: teach format.notes config option,
2019-05-16), the order in which git_config() and repo_init_revisions()
were swapped so that `rev.notes_opt` would be initialized before
git_config() was called. This is problematic, however, as git_config()
should generally be called before repo_init_revisions().
Break this circular dependency by creating `show_notes` and `notes_opt`
which git_config() reads into. Then, copy these values over to
`rev.show_notes` and `rev.notes_opt` after repo_init_revisions() is
called.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we had multiple `format.notes` config values where we had `<ref1>`,
`false`, `<ref2>` (in that order), then we would print out the notes for
both `<ref1>` and `<ref2>`. This doesn't make sense, however, since we
parse the config in a top-down manner and a `false` should be able to
override previous configurations, just like how `--no-notes` will
override previous `--notes`.
Duplicate the logic that handles the `--[no-]notes[=]` option to
`format.notes` for consistency. As a result, when parsing the config
from top to bottom, `format.notes = true` will behave like `--notes`,
`format.notes = <ref>` will behave like `--notes=<ref>` and
`format.notes = false` will behave like `--no-notes`.
This change isn't strictly backwards compatible but since it is an edge
case where a sane user would not mix notes refs with `false` and this
feature is relatively new (released only in v2.23.0), this change should
be harmless.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of open coding the logic that tweaks the variables in
`struct display_notes_opt` within handle_revision_opt(), abstract away the
logic into set_display_notes() so that it can be reused.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently open code the initialization for revs->notes_opt. Abstract
this away into a helper function so that the logic can be reused in a
future commit.
This is slightly wasteful as we memset the struct twice but this is only
run once so it shouldn't have any major effect.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to the function comment, init_display_notes() was supposed to
"Load the notes machinery for displaying several notes trees." Rename
this function to load_display_notes() so that its use is more accurately
represented.
This is done because, in a future commit, we will reuse the name
init_display_notes().
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier patches in this series moved a couple of conditions from the
recursive name_rev() function into its caller name_ref(), for no other
reason than to make eliminating the recursion a bit easier to follow.
Since the previous patch name_rev() is not recursive anymore, so let's
move all those conditions back into name_rev().
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name_rev() function calls itself recursively for each interesting
parent of the commit it got as parameter, and, consequently, it can
segfault when processing a deep history if it exhausts the available
stack space. E.g. running 'git name-rev --all' and 'git name-rev
HEAD~100000' in the gcc, gecko-dev, llvm, and WebKit repositories
results in segfaults on my machine ('ulimit -s' reports 8192kB of
stack size limit), and nowadays the former segfaults in the Linux repo
as well (it reached the necessasry depth sometime between v5.3-rc4 and
-rc5).
Eliminate the recursion by inserting the interesting parents into a
LIFO 'prio_queue' [1] and iterating until the queue becomes empty.
Note that the parent commits must be added in reverse order to the
LIFO 'prio_queue', so their relative order is preserved during
processing, i.e. the first parent should come out first from the
queue, because otherwise performance greatly suffers on mergy
histories [2].
The stacksize-limited test 'name-rev works in a deep repo' in
't6120-describe.sh' demonstrated this issue and expected failure. Now
the recursion is gone, so flip it to expect success. Also gone are
the dmesg entries logging the segfault of that segfaulting 'git
name-rev' process on every execution of the test suite.
Note that this slightly changes the order of lines in the output of
'git name-rev --all', usually swapping two lines every 35 lines in
git.git or every 150 lines in linux.git. This shouldn't matter in
practice, because the output has always been unordered anyway.
This patch is best viewed with '--ignore-all-space'.
[1] Early versions of this patch used a 'commit_list', resulting in
~15% performance penalty for 'git name-rev --all' in 'linux.git',
presumably because of the memory allocation and release for each
insertion and removal. Using a LIFO 'prio_queue' has basically no
effect on performance.
[2] We prefer shorter names, i.e. 'v0.1~234' is preferred over
'v0.1^2~5', meaning that usually following the first parent of a
merge results in the best name for its ancestors. So when later
we follow the remaining parent(s) of a merge, and reach an already
named commit, then we usually find that we can't give that commit
a better name, and thus we don't have to visit any of its
ancestors again.
OTOH, if we were to follow the Nth parent of the merge first, then
the name of all its ancestors would include a corresponding '^N'.
Those are not the best names for those commits, so when later we
reach an already named commit following the first parent of that
merge, then we would have to update the name of that commit and
the names of all of its ancestors as well. Consequently, we would
have to visit many commits several times, resulting in a
significant slowdown.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Following the previous patches in this series we can get the value of
'name_rev()'s 'tip_name' parameter from the 'struct rev_name'
associated with the commit as well.
So let's use 'name->tip_name' instead, which makes the patch
eliminating the recursion of name_rev() a bit easier to follow.
Note that at this point we could drop the 'tip_name' parameter as
well, but that parameter will be necessary later, after the recursion
is eliminated.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an enumeration to assign names to the magic values that determine
the ZIP compression method to use. Use those names to improve code
readability.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After "git checkout -b '漢字'" to create a branch with UTF-8 character
in it, "git gui" shows the branch name incorrectly, as it forgets to
turn the bytes read from the "git for-each-ref" and read from "HEAD"
file into Unicode characters.
Signed-off-by: Kazuhiro Kato <kato-k@ksysllc.co.jp>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
This test case was added in 66ae9a57b8 (t3404: rebase -i: demonstrate
short SHA-1 collision, 2013-08-23), and it is not indented in the way we
usually indent sub-shell code in our test cases these days.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
flush_current_id() prints the hexadecimal representation of two object
IDs. When the code was added in f97672225b (Add "git-patch-id" program
to generate patch ID's., 2005-06-23), sha1_to_hex() had only a single
internal static buffer, so the result of one invocation had to be stored
in a local buffer.
Since dcb3450fd8 (sha1_to_hex() usage cleanup, 2006-05-03) it rotates
through four buffers, which allows to print up to four object IDs at the
same time. 1a876a69af (patch-id: convert to use struct object_id,
2015-03-13) replaced sha1_to_hex() with oid_to_hex(), which has the same
feature. Use it to simplify the code.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Publicize lore.kernel.org mailing list archive and use URLs
pointing into it to refer to notable messages in the documentation.
* dl/lore-is-the-archive:
doc: replace LKML link with lore.kernel.org
RelNotes: replace Gmane with real Message-IDs
doc: replace MARC links with lore.kernel.org
Doc update for the mailing list archiving and nntp service.
* jk/lore-is-the-archive:
doc: replace public-inbox links with lore.kernel.org
doc: recommend lore.kernel.org over public-inbox.org
PerfTest fix to avoid stale result mixed up with the latest round
of test results.
* tg/perf-remove-stale-result:
perf-lib: use a single filename for all measurement types
Performance tweak on "git push" into a repository with many refs
that point at objects we have never heard of.
* jk/send-pack-check-negative-with-quick:
send-pack: use OBJECT_INFO_QUICK to check negative objects
Code cleanup.
* rs/use-skip-prefix-more:
name-rev: use skip_prefix() instead of starts_with()
push: use skip_prefix() instead of starts_with()
shell: use skip_prefix() instead of starts_with()
fmt-merge-msg: use skip_prefix() instead of starts_with()
fetch: use skip_prefix() instead of starts_with()
Test cleanup.
* rs/test-cleanup:
t7811: don't create unused file
t9300: don't create unused file
test: use test_must_be_empty F instead of test_cmp empty F
test: use test_must_be_empty F instead of test -z $(cat F)
t1400: use test_must_be_empty
t1410: use test_line_count
t1512: use test_line_count
While running "revert" or "cherry-pick --edit" for multiple
commits, a recent regression incorrectly detected "nothing to
commit, working tree clean", instead of replaying the commits,
which has been corrected.
* sg/assume-no-todo-update-in-cherry-pick:
sequencer: don't re-read todo for revert and cherry-pick
Following the previous patches in this series we can get the values of
name_rev()'s 'generation' and 'distance' parameters from the 'stuct
rev_name' associated with the commit as well.
Let's simplify the function's signature and remove these two
unnecessary parameters.
Note that at this point we could do the same with the 'tip_name',
'taggerdate' and 'from_tag' parameters as well, but those parameters
will be necessary later, after the recursion is eliminated.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the beginning of the recursive name_rev() function it creates a new
'struct rev_name' instance for each previously unvisited commit or, if
this visit results in better name for an already visited commit, then
updates the 'struct rev_name' instance attached to the commit, or
returns early.
Restructure this so it's caller creates or updates the 'struct
rev_name' instance associated with the commit to be passed as
parameter, i.e. both name_ref() before calling name_rev() and
name_rev() itself as it iterates over the parent commits.
This makes eliminating the recursion a bit easier to follow, and the
condition moved to name_ref() will be moved back to name_rev() after
the recursion is eliminated.
This change also plugs the memory leak that was temporarily unplugged
in the earlier "name-rev: pull out deref handling from the recursion"
patch in this series.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the beginning of the recursive name_rev() function it parses the
commit it got as parameter, and returns early if the commit is older
than a cutoff limit.
Restructure this so the caller parses the commit and checks its date,
and doesn't invoke name_rev() if the commit to be passed as parameter
is older than the cutoff, i.e. both name_ref() before calling
name_rev() and name_rev() itself as it iterates over the parent
commits.
This makes eliminating the recursion a bit easier to follow, and the
condition moved to name_ref() will be moved back to name_rev() after
the recursion is eliminated.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'if (deref) { ... }' condition near the beginning of the recursive
name_rev() function can only ever be true in the first invocation,
because the 'deref' parameter is always 0 in the subsequent recursive
invocations.
Extract this condition from the recursion into name_rev()'s caller and
drop the function's 'deref' parameter. This makes eliminating the
recursion a bit easier to follow, and it will be moved back into
name_rev() after the recursion is eliminated.
Furthermore, drop the condition that die()s when both 'deref' and
'generation' are non-null (which should have been a BUG() to begin
with).
Note that this change reintroduces the memory leak that was plugged in
in commit 5308224633 (name-rev: avoid leaking memory in the `deref`
case, 2017-05-04), but a later patch (name-rev: restructure
creating/updating 'struct rev_name' instances) in this series will
plug it in again.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later patch in this series we'll want to do this in two places.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 'builtin/name-rev.c' in the name_rev() function there is a loop
iterating over all parents of the given commit, and the loop body
looks like this:
if (parent_number > 1) {
if (generation > 0)
// branch #1
new_name = ...
else
// branch #2
new_name = ...
name_rev(parent, new_name, ...);
} else {
// branch #3
name_rev(...);
}
These conditions are not covered properly in the test suite. As far
as purely test coverage goes, they are all executed several times over
in 't6120-describe.sh'. However, they don't directly influence the
command's output, because the repository used in that test script
contains several branches and tags pointing somewhere into the middle
of the commit DAG, and thus result in a better name for the
to-be-named commit. This can hide bugs: e.g. by replacing the
'new_name' parameter of the first recursive name_rev() call with
'tip_name' (effectively making both branch #1 and #2 a noop) 'git
name-rev --all' shows thousands of bogus names in the Git repository,
but the whole test suite still passes successfully. In an early
version of a later patch in this series I managed to mess up all three
branches (at once!), but the test suite still passed.
So add a new test case that operates on the following history:
A--------------master
\ /
\----------M2
\ /
\---M1-C
\ /
B
and names the commit 'B' to make sure that all three branches are
crucial to determine 'B's name:
- There is only a single ref, so all names are based on 'master',
without any undesired interference from other refs.
- Each time name_rev() follows the second parent of a merge commit,
it appends "^2" to the name. Following 'master's second parent
right at the start ensures that all commits on the ancestry path
from 'master' to 'B' have a different base name from the original
'tip_name' of the very first name_rev() invocation. Currently,
while name_rev() is recursive, it doesn't matter, but it will be
necessary to properly cover all three branches after the recursion
is eliminated later in this series.
- Following 'M2's second parent makes sure that branch #2 (i.e. when
'generation = 0') affects 'B's name.
- Following the only parent of the non-merge commit 'C' ensures that
branch #3 affects 'B's name, and that it increments 'generation'.
- Coming from 'C' 'generation' is 1, thus following 'M1's second
parent makes sure that branch #1 affects 'B's name.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Casting a 'struct object' to 'struct commit' is unnecessary there,
because it's already available in the local 'commit' variable.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_name_rev() basically open-codes strip_suffix() before adding a
string to a strbuf.
Let's use the strbuf right from the beginning, i.e. add the whole
string to the strbuf and then use strbuf_strip_suffix(), making the
code more idiomatic.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'check_describe' helper function runs 'git describe' outside of
'test_expect_success' blocks, with extra hand-rolled code to record
and examine its exit code.
Update this helper and move the 'git describe' invocation inside the
'test_expect_success' block.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We were leaking memory by not clearing `other_arg` after we were done
using it. Clear it after we've finished using it.
Note that this isn't strictly necessary since the memory will be
reclaimed once the command exits. However, since we are releasing the
strbufs, we should also clear `other_arg` for consistency.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The contents pointed to by `diffopt` and `other_arg` should not be
modified. Mark these as `const` to indicate this.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name of the test used to indicate that it was testing the `--notes`
option but it was really testing the `format.notes` configuration.
Correct the test name to reflect this.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The five tests checking 'git fast-import's checkpoint handling in
't9300-fast-import.sh', all with the prefix "V:" in their test
description, can hang indefinitely if 'git fast-import' unexpectedly
dies early in any of these tests.
These five tests run 'git fast-import' in the background, while
feeding instructions to its standard input through a fifo (fd 8) from
a background subshell, and reading and verifying its standard output
through another fifo (fd 9) in the test script's main shell process.
This "reading and verifying" is basically a 'while read ...' shell
loop iterating until 'git fast-import' outputs the expected line,
ignoring any other output. This doesn't work very well when 'git
fast-import' dies before printing that particular line, because the
'read' builtin doesn't get EOF after the death of 'git fast-import',
as their input and output are not connected directly but through a
fifo. Consequently, that 'read' hangs waiting for the next line from
the already dead 'git fast-import', leaving the test script and in
turn the whole test suite hanging.
Avoid this hang by checking whether the background 'git fast-import'
process exited unexpectedly early, and interrupt the 'while read' loop
if it did. We have to jump through some hoops to achive that, though:
- Start the background 'git fast-import' in another background
subshell, which then:
- prints the PID of that 'git fast-import' process to the fifo,
to be read by the main shell process, so it will know which
process to kill when the test is finished.
- waits until that 'git fast-import' process exits. If it does
exit, then report its exit code, and write a message to the
fifo used for 'git fast-import's standard output, thus
un-block the 'read' builtin in the main shell process.
- Modify that 'while read' loop to break the loop upon seeing that
message, and fail the test in the usual way.
- Once the test is finished kill that background subshell as well,
and do so before killing the background 'git fast-import'.
Otherwise the background 'git fast-import' and subshell processes
would die racily, and if 'git fast-import' were to die sooner,
then we might get some undesired and potentially confusing
messages in the test's output.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The five tests running 'git fast-import' in the background in
't9300-fast-import.sh' store the PID of that background process in a
pidfile, to be used to check whether that background process survived
each test and then to kill it in test_when_finished commands. To
achieve this all these five tests run three $(cat <pidfile>) command
substitutions each.
Store the PID of the background 'git fast-import' in a variable to
avoid those extra processes.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In dcb500dc16 (cherry-pick/revert: advise using --skip, 2019-07-02),
`git commit` learned to suggest to run `git cherry-pick --skip` when
trying to cherry-pick an empty patch.
However, it was overlooked that there are more conditions than just a
`git cherry-pick` when this advice is printed (which originally
suggested the neutral `git reset`): the same can happen during a rebase.
Let's suggest the correct command, even during a rebase.
While at it, we adjust more places in `builtin/commit.c` that
incorrectly assumed that the presence of a `CHERRY_PICK_HEAD` meant that
surely this must be a `cherry-pick` in progress.
Note: we take pains to handle the situation when a user runs a `git
cherry-pick` _during_ a rebase. This is quite valid (e.g. in an `exec`
line in an interactive rebase). On the other hand, it is not possible to
run a rebase during a cherry-pick, meaning: if both `rebase-merge/` and
`sequencer/` exist or CHERRY_PICK_HEAD and REBASE_HEAD point to the same
commit , we still want to advise to use `git cherry-pick --skip`.
Original-patch-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Working out which command wants to create a commit requires detailed
knowledge of the sequencer internals and that knowledge is going to
increase in subsequent commits. With that in mind lets encapsulate that
knowledge in sequencer.c rather than spreading it into builtin/commit.c.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add FROM_CHERRY_PICK_MULTI for a sequence of cherry-picks rather than
using a separate variable.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git commit` relies on the presence of CHERRY_PICK_HEAD to show the
correct error message in the case of an empty pick. This fixes a
regression introduced by the conversion from shell to C. In the shell
version everything was a cherry-pick as far as the sequencer code was
concerned so it always wrote CHERRY_PICK_HEAD. The conversion to C
forgot to update the code that creates CHERRY_PICK_HEAD. We do not want
to create CHERRY_PICK_HEAD for fixup and squash commands as that would
prevent `git commit --amend` from running.
Note that the error message shown by `git commit` for an empty pick
during a rebase is currently wrong as it talks about running `git
cherry-pick --skip` rather than `git rebase --skip`. This will be fixed
in a future commit which is why the tests are in t3403-rebase-skip.sh.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We disallow partial commits and amending when CHERRY_PICK_HEAD
exists. Add a couple of tests to check the error message printed in each
case before we refactor the code responsible for this.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In dcb500dc16 (cherry-pick/revert: advise using --skip, 2019-07-02),
`git commit` learned to suggest to run `git cherry-pick --skip` when
trying to cherry-pick an empty patch, but that was never tested for.
Here is a test that verifies that a message is given to the user that
contains the correct invocation.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a number of places where we compare two revisions with
test $(git rev-parse rev1) = $(git rev-parse rev2)
when these fail there's no indication what has gone wrong and you need
to be running with `-x` to see where the test has failed. Lets use
test_cmp_rev instead.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Yes, yes, this is supposed to be only a band-aid option for `git add -p`
not Doing The Right Thing. But as long as we carry the `--allow-overlap`
option, we might just as well get it right.
This fixes the case where one hunk inserts a line before the first line,
and is followed by a hunk whose context overlaps with the first one's
and which appends a line at the end.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first thing `git add -p` does is to generate a diff. If this diff
cannot be generated, `git add -p` should not continue as if nothing
happened, but instead fail.
What we *actually* do here is much broader: we now verify for *every*
`run_cmd_pipe()` call that the spawned process actually succeeded.
Note that we have to change two callers in this patch, as we need to
store the spawned process' output in a local variable, which means that
the callers can no longer decide whether to interpret the `return <$fh>`
in array or in scalar context.
This bug was noticed while writing a test case for the diff.algorithm
feature, and we let that test case double as a regression test for this
fixed bug, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this patch, there is actually no test in Git's test suite that
covers the diff.algorithm feature. Let's add one.
We do this by passing a bogus value and then expecting `git diff-files`
to produce the appropriate error message.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for re-implementing `git add -p` in pure C (where we will
purposefully keep the implementation of `git add -p` separate from the
implementation of `git add -i`), let's verify that the user is told the
same things as in the Perl version when the diff file is either empty or
contains only entries about binary files.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `git add -p` command offers different prompts for regular diff hunks
vs mode change pseudo hunks vs diffs deleting files.
Let's cover this in the regresion test suite, in preparation for
re-implementing `git add -p` in C.
For the mode change prompt, we use a trick that lets this test case pass
even on systems without executable bit, i.e. where `core.filemode =
false` (such as Windows): we first add the file to the index with `git
add --chmod=+x`, and then call `git add -p` with `core.filemode` forced
to `true`. The file on disk has no executable bit set, therefore we will
see a mode change.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The TTY prerequisite is a rather heavy one: it not only requires Perl to
work, but also the IO/Pty.pm module (with native support, and it
requires pseudo terminals, too).
In particular, test cases marked with the TTY prerequisite would be
skipped in Git for Windows' SDK.
In the case of `git add -p`, we do not actually need that big a hammer,
as we do not want to test any functionality that requires a pseudo
terminal; all we want is for the interactive add command to use color,
even when being called from within the test suite.
And we found exactly such a trick earlier already: when we added a test
case to verify that the main loop of `git add -i` is colored
appropriately. Let's use that trick instead of the TTY prerequisite.
While at it, we avoid the pipes, as we do not want a SIGPIPE to break
the regression test cases (which will be much more likely when we do not
run everything through Perl because that is inherently slower).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this developer's workflows, it often happens that a hunk needs to be
edited in a way that adds lines, and sometimes even reduces the number
of context lines.
Let's add a regression test for this.
Note that just like the preceding test case, the new test case is *not*
handled gracefully by the current `git add -p`. It will be handled
correctly by the upcoming built-in `git add -p`, though.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-2.23: (44 commits)
Git 2.23.1
Git 2.22.2
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
...
* maint-2.22: (43 commits)
Git 2.22.2
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
...
* maint-2.21: (42 commits)
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
...
These patches fix several bugs in quoting arguments when spawning shell
scripts on Windows.
Note: these bugs are Windows-only, as we have to construct a command
line for the process-to-spawn, unlike Linux/macOS, where `execv()`
accepts an already-split command line.
Furthermore, these fixes were not included in the CVE-2019-1350 part of
v2.14.6 because the Windows-specific quoting when spawning shell scripts
was contributed from Git for Windows into Git only in the v2.21.x era.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Previously, we failed to quote characters such as '*', '(' and the
likes. Let's fix this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This reverts the work-around that was introduced just for the v2.20.x
release train in "t7415: adjust test for dubiously-nested submodule
gitdirs for v2.20.x"; It is not necessary for v2.21.x.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When constructing command-lines to spawn processes, it is an unfortunate
but necessary decision to quote arguments differently: MSYS2 has
different dequoting rules (inherited from Cygwin) than the rest of
Windows.
To accommodate that, Git's Windows compatibility layer has two separate
quoting helpers, one for MSYS2 (which it uses exclusively when spawning
`sh`) and the other for regular Windows executables.
The MSYS2 one had an unfortunate bug where a `,` somehow slipped in,
instead of the `;`. As a consequence, empty arguments would not be
enclosed in a pair of double quotes, but the closing double quote was
skipped.
Let's fix this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
At the point where `mingw_spawn_fd()` is called, we already have a full
path to the script interpreter in that scenario, and we pass it in as
the executable to run, while the `argv` reflect what the script should
receive as command-line.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
* maint-2.20: (36 commits)
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
...
In v2.15.4, we started to reject `submodule.update` settings in
`.gitmodules`. Let's raise a BUG if it somehow still made it through
from anywhere but the Git config.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
In v2.20.x, Git clones submodules recursively by first creating the
submodules' gitdirs and _then_ "updating" the submodules. This can lead
to the situation where the clone path is taken because the directory
(while it exists already) is not a git directory, but then the clone
fails because that gitdir is unexpectedly already a directory.
This _also_ works around the vulnerability that was fixed in "Disallow
dubiously-nested submodule git directories", but it produces a different
error message than the one expected by the test case, therefore we
adjust the test case accordingly.
Note: as the two submodules "race each other", there are actually two
possible error messages, therefore we have to teach the test case to
expect _two_ possible (and good) outcomes in addition to the one it
expected before.
Note: this workaround is only necessary for the v2.20.x release train;
The behavior changed again in v2.21.x so that the original test case's
expectations are met again.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
* maint-2.19: (34 commits)
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
...
* maint-2.18: (33 commits)
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
...
* maint-2.17: (32 commits)
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
...
This allows hosting providers to detect whether they are being used
to attack users using malicious 'update = !command' settings in
.gitmodules.
Since ac1fbbda20 (submodule: do not copy unknown update mode from
.gitmodules, 2013-12-02), in normal cases such settings have been
treated as 'update = none', so forbidding them should not produce any
collateral damage to legitimate uses. A quick search does not reveal
any repositories making use of this construct, either.
Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
* maint-2.16: (31 commits)
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
...
This is a companion patch to 'mingw: handle `subst`-ed "DOS drives"':
use the DOS drive prefix handling that is already provided by
`compat/mingw.c` (and which just learned to handle non-alphabetical
"drive letters").
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
* maint-2.15: (29 commits)
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
clone --recurse-submodules: prevent name squatting on Windows
is_ntfs_dotgit(): only verify the leading segment
...
Since ac1fbbda20 (submodule: do not copy unknown update mode from
.gitmodules, 2013-12-02), Git has been careful to avoid copying
[submodule "foo"]
update = !run an arbitrary scary command
from .gitmodules to a repository's local config, copying in the
setting 'update = none' instead. The gitmodules(5) manpage documents
the intention:
The !command form is intentionally ignored here for security
reasons
Unfortunately, starting with v2.20.0-rc0 (which integrated ee69b2a9
(submodule--helper: introduce new update-module-mode helper,
2018-08-13, first released in v2.20.0-rc0)), there are scenarios where
we *don't* ignore it: if the config store contains no
submodule.foo.update setting, the submodule-config API falls back to
reading .gitmodules and the repository-supplied !command gets run
after all.
This was part of a general change over time in submodule support to
read more directly from .gitmodules, since unlike .git/config it
allows a project to change values between branches and over time
(while still allowing .git/config to override things). But it was
never intended to apply to this kind of dangerous configuration.
The behavior change was not advertised in ee69b2a9's commit message
and was missed in review.
Let's take the opportunity to make the protection more robust, even in
Git versions that are technically not affected: instead of quietly
converting 'update = !command' to 'update = none', noisily treat it as
an error. Allowing the setting but treating it as meaning something
else was just confusing; users are better served by seeing the error
sooner. Forbidding the construct makes the semantics simpler and
means we can check for it in fsck (in a separate patch).
As a result, the submodule-config API cannot read this value from
.gitmodules under any circumstance, and we can declare with confidence
For security reasons, the '!command' form is not accepted
here.
Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
* maint-2.14: (28 commits)
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
clone --recurse-submodules: prevent name squatting on Windows
is_ntfs_dotgit(): only verify the leading segment
test-path-utils: offer to run a protectNTFS/protectHFS benchmark
...
Users of oneway_merge() (like "reset --hard") learned to take
advantage of fsmonitor to avoid unnecessary lstat(2) calls.
* us/unpack-trees-fsmonitor:
unpack-trees: skip stat on fsmonitor-valid files
Recently we have declared that GIT_TEST_* variables take the
usual boolean values (it used to be that some used "non-empty
means true" and taking GIT_TEST_VAR=YesPlease as true); make
sure we notice and fail when non-bool strings are given to
these variables.
* sg/test-bool-env:
t5608-clone-2gb.sh: turn GIT_TEST_CLONE_2GB into a bool
tests: add 'test_bool_env' to catch non-bool GIT_TEST_* values
The revision walking machinery uses resources like per-object flag
bits that need to be reset before a new iteration of walking
begins, but the resources related to topological walk were not
cleared correctly, which has been corrected.
* mh/clear-topo-walk-upon-reset:
revision: free topo_walk_info before creating a new one in init_topo_walk
revision: clear the topo-walk flags in reset_revision_walk
We have had compatibility fallback macro definitions for "PRIuMAX",
"PRIu32", etc. but did not for "PRIdMAX", while the code used the
last one apparently without any hiccup reported recently. The
fallback macro definitions for these <inttypes.h> macros that must
appear in C99 systems have been removed.
* hv/assume-priumax-is-available-anywhere:
git-compat-util.h: drop the `PRIuMAX` and other fallback definitions
"git submodule status" that is run from a subdirectory of the
superproject did not work well, which has been corrected.
* mg/submodule-status-from-a-subdirectory:
submodule: fix 'submodule status' when called from a subdirectory
Test cleanup.
* dl/t5520-cleanup:
t5520: replace `! git` with `test_must_fail git`
t5520: remove redundant lines in test cases
t5520: replace $(cat ...) comparison with test_cmp
t5520: don't put git in upstream of pipe
t5520: test single-line files by git with test_cmp
t5520: use test_cmp_rev where possible
t5520: replace test -{n,z} with test-lib functions
t5520: use test_line_count where possible
t5520: remove spaces after redirect operator
t5520: replace test -f with test-lib functions
t5520: let sed open its own input
t5520: use sq for test case names
t5520: improve test style
t: teach test_cmp_rev to accept ! for not-equals
t0000: test multiple local assignment
"git reset --patch $object" without any pathspec should allow a
tree object to be given, but incorrectly required a committish,
which has been corrected.
* nl/reset-patch-takes-a-tree:
reset: parse rev as tree-ish in patch mode
"git submodule status" and "git submodule status --cached" show
different things, but the documentation did not cover them
correctly, which has been corrected.
* mg/doc-submodule-status-cached:
doc: document 'git submodule status --cached'
The code to parse GPG output used to assume incorrectly that the
finterprint for the primary key would always be present for a valid
signature, which has been corrected.
* hi/gpg-optional-pkfp-fix:
gpg-interface: limit search for primary key fingerprint
gpg-interface: refactor the free-and-xmemdupz pattern
The sequencer machinery compared the HEAD and the state it is
attempting to commit to decide if the result would be a no-op
commit, even when amending a commit, which was incorrect, and
has been corrected.
* pw/sequencer-compare-with-right-parent-to-check-empty-commits:
sequencer: fix empty commit check when amending
"git rev-parse --show-toplevel" run outside of any working tree did
not error out, which has been corrected.
* jk/fail-show-toplevel-outside-working-tree:
rev-parse: make --show-toplevel without a worktree an error
"git unpack-objects" used to show progress based only on the number
of received and unpacked objects, which stalled when it has to
handle an unusually large object. It now shows the throughput as
well.
* sg/unpack-progress-throughput:
builtin/unpack-objects.c: show throughput progress
CI jobs for macOS has been made less chatty when updating perforce
package used during testing.
* jc/azure-ci-osx-fix-fix:
ci(osx): update homebrew-cask repository with less noise
"git range-diff" learned to take the "--notes=<ref>" and the
"--no-notes" options to control the commit notes included in the
log message that gets compared.
* dl/range-diff-with-notes:
format-patch: pass notes configuration to range-diff
range-diff: pass through --notes to `git log`
range-diff: output `## Notes ##` header
t3206: range-diff compares logs with commit notes
t3206: s/expected/expect/
t3206: disable parameter substitution in heredoc
t3206: remove spaces after redirect operators
pretty-options.txt: --notes accepts a ref instead of treeish
rev-list-options.txt: remove reference to --show-notes
argv-array: add space after `while`
The userdiff machinery has been taught that "async def" is another
way to begin a "function" in Python.
* jh/userdiff-python-async:
userdiff: support Python async functions
The logic to avoid duplicate label names generated by "git rebase
--rebase-merges" forgot that the machinery itself uses "onto" as a
label name, which must be avoided by auto-generated labels, which
has been corrected.
* dd/rebase-merge-reserves-onto-label:
sequencer: handle rebase-merges for "onto" message
The beginning of rewriting "git add -i" in C.
* js/builtin-add-i:
built-in add -i: implement the `help` command
built-in add -i: use color in the main loop
built-in add -i: support `?` (prompt help)
built-in add -i: show unique prefixes of the commands
built-in add -i: implement the main loop
built-in add -i: color the header in the `status` command
built-in add -i: implement the `status` command
diff: export diffstat interface
Start to implement a built-in version of `git add --interactive`
A label used in the todo list that are generated by "git rebase
--rebase-merges" is used as a part of a refname; the logic to come
up with the label has been tightened to avoid names that cannot be
used as such.
* js/rebase-r-safer-label:
rebase -r: let `label` generate safer labels
rebase-merges: move labels' whitespace mangling into `label_oid()`
git-gui learned to delete untracked files when the "Revert Changes"
option is selected. Since there are two types of revert operations (one
for tracked files and one for untracked ones), the "checkout" and
"deletion" operations are done in parallel. The status bar is updated
to allow both to use it in parallel.
* jg/revert-untracked:
git-gui: revert untracked files by deleting them
git-gui: update status bar to track operations
git-gui: consolidate naming conventions
Update the revert_helper proc to check for untracked files as well as
changes, and then handle changes to be reverted and untracked files with
independent blocks of code. Prompt the user independently for untracked
files, since the underlying action is fundamentally different (rm -f).
If after deleting untracked files, the directory containing them becomes
empty, then remove the directory as well. Migrate unlocking of the index
out of _close_updateindex to a responsibility of the caller, to permit
paths that don't directly unlock the index, and refactor the error
handling added in d4e890e5 so that callers can make flow control
decisions in the event of errors. Update Tcl/Tk dependency from 8.4 to
8.6 in git-gui.sh.
A new proc delete_files takes care of actually deleting the files in
batches, using the Tcler's Wiki recommended approach for keeping the UI
responsive.
Since the checkout_index and delete_files calls are both asynchronous
and could potentially complete in any order, a "chord" is used to
coordinate unlocking the index and returning the UI to a usable state
only after both operations are complete. The `SimpleChord` class,
based on TclOO (Tcl/Tk 8.6), is added in this commit.
Signed-off-by: Jonathan Gilbert <JonathanG@iQmetrix.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Update the status bar to track updates as individual "operations" that
can overlap. Update all call sites to interact with the new status bar
mechanism. Update initialization to explicitly clear status text,
since otherwise it may persist across future operations.
Signed-off-by: Jonathan Gilbert <JonathanG@iQmetrix.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
A few variables in this file use camelCase, while the overall standard
is snake_case. A consistent naming scheme will improve readability of
future changes. To avoid mixing naming changes with semantic changes,
this commit contains only naming changes.
Signed-off-by: Jonathan Gilbert <JonathanG@iQmetrix.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Changes involving only blank lines are hidden with --ignore-blank-lines,
unless they appear in the context lines of other changes. This is
handled by xdl_get_hunk() for context added by --inter-hunk-context, -u
and -U.
Function context for -W and --function-context added by xdl_emit_diff()
doesn't pay attention to such ignored changes; it relies fully on
xdl_get_hunk() and shows just the post-image of ignored changes
appearing in function context. That's inconsistent and confusing.
Improve the result of using --ignore-blank-lines and --function-context
together by fully showing ignored changes if they happen to fall within
function context.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the only permitted drive letters for physical drives on Windows
are letters of the US-English alphabet, this restriction does not apply
to virtual drives assigned via `subst <letter>: <path>`.
To prevent targeted attacks against systems where "funny" drive letters
such as `1` or `!` are assigned, let's handle them as regular drive
letters on Windows.
This fixes CVE-2019-1351.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, filenames cannot have trailing spaces or periods, when
opening such paths, they are stripped automatically. Read: you can open
the file `README` via the file name `README . . .`. This ambiguity can
be used in combination with other security bugs to cause e.g. remote
code execution during recursive clones. This patch series fixes that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch fixes a vulnerability in the Windows-specific code where a
submodule names ending in a backslash were quoted incorrectly, and that
bug could be abused to insert command-line parameters e.g. to `ssh` in a
recursive clone.
Note: this bug is Windows-only, as we have to construct a command line
for the process-to-spawn, unlike Linux/macOS, where `execv()` accepts an
already-split command line.
While at it, other quoting issues are fixed as well.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Recursive clones are currently affected by a vulnerability that is
caused by too-lax validation of submodule names.
This topic branch fixes that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch series makes it safe to use Git on Windows drives, even if
running on a mounted network share or within the Windows Subsystem for
Linux (WSL).
This topic branch addresses CVE-2019-1353.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Over a decade ago, in 25fe217b86 (Windows: Treat Windows style path
names., 2008-03-05), Git was taught to handle absolute Windows paths,
i.e. paths that start with a drive letter and a colon.
Unbeknownst to us, while drive letters of physical drives are limited to
letters of the English alphabet, there is a way to assign virtual drive
letters to arbitrary directories, via the `subst` command, which is
_not_ limited to English letters.
It is therefore possible to have absolute Windows paths of the form
`1:\what\the\hex.txt`. Even "better": pretty much arbitrary Unicode
letters can also be used, e.g. `ä:\tschibät.sch`.
While it can be sensibly argued that users who set up such funny drive
letters really seek adverse consequences, the Windows Operating System
is known to be a platform where many users are at the mercy of
administrators who have their very own idea of what constitutes a
reasonable setup.
Therefore, let's just make sure that such funny paths are still
considered absolute paths by Git, on Windows.
In addition to Unicode characters, pretty much any character is a valid
drive letter, as far as `subst` is concerned, even `:` and `"` or even a
space character. While it is probably the opposite of smart to use them,
let's safeguard `is_dos_drive_prefix()` against all of them.
Note: `[::1]:repo` is a valid URL, but not a valid path on Windows.
As `[` is now considered a valid drive letter, we need to be very
careful to avoid misinterpreting such a string as valid local path in
`url_is_local_not_ssh()`. To do that, we use the just-introduced
function `is_valid_path()` (which will label the string as invalid file
name because of the colon characters).
This fixes CVE-2019-1351.
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch series plugs an attack vector we had overlooked in our
December 2014 work on `core.protectNTFS`.
Essentially, the path `.git::$INDEX_ALLOCATION/config` is interpreted as
`.git/config` when NTFS Alternate Data Streams are available (which they
are on Windows, and at least on network shares that are SMB-mounted on
macOS).
Needless to say: we don't want that.
In fact, we want to stay on the very safe side and not even special-case
the `$INDEX_ALLOCATION` stream type: let's just prevent Git from
touching _any_ explicitly specified Alternate Data Stream of `.git`.
In essence, we'll prevent Git from tracking, or writing to, any path
with a segment of the form `.git:<anything>`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When creating a directory on Windows whose path ends in a space or a
period (or chains thereof), the Win32 API "helpfully" trims those. For
example, `mkdir("abc ");` will return success, but actually create a
directory called `abc` instead.
This stems back to the DOS days, when all file names had exactly 8
characters plus exactly 3 characters for the file extension, and the
only way to have shorter names was by padding with spaces.
Sadly, this "helpful" behavior is a bit inconsistent: after a successful
`mkdir("abc ");`, a `mkdir("abc /def")` will actually _fail_ (because
the directory `abc ` does not actually exist).
Even if it would work, we now have a serious problem because a Git
repository could contain directories `abc` and `abc `, and on Windows,
they would be "merged" unintentionally.
As these paths are illegal on Windows, anyway, let's disallow any
accesses to such paths on that Operating System.
For practical reasons, this behavior is still guarded by the
config setting `core.protectNTFS`: it is possible (and at least two
regression tests make use of it) to create commits without involving the
worktree. In such a scenario, it is of course possible -- even on
Windows -- to create such file names.
Among other consequences, this patch disallows submodules' paths to end
in spaces on Windows (which would formerly have confused Git enough to
try to write into incorrect paths, anyway).
While this patch does not fix a vulnerability on its own, it prevents an
attack vector that was exploited in demonstrations of a number of
recently-fixed security bugs.
The regression test added to `t/t7417-submodule-path-url.sh` reflects
that attack vector.
Note that we have to adjust the test case "prevent git~1 squatting on
Windows" in `t/t7415-submodule-names.sh` because of a very subtle issue.
It tries to clone two submodules whose names differ only in a trailing
period character, and as a consequence their git directories differ in
the same way. Previously, when Git tried to clone the second submodule,
it thought that the git directory already existed (because on Windows,
when you create a directory with the name `b.` it actually creates `b`),
but with this patch, the first submodule's clone will fail because of
the illegal name of the git directory. Therefore, when cloning the
second submodule, Git will take a different code path: a fresh clone
(without an existing git directory). Both code paths fail to clone the
second submodule, both because the the corresponding worktree directory
exists and is not empty, but the error messages are worded differently.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It is unfortunate that we need to quote arguments differently on
Windows, depending whether we build a command-line for MSYS2's `sh` or
for other Windows executables.
We already have a test helper to verify the latter, with this patch we
can also verify the former.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Certain characters are not admissible in file names on Windows, even if
Cygwin/MSYS2 (and therefore, Git for Windows' Bash) pretend that they
are, e.g. `:`, `<`, `>`, etc
Let's disallow those characters explicitly in Windows builds of Git.
Note: just like trailing spaces or periods, it _is_ possible on Windows
to create commits adding files with such illegal characters, as long as
the operation leaves the worktree untouched. To allow for that, we
continue to guard `is_valid_win32_path()` behind the config setting
`core.protectNTFS`, so that users _can_ continue to do that, as long as
they turn the protections off via that config setting.
Among other problems, this prevents Git from trying to write to an "NTFS
Alternate Data Stream" (which refers to metadata stored alongside a
file, under a special name: "<filename>:<stream-name>"). This fix
therefore also prevents an attack vector that was exploited in
demonstrations of a number of recently-fixed security bugs.
Further reading on illegal characters in Win32 filenames:
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A `git clone` will end with exit code 0 when `merged_entry()` returns a
positive value during a call of `unpack_trees()` to `traverse_trees()`.
The reason is that `unpack_trees()` will interpret a positive value not
to be an error.
The problem is, however, that `add_index_entry()` (which is called by
`merged_entry()` can report an error, and we really should fail the
entire clone in such a case.
Let's fix this problem, in preparation for a Windows-specific patch
disallowing `mkdir()` with directory names that contain a trailing space
(which is illegal on NTFS): we want `git clone` to abort when a path
cannot be checked out due to that condition.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When the, say, 93rd trial run fails, it is a good idea to have a way to
skip the first 92 trials and dig directly into the 93rd in a debugger.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, file names cannot contain asterisks nor newline characters.
In an upcoming commit, we will make this limitation explicit,
disallowing even the creation of commits that introduce such file names.
However, in the test scripts touched by this patch, we _know_ that those
paths won't be checked out, so we _want_ to allow such file names.
Happily, the stringent path validation will be guarded via the
`core.protectNTFS` flag, so all we need to do is to force that flag off
temporarily.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When the stress test reported a problem with quoting certain arguments,
it is helpful to have a facility to play with those arguments in order
to find out whether variations of those arguments are affected, too.
Let's allow `test-run-command quote-stress-test -- <args>` to be used
for that purpose.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, we have to do all the command-line argument quoting
ourselves. Worse: we have to have two versions of said quoting, one for
MSYS2 programs (which have their own dequoting rules) and the rest.
We care mostly about the rest, and to make sure that that works, let's
have a stress test that comes up with all kinds of awkward arguments,
verifying that a spawned sub-process receives those unharmed.
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Currently it is technically possible to let a submodule's git
directory point right into the git dir of a sibling submodule.
Example: the git directories of two submodules with the names `hippo`
and `hippo/hooks` would be `.git/modules/hippo/` and
`.git/modules/hippo/hooks/`, respectively, but the latter is already
intended to house the former's hooks.
In most cases, this is just confusing, but there is also a (quite
contrived) attack vector where Git can be fooled into mistaking remote
content for file contents it wrote itself during a recursive clone.
Let's plug this bug.
To do so, we introduce the new function `validate_submodule_git_dir()`
which simply verifies that no git dir exists for any leading directories
of the submodule name (if there are any).
Note: this patch specifically continues to allow sibling modules names
of the form `core/lib`, `core/doc`, etc, as long as `core` is not a
submodule name.
This fixes CVE-2019-1387.
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Back in the DOS days, in the FAT file system, file names always
consisted of a base name of length 8 plus a file extension of length 3.
Shorter file names were simply padded with spaces to the full 8.3
format.
Later, the FAT file system was taught to support _also_ longer names,
with an 8.3 "short name" as primary file name. While at it, the same
facility allowed formerly illegal file names, such as `.git` (empty base
names were not allowed), which would have the "short name" `git~1`
associated with it.
For backwards-compatibility, NTFS supports alternative 8.3 short
filenames, too, even if starting with Windows Vista, they are only
generated on the system drive by default.
We addressed the problem that the `.git/` directory can _also_ be
accessed via `git~1/` (when short names are enabled) in 2b4c6efc82
(read-cache: optionally disallow NTFS .git variants, 2014-12-16), i.e.
since Git v1.9.5, by introducing the config setting `core.protectNTFS`
and enabling it by default on Windows.
In the meantime, Windows 10 introduced the "Windows Subsystem for Linux"
(short: WSL), i.e. a way to run Linux applications/distributions in a
thinly-isolated subsystem on Windows (giving rise to many a "2016 is the
Year of Linux on the Desktop" jokes). WSL is getting increasingly
popular, also due to the painless way Linux application can operate
directly ("natively") on files on Windows' file system: the Windows
drives are mounted automatically (e.g. `C:` as `/mnt/c/`).
Taken together, this means that we now have to enable the safe-guards of
Git v1.9.5 also in WSL: it is possible to access a `.git` directory
inside `/mnt/c/` via the 8.3 name `git~1` (unless short name generation
was disabled manually). Since regular Linux distributions run in WSL,
this means we have to enable `core.protectNTFS` at least on Linux, too.
To enable Services for Macintosh in Windows NT to store so-called
resource forks, NTFS introduced "Alternate Data Streams". Essentially,
these constitute additional metadata that are connected to (and copied
with) their associated files, and they are accessed via pseudo file
names of the form `filename:<stream-name>:<stream-type>`.
In a recent patch, we extended `core.protectNTFS` to also protect
against accesses via NTFS Alternate Data Streams, e.g. to prevent
contents of the `.git/` directory to be "tracked" via yet another
alternative file name.
While it is not possible (at least by default) to access files via NTFS
Alternate Data Streams from within WSL, the defaults on macOS when
mounting network shares via SMB _do_ allow accessing files and
directories in that way. Therefore, we need to enable `core.protectNTFS`
on macOS by default, too, and really, on any Operating System that can
mount network shares via SMB/CIFS.
A couple of approaches were considered for fixing this:
1. We could perform a dynamic NTFS check similar to the `core.symlinks`
check in `init`/`clone`: instead of trying to create a symbolic link
in the `.git/` directory, we could create a test file and try to
access `.git/config` via 8.3 name and/or Alternate Data Stream.
2. We could simply "flip the switch" on `core.protectNTFS`, to make it
"on by default".
The obvious downside of 1. is that it won't protect worktrees that were
clone with a vulnerable Git version already. We considered patching code
paths that check out files to check whether we're running on an NTFS
system dynamically and persist the result in the repository-local config
setting `core.protectNTFS`, but in the end decided that this solution
would be too fragile, and too involved.
The obvious downside of 2. is that everybody will have to "suffer" the
performance penalty incurred from calling `is_ntfs_dotgit()` on every
path, even in setups where.
After the recent work to accelerate `is_ntfs_dotgit()` in most cases,
it looks as if the time spent on validating ten million random
file names increases only negligibly (less than 20ms, well within the
standard deviation of ~50ms). Therefore the benefits outweigh the cost.
Another downside of this is that paths that might have been acceptable
previously now will be forbidden. Realistically, though, this is an
improvement because public Git hosters already would reject any `git
push` that contains such file names.
Note: There might be a similar problem mounting HFS+ on Linux. However,
this scenario has been considered unlikely and in light of the cost (in
the aforementioned benchmark, `core.protectHFS = true` increased the
time from ~440ms to ~610ms), it was decided _not_ to touch the default
of `core.protectHFS`.
This change addresses CVE-2019-1353.
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Helped-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We just safe-guarded `.git` against NTFS Alternate Data Stream-related
attack vectors, and now it is time to do the same for `.gitmodules`.
Note: In the added regression test, we refrain from verifying all kinds
of variations between short names and NTFS Alternate Data Streams: as
the new code disallows _all_ Alternate Data Streams of `.gitmodules`, it
is enough to test one in order to know that all of them are guarded
against.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We need to be careful to follow proper quoting rules. For example, if an
argument contains spaces, we have to quote them. Double-quotes need to
be escaped. Backslashes need to be escaped, but only if they are
followed by a double-quote character.
We need to be _extra_ careful to consider the case where an argument
ends in a backslash _and_ needs to be quoted: in this case, we append a
double-quote character, i.e. the backslash now has to be escaped!
The current code, however, fails to recognize that, and therefore can
turn an argument that ends in a single backslash into a quoted argument
that now ends in an escaped double-quote character. This allows
subsequent command-line parameters to be split and part of them being
mistaken for command-line options, e.g. through a maliciously-crafted
submodule URL during a recursive clone.
Technically, we would not need to quote _all_ arguments which end in a
backslash _unless_ the argument needs to be quoted anyway. For example,
`test\` would not need to be quoted, while `test \` would need to be.
To keep the code simple, however, and therefore easier to reason about
and ensure its correctness, we now _always_ quote an argument that ends
in a backslash.
This addresses CVE-2019-1350.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Previously, this function was written without focusing on speed,
intending to make reviewing the code as easy as possible, to avoid any
bugs in this critical code.
Turns out: we can do much better on both accounts. With this patch, we
make it as fast as this developer can make it go:
- We avoid the call to `is_dir_sep()` and make all the character
comparisons explicit.
- We avoid the cost of calling `strncasecmp()` and unroll the test for
`.git` and `git~1`, not even using `tolower()` because it is faster to
compare against two constant values.
- We look for `.git` and `.git~1` first thing, and return early if not
found.
- We also avoid calling a separate function for detecting chains of
spaces and periods.
Each of these improvements has a noticeable impact on the speed of
`is_ntfs_dotgit()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Probably inspired by HFS' resource streams, NTFS supports "Alternate
Data Streams": by appending `:<stream-name>` to the file name,
information in addition to the file contents can be written and read,
information that is copied together with the file (unless copied to a
non-NTFS location).
These Alternate Data Streams are typically used for things like marking
an executable as having just been downloaded from the internet (and
hence not necessarily being trustworthy).
In addition to a stream name, a stream type can be appended, like so:
`:<stream-name>:<stream-type>`. Unless specified, the default stream
type is `$DATA` for files and `$INDEX_ALLOCATION` for directories. In
other words, `.git::$INDEX_ALLOCATION` is a valid way to reference the
`.git` directory!
In our work in Git v2.2.1 to protect Git on NTFS drives under
`core.protectNTFS`, we focused exclusively on NTFS short names, unaware
of the fact that NTFS Alternate Data Streams offer a similar attack
vector.
Let's fix this.
Seeing as it is better to be safe than sorry, we simply disallow paths
referring to *any* NTFS Alternate Data Stream of `.git`, not just
`::$INDEX_ALLOCATION`. This also simplifies the implementation.
This closes CVE-2019-1352.
Further reading about NTFS Alternate Data Streams:
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c54dec26-1551-4d3a-a0ea-4fa40f848eb3
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The config setting `core.protectNTFS` is specifically designed to work
not only on Windows, but anywhere, to allow for repositories hosted on,
say, Linux servers to be protected against NTFS-specific attack vectors.
As a consequence, `is_ntfs_dotgit()` manually splits backslash-separated
paths (but does not do the same for paths separated by forward slashes),
under the assumption that the backslash might not be a valid directory
separator on the _current_ Operating System.
However, the two callers, `verify_path()` and `fsck_tree()`, are
supposed to feed only individual path segments to the `is_ntfs_dotgit()`
function.
This causes a lot of duplicate scanning (and very inefficient scanning,
too, as the inner loop of `is_ntfs_dotgit()` was optimized for
readability rather than for speed.
Let's simplify the design of `is_ntfs_dotgit()` by putting the burden of
splitting the paths by backslashes as directory separators on the
callers of said function.
Consequently, the `verify_path()` function, which already splits the
path by directory separators, now treats backslashes as directory
separators _explicitly_ when `core.protectNTFS` is turned on, even on
platforms where the backslash is _not_ a directory separator.
Note that we have to repeat some code in `verify_path()`: if the
backslash is not a directory separator on the current Operating System,
we want to allow file names like `\`, but we _do_ want to disallow paths
that are clearly intended to cause harm when the repository is cloned on
Windows.
The `fsck_tree()` function (the other caller of `is_ntfs_dotgit()`) now
needs to look for backslashes in tree entries' names specifically when
`core.protectNTFS` is turned on. While it would be tempting to
completely disallow backslashes in that case (much like `fsck` reports
names containing forward slashes as "full paths"), this would be
overzealous: when `core.protectNTFS` is turned on in a non-Windows
setup, backslashes are perfectly valid characters in file names while we
_still_ want to disallow tree entries that are clearly designed to
exploit NTFS-specific behavior.
This simplification will make subsequent changes easier to implement,
such as turning `core.protectNTFS` on by default (not only on Windows)
or protecting against attack vectors involving NTFS Alternate Data
Streams.
Incidentally, this change allows for catching malicious repositories
that contain tree entries of the form `dir\.gitmodules` already on the
server side rather than only on the client side (and previously only on
Windows): in contrast to `is_ntfs_dotgit()`, the
`is_ntfs_dotgitmodules()` function already expects the caller to split
the paths by directory separators.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In preparation to flipping the default on `core.protectNTFS`, let's have
some way to measure the speed impact of this config setting reliably
(and for comparison, the `core.protectHFS` config setting).
For now, this is a manual performance benchmark:
./t/helper/test-path-utils protect_ntfs_hfs [arguments...]
where the arguments are an optional number of file names to test with,
optionally followed by minimum and maximum length of the random file
names. The default values are one million, 3 and 20, respectively.
Just like `sqrti()` in `bisect.c`, we introduce a very simple function
to approximation the square root of a given value, in order to avoid
having to introduce the first user of `<math.h>` in Git's source code.
Note: this is _not_ implemented as a Unix shell script in t/perf/
because we really care about _very_ precise timings here, and Unix shell
scripts are simply unsuited for precise and consistent benchmarking.
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
With `format.useAutoBase = true`, running rebase resulted in an
error:
fatal: failed to get upstream, if you want to record base commit automatically,
please use git branch --set-upstream-to to track a remote branch.
Or you could specify base commit by --base=<base-commit-id> manually
error:
git encountered an error while preparing the patches to replay
these revisions:
ede2467cdedc63784887b587a61c36b7850ebfac..d8f581194799ae29bf5fa72a98cbae98a1198b12
As a result, git cannot rebase them.
Fix this by always passing `--no-base` to format-patch from rebase so
that the effect of `format.useAutoBase` is negated.
Reported-by: Christian Biesinger <cbiesinger@google.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If `format.useAutoBase = true`, there was no way to override this from
the command-line. Teach the `--no-base` option in format-patch to
override `format.useAutoBase`.
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of manually unsetting the config after the test case is done,
use test_config() to do it automatically. While we're at it, fix a typo
in a test case name.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since bb52995f3e (format-patch: introduce format.useAutoBase
configuration, 2016-04-26), `git rebase` has been broken when
`format.useAutoBase = true`. It fails when rebasing a branch:
fatal: failed to get upstream, if you want to record base commit automatically,
please use git branch --set-upstream-to to track a remote branch.
Or you could specify base commit by --base=<base-commit-id> manually
error:
git encountered an error while preparing the patches to replay
these revisions:
ede2467cdedc63784887b587a61c36b7850ebfac..d8f581194799ae29bf5fa72a98cbae98a1198b12
As a result, git cannot rebase them.
Demonstrate that failure here.
Reported-by: Christian Biesinger <cbiesinger@google.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a pipe, only the return code of the last command is used. Thus, all
other commands will have their return codes masked. Rewrite pipes so
that there are no git commands upstream so that we will know if a
command fails.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the test more hash-agnostic by renaming variables from "sha1" to
some variation of "oid" or "packid". Also, replace the regex,
`[0-9a-f]\{40\}` with `$OID_REGEX`.
A better name for "incrpackid" (incremental pack-id) might have been
just "packid". However, later in the test suite, we have other uses of
"packid". Although the scopes of these variables don't conflict, a
future developer may think that commit_and_pack() and
test_has_duplicate_object() are semantically related somehow since they
share the same variable name. Give them distinct names so that it's
clear these uses are unrelated.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The egrep expressions in this test suite were of the form `^$variable`.
Although egrep works just fine, it's overkill since we're not using any
extended regex. Replace egrep invocations with grep so that we aren't
swatting flies with a sledgehammer.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to test that objects were not duplicated from the packfile was
duplicated many times. Extract the duplicated code into
test_has_duplicate_object() and use that instead.
Refactor the resulting extraction so that if the git command fails,
the return code is not silently lost.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to test that objects were not missing from the packfile was
duplicated many times. Extract the duplicated code into
test_no_missing_in_packs() and use that instead.
Refactor the resulting extraction so that if any git commands fail,
their return codes are not silently lost.
Instead of verifying each file of `alt_objects/pack/*.idx` individually
in a for-loop, batch them together into one verification step.
The original testing construct was O(n^2): it used a grep in a loop to
test whether any objects were missing in the packfile. Rewrite this to
extract the hash using sed or cut, sort the files, then use `comm -23`
so that finding missing lines from the original file is done more
efficiently.
While we're at it, add a space to `commit_and_pack ()` for style.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we're now recommending lore.kernel.org, replace LKML link
with lore.kernel.org.
Although LKML has been around for a long time, nothing lasts forever
(see Gmane). Since LKML uses opaque message identifiers, switching to
lore.kernel.org should be a strict improvement since, even if
lore.kernel.org goes down, the Message-ID will allow future readers to
look up the referenced messages on any other archive.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only references to Gmane that remain are in RelNotes. Although these
are definitely not in active use, they might be of historical interest
for future readers so let's ensure that mail references are more robust.
Replace links to Gmane with links to lore.kernel.org (which is our new
preferred mailing list archive and has the Message-ID in the URL) and
bare Gmane ID references with Message-IDs.
The Message-IDs were found by searching for "gmane:<id>" on
https://public-inbox.org/git/ and taking the resulting message.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we're now recommending lore.kernel.org, replace marc.info links
with lore.kernel.org.
Although MARC has been around for a long time, nothing lasts forever
(see Gmane). Since MARC uses opaque message identifiers, switching to
lore.kernel.org should be a strict improvement since, even if
lore.kernel.org goes down, the Message-ID will allow future readers to
look up the referenced messages on any other archive.
We leave behind one reference to MARC in the README.md since it's a
perfectly fine mail archive for personal reading, just not for linking
messages for the future.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Decisions taken for simplicity:
1) For now, `--pathspec-from-file` is declared incompatible with
`--patch`, even when <file> is not `stdin`. Such use case it not
really expected.
2) It is not allowed to pass pathspec in both args and file.
`you must specify path(s) to restore` block was moved down to be able to
test for `pathspec.nr` instead, because testing for `argc` is no longer
correct.
`git switch` does not support the new options because it doesn't expect
`<pathspec>` arguments.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was added in [1]. I understand that the duplicate change was not
intentional and comes from an oversight.
Also, in explanation, there was only one section for two synopsis
entries.
Fix both problems by removing duplicate synopsis.
<paths> vs <pathspec> is resolved in next patch.
[1] Commit b59698ae ("checkout doc: clarify command line args for "checkout paths" mode" 2017-10-11)
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Decisions taken for simplicity:
1) For now, `--pathspec-from-file` is declared incompatible with
`--interactive/--patch/--edit`, even when <file> is not `stdin`.
Such use case it not really expected. Also, it would require changes
to `interactive_add()` and `edit_patch()`.
2) It is not allowed to pass pathspec in both args and file.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some code blocks were moved down to be able to test for `pathspec.nr`
in the next patch. Blocks are moved as is without any changes. This
is done as separate patch to reduce the amount of diffs in next patch.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch series fixes an issue where Git could formerly have been
tricked into creating a `.git` file with an unexpected (and therefore
unprotected) NTFS short name.
Incidentally, it also fixes an issue where a tree entry containing a
backslash could be tricked into following a symbolic link, i.e. Git
could be tricked into writing files outside the worktree.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `--export-marks` option of `git fast-import` is exposed also via the
in-stream command `feature export-marks=...` and it allows overwriting
arbitrary paths.
This topic branch prevents the in-stream version, to prevent arbitrary
file accesses by `git fast-import` streams coming from untrusted sources
(e.g. in remote helpers that are based on `git fast-import`).
This fixes CVE-2019-1348.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Previously, this function was completely undocumented. It is worth,
though, to explain what is going on, as it is not really obvious at all.
Suggested-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The backslash character is not a valid part of a file name on Windows.
Hence it is dangerous to allow writing files that were unpacked from
tree objects, when the stored file name contains a backslash character:
it will be misinterpreted as directory separator.
This not only causes ambiguity when a tree contains a blob `a\b` and a
tree `a` that contains a blob `b`, but it also can be used as part of an
attack vector to side-step the careful protections against writing into
the `.git/` directory during a clone of a maliciously-crafted
repository.
Let's prevent that, addressing CVE-2019-1354.
Note: we guard against backslash characters in tree objects' file names
_only_ on Windows (because on other platforms, even on those where NTFS
volumes can be mounted, the backslash character is _not_ a directory
separator), and _only_ when `core.protectNTFS = true` (because users
might need to generate tree objects for other platforms, of course
without touching the worktree, e.g. using `git update-index
--cacheinfo`).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In addition to preventing `.git` from being tracked by Git, on Windows
we also have to prevent `git~1` from being tracked, as the default NTFS
short name (also known as the "8.3 filename") for the file name `.git`
is `git~1`, otherwise it would be possible for malicious repositories to
write directly into the `.git/` directory, e.g. a `post-checkout` hook
that would then be executed _during_ a recursive clone.
When we implemented appropriate protections in 2b4c6efc82 (read-cache:
optionally disallow NTFS .git variants, 2014-12-16), we had analyzed
carefully that the `.git` directory or file would be guaranteed to be
the first directory entry to be written. Otherwise it would be possible
e.g. for a file named `..git` to be assigned the short name `git~1` and
subsequently, the short name generated for `.git` would be `git~2`. Or
`git~3`. Or even `~9999999` (for a detailed explanation of the lengths
we have to go to protect `.gitmodules`, see the commit message of
e7cb0b4455 (is_ntfs_dotgit: match other .git files, 2018-05-11)).
However, by exploiting two issues (that will be addressed in a related
patch series close by), it is currently possible to clone a submodule
into a non-empty directory:
- On Windows, file names cannot end in a space or a period (for
historical reasons: the period separating the base name from the file
extension was not actually written to disk, and the base name/file
extension was space-padded to the full 8/3 characters, respectively).
Helpfully, when creating a directory under the name, say, `sub.`, that
trailing period is trimmed automatically and the actual name on disk
is `sub`.
This means that while Git thinks that the submodule names `sub` and
`sub.` are different, they both access `.git/modules/sub/`.
- While the backslash character is a valid file name character on Linux,
it is not so on Windows. As Git tries to be cross-platform, it
therefore allows backslash characters in the file names stored in tree
objects.
Which means that it is totally possible that a submodule `c` sits next
to a file `c\..git`, and on Windows, during recursive clone a file
called `..git` will be written into `c/`, of course _before_ the
submodule is cloned.
Note that the actual exploit is not quite as simple as having a
submodule `c` next to a file `c\..git`, as we have to make sure that the
directory `.git/modules/b` already exists when the submodule is checked
out, otherwise a different code path is taken in `module_clone()` that
does _not_ allow a non-empty submodule directory to exist already.
Even if we will address both issues nearby (the next commit will
disallow backslash characters in tree entries' file names on Windows,
and another patch will disallow creating directories/files with trailing
spaces or periods), it is a wise idea to defend in depth against this
sort of attack vector: when submodules are cloned recursively, we now
_require_ the directory to be empty, addressing CVE-2019-1349.
Note: the code path we patch is shared with the code path of `git
submodule update --init`, which must not expect, in general, that the
directory is empty. Hence we have to introduce the new option
`--force-init` and hand it all the way down from `git submodule` to the
actual `git submodule--helper` process that performs the initial clone.
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As with export-marks in the previous commit, import-marks can access the
filesystem. This is significantly less dangerous than export-marks
because it only involves reading from arbitrary paths, rather than
writing them. However, it could still be surprising and have security
implications (e.g., exfiltrating data from a service that accepts
fast-import streams).
Let's lump it (and its "if-exists" counterpart) in with export-marks,
and enable the in-stream version only if --allow-unsafe-features is set.
Signed-off-by: Jeff King <peff@peff.net>
The fast-import stream command "feature export-marks=<path>" lets the
stream write marks to an arbitrary path. This may be surprising if you
are running fast-import against an untrusted input (which otherwise
cannot do anything except update Git objects and refs).
Let's disallow the use of this feature by default, and provide a
command-line option to re-enable it (you can always just use the
command-line --export-marks as well, but the in-stream version provides
an easy way for exporters to control the process).
This is a backwards-incompatible change, since the default is flipping
to the new, safer behavior. However, since the main users of the
in-stream versions would be import/export-based remote helpers, and
since we trust remote helpers already (which are already running
arbitrary code), we'll pass the new option by default when reading a
remote helper's stream. This should minimize the impact.
Note that the implementation isn't totally simple, as we have to work
around the fact that fast-import doesn't parse its command-line options
until after it has read any "feature" lines from the stream. This is how
it lets command-line options override in-stream. But in our case, it's
important to parse the new --allow-unsafe-features first.
There are three options for resolving this:
1. Do a separate "early" pass over the options. This is easy for us to
do because there are no command-line options that allow the
"unstuck" form (so there's no chance of us mistaking an argument
for an option), though it does introduce a risk of incorrect
parsing later (e.g,. if we convert to parse-options).
2. Move the option parsing phase back to the start of the program, but
teach the stream-reading code never to override an existing value.
This is tricky, because stream "feature" lines override each other
(meaning we'd have to start tracking the source for every option).
3. Accept that we might parse a "feature export-marks" line that is
forbidden, as long we don't _act_ on it until after we've parsed
the command line options.
This would, in fact, work with the current code, but only because
the previous patch fixed the export-marks parser to avoid touching
the filesystem.
So while it works, it does carry risk of somebody getting it wrong
in the future in a rather subtle and unsafe way.
I've gone with option (1) here as simple, safe, and unlikely to cause
regressions.
This fixes CVE-2019-1348.
Signed-off-by: Jeff King <peff@peff.net>
When we parse the --export-marks option, we don't immediately open the
file, but we do create any leading directories. This can be especially
confusing when a command-line option overrides an in-stream one, in
which case we'd create the leading directory for the in-stream file,
even though we never actually write the file.
Let's instead create the directories just before opening the file, which
means we'll create only useful directories. Note that this could change
the handling of relative paths if we chdir() in between, but we don't
actually do so; the only permanent chdir is from setup_git_directory()
which runs before either code path (potentially we should take the
pre-setup dir into account to avoid surprising the user, but that's an
orthogonal change).
The test just adapts the existing "override" test to use paths with
leading directories. This checks both that the correct directory is
created (which worked before but was not tested), and that the
overridden one is not (our new fix here).
While we're here, let's also check the error result of
safe_create_leading_directories(). We'd presumably notice any failure
immediately after when we try to open the file itself, but we can give a
more specific error message in this case.
Signed-off-by: Jeff King <peff@peff.net>
When asked to import marks from "subdir/file.marks", we create the
leading directory "subdir" if it doesn't exist. This makes no sense for
importing marks, where we only ever open the path for reading.
Most of the time this would be a noop, since if the marks file exists,
then the leading directories exist, too. But if it doesn't (e.g.,
because --import-marks-if-exists was used), then we'd create the useless
directory.
This dates back to 580d5f83e7 (fast-import: always create marks_file
directories, 2010-03-29). Even then it was useless, so it seems to have
been added in error alongside the --export-marks case (which _is_
helpful).
Signed-off-by: Jeff King <peff@peff.net>
We parse options like "--max-pack-size=" using skip_prefix(), which
makes sense to get at the bytes after the "=". However, we also parse
"--quiet" and "--stats" with skip_prefix(), which allows things like
"--quiet-nonsense" to behave like "--quiet".
This was a mistaken conversion in 0f6927c229 (fast-import: put option
parsing code in separate functions, 2009-12-04). Let's tighten this to
an exact match, which was the original intent.
Signed-off-by: Jeff King <peff@peff.net>
Our tests confirm that providing two "import-marks" options in a
fast-import stream is an error. However, the invoked command would fail
even without covering this case, because the marks files themselves do
not actually exist. Let's create the files to make sure we fail for the
right reason (we actually do, because the option parsing happens before
we open anything, but this future-proofs our test).
Signed-off-by: Jeff King <peff@peff.net>
When recursively cloning a superproject with some shallow modules
defined in its .gitmodules, then recloning with "--reference=<path>", an
error occurs. For example:
git clone --recurse-submodules --branch=master -j8 \
https://android.googlesource.com/platform/superproject \
master
git clone --recurse-submodules --branch=master -j8 \
https://android.googlesource.com/platform/superproject \
--reference master master2
fails with:
fatal: submodule '<snip>' cannot add alternate: reference repository
'<snip>' is shallow
When a alternate computed from the superproject's alternate cannot be
added, whether in this case or another, advise about configuring the
"submodule.alternateErrorStrategy" configuration option and using
"--reference-if-able" instead of "--reference" when cloning.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 31224cbdc7 ("clone: recursive and reference option triggers
submodule alternates", 2016-08-17) taught Git to support the
configuration options "submodule.alternateLocation" and
"submodule.alternateErrorStrategy" on a superproject.
If "submodule.alternateLocation" is configured to "superproject" on a
superproject, whenever a submodule of that superproject is cloned, it
instead computes the analogous alternate path for that submodule from
$GIT_DIR/objects/info/alternates of the superproject, and references it.
The "submodule.alternateErrorStrategy" option determines what happens
if that alternate cannot be referenced. However, it is not clear that
the clone proceeds as if no alternate was specified when that option is
not set to "die" (as can be seen in the tests in 31224cbdc7). Therefore,
document it accordingly.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When loading packfiles on start-up, we traverse the internal packfile
list once per file to avoid reloading packfiles that have already
been loaded. This check runs in quadratic time, so for poorly
maintained repos with a large number of packfiles, it can be pretty
slow.
Add a hashmap containing the packfile names as we load them so that
the average runtime cost of checking for already-loaded packs becomes
constant.
Add a perf test to p5303 to show speed-up.
The existing p5303 test runtimes are dominated by other factors and do
not show an appreciable speed-up. The new test in p5303 clearly exposes
a speed-up in bad cases. In this test we create 10,000 packfiles and
measure the start-up time of git rev-parse, which does little else
besides load in the packs.
Here are the numbers for the new p5303 test:
Test HEAD^ HEAD
---------------------------------------------------------------------
5303.12: load 10,000 packs 1.03(0.92+0.10) 0.12(0.02+0.09) -88.3%
Signed-off-by: Colin Stolley <cstolley@runbox.com>
Helped-by: Jeff King <peff@peff.net>
[jc: squashed the change to call hashmap in install_packed_git() by peff]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Sixt pointed out that the `err_win_to_posix()` function
mishandles `ERROR_SUCCESS`: it maps it to `ENOSYS`.
The only purpose of this function is to map Win32 API errors to `errno`
ones, and there is actually no equivalent to `ERROR_SUCCESS`: the idea
of `errno` is that it will only be set in case of an error, and left
alone in case of success.
Therefore, as pointed out by Junio Hamano, it is a bug to call this
function when there was not even any error to map.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since Windows doesn't provide gmtime_r(3) and localtime_r(3),
we're providing a compat version by using non-reentrant gmtime(3) and
localtime(3) as backend. Then, we copy the returned data into the
buffer.
By doing that, in case of failure, we will dereference a NULL pointer
returned by gmtime(3), and localtime(3), and we always return a valid
pointer instead of NULL.
Drop the memcpy(3) by using gmtime_s(), and use localtime_s() as the
backend on Windows, and make sure we will return NULL in case of
failure.
Cc: Johannes Sixt <j6t@kdbg.org>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the 'grep with invalid UTF-8 data' tests were added/adjusted in
8a5999838e (grep: stess test PCRE v2 on invalid UTF-8 data, 2019-07-26)
and 870eea8166 (grep: do not enter PCRE2_UTF mode on fixed matching,
2019-07-26) they lacked a redirect which caused them to falsely succeed
on most systems. The 'grep -i' test failed on systems where JIT was
disabled as it never reached the portion which was missing the redirect.
A recent patch added the missing redirect and exposed the fact that the
'PCRE v2: grep non-ASCII from invalid UTF-8 data with -i' test fails
regardless of whether JIT is enabled.
Based on the final paragraph in in 870eea8166:
When grepping a non-ASCII fixed string. This is a more general problem
that's hard to fix, but we can at least fix the most common case of
grepping for a fixed string without "-i". I can't think of a reason
for why we'd turn on PCRE2_UTF when matching byte-for-byte like that.
it seems that we don't expect that the case-insensitive grep will
succeed. Adjust the test to reflect that expectation.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some codepaths in "gitweb" that forgot to escape URLs generated
based on end-user input have been corrected.
* jk/gitweb-anti-xss:
gitweb: escape URLs generated by href()
t/gitweb-lib.sh: set $REQUEST_URI
t/gitweb-lib.sh: drop confusing quotes
t9502: pass along all arguments in xss helper
The completion script (in contrib/) has been taught that "git svn"
supports the "--recursive" option.
* js/complete-svn-recursive:
completion: tab-complete "git svn --recursive"
Error handling after "git push" finishes sending the packdata and
waits for the response to the remote side has been improved.
* jk/send-pack-remote-failure:
send-pack: check remote ref status on pack-objects failure
"git fetch" codepath had a big "do not lazily fetch missing objects
when I ask if something exists" switch. This has been corrected by
marking the "does this thing exist?" calls with "if not please do not
lazily fetch it" flag.
* jt/fetch-remove-lazy-fetch-plugging:
promisor-remote: remove fetch_if_missing=0
clone: remove fetch_if_missing=0
fetch: remove fetch_if_missing=0
The completion script (in contrib/) learned that the "--onto"
option of "git rebase" can take its argument as the value of the
option.
* dl/complete-rebase-onto:
completion: learn to complete `git rebase --onto=`
Recent update to "git stash pop" made the command empty the index
when run with the "--quiet" option, which has been corrected.
* tg/stash-refresh-index:
stash: make sure we have a valid index before writing it
Handling of commit objects that use non UTF-8 encoding during
"rebase -i" has been improved.
* dd/sequencer-utf8:
sequencer: reencode commit message for am/rebase --show-current-patch
sequencer: reencode old merge-commit message
sequencer: reencode squashing commit's message
sequencer: reencode revert/cherry-pick's todo list
sequencer: reencode to utf-8 before arrange rebase's todo list
t3900: demonstrate git-rebase problem with multi encoding
configure.ac: define ICONV_OMITS_BOM if necessary
t0028: eliminate non-standard usage of printf
"git bundle" has been taught to use the parse options API. "git
bundle verify" learned "--quiet" and "git bundle create" learned
options to control the progress output.
* rj/bundle-ui-updates:
bundle-verify: add --quiet
bundle-create: progress output control
bundle: framework for options before bundle file
The patterns to detect function boundary for Elixir language has
been added.
* ln/userdiff-elixir:
userdiff: add Elixir to supported userdiff languages
Docfix.
* en/doc-typofix:
Fix spelling errors in no-longer-updated-from-upstream modules
multimail: fix a few simple spelling errors
sha1dc: fix trivial comment spelling error
Fix spelling errors in test commands
Fix spelling errors in messages shown to users
Fix spelling errors in names of tests
Fix spelling errors in comments of testcases
Fix spelling errors in code comments
Fix spelling errors in documentation outside of Documentation/
Documentation: fix a bunch of typos, both old and new
Misc doc fixes.
* en/misc-doc-fixes:
name-hash.c: remove duplicate word in comment
hashmap: fix documentation misuses of -> versus .
git-filter-branch.txt: correct argument name typo
Fetching from multiple remotes into the same repository in parallel
had a bad interaction with the recent change to (optionally) update
the commit-graph after a fetch job finishes, as these parallel
fetches compete with each other. Which has been corrected.
* js/fetch-multi-lockfix:
fetch: avoid locking issues between fetch.jobs/fetch.writeCommitGraph
fetch: add the command-line option `--write-commit-graph`
The watchman integration for fsmonitor was racy, which has been
corrected to be more conservative.
* kw/fsmonitor-watchman-fix:
fsmonitor: fix watchman integration
HTTP transport had possible allocator/deallocator mismatch, which
has been corrected.
* cb/curl-use-xmalloc:
remote-curl: unbreak http.extraHeader with custom allocators
Follow recent push to move API docs from Documentation/ to header
files and update config.h
* hw/config-doc-in-header:
config: move documentation to config.h
Messages from die() etc. can be mixed up from multiple processes
without even line buffering on Windows, which has been worked
around.
* js/vreportf-wo-buffering:
vreportf(): avoid relying on stdio buffering
"git worktree add" internally calls "reset --hard" that should not
descend into submodules, even when submodule.recurse configuration
is set, but it was affected. This has been corrected.
* pb/no-recursive-reset-hard-in-worktree-add:
worktree: teach "add" to ignore submodule.recurse config
"git merge --no-commit" needs "--no-ff" if you do not want to move
HEAD, which has been corrected in the manual page for "git bisect".
* ma/bisect-doc-sample-update:
Documentation/git-bisect.txt: add --no-ff to merge command
"git rev-parse --git-path HEAD.lock" did not give the right path
when run in a secondary worktree.
* js/git-path-head-dot-lock-fix:
git_path(): handle `.lock` files correctly
t1400: wrap setup code in test case
The implementation of "git log --graph" got refactored and then its
output got simplified.
* jc/log-graph-simplify:
t4215: use helper function to check output
graph: fix coloring of octopus dashes
graph: flatten edges that fuse with their right neighbor
graph: smooth appearance of collapsing edges on commit lines
graph: rename `new_mapping` to `old_mapping`
graph: commit and post-merge lines for left-skewed merges
graph: tidy up display of left-skewed merges
graph: example of graph output that can be simplified
graph: extract logic for moving to GRAPH_PRE_COMMIT state
graph: remove `mapping_idx` and `graph_update_width()`
graph: reduce duplication in `graph_insert_into_new_columns()`
graph: reuse `find_new_column_by_commit()`
graph: handle line padding in `graph_next_line()`
graph: automatically track display width of graph lines
Crufty code and logic accumulated over time around the object
parsing and low-level object access used in "git fsck" have been
cleaned up.
* jk/cleanup-object-parsing-and-fsck: (23 commits)
fsck: accept an oid instead of a "struct tree" for fsck_tree()
fsck: accept an oid instead of a "struct commit" for fsck_commit()
fsck: accept an oid instead of a "struct tag" for fsck_tag()
fsck: rename vague "oid" local variables
fsck: don't require an object struct in verify_headers()
fsck: don't require an object struct for fsck_ident()
fsck: drop blob struct from fsck_finish()
fsck: accept an oid instead of a "struct blob" for fsck_blob()
fsck: don't require an object struct for report()
fsck: only require an oid for skiplist functions
fsck: only provide oid/type in fsck_error callback
fsck: don't require object structs for display functions
fsck: use oids rather than objects for object_name API
fsck_describe_object(): build on our get_object_name() primitive
fsck: unify object-name code
fsck: require an actual buffer for non-blobs
fsck: stop checking tag->tagged
fsck: stop checking commit->parent counts
fsck: stop checking commit->tree value
commit, tag: don't set parsed bit for parse failures
...
We do not really want to `exit()` here, of course, as this is safely
libified code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not only laziness that we simply spawn `git diff -p --cached`
here: this command needs to use the pager, and the pager needs to exit
when the diff is done. Currently we do not have any way to make that
happen if we run the diff in-process. So let's just spawn.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Well, it is not a full implementation yet. In the interest of making
this easy to review (and easy to keep bugs out), we still hand off to
the Perl script to do the actual work.
The `patch` functionality actually makes up for more than half of the
1,800+ lines of `git-add--interactive.perl`. It will be ported from Perl
to C incrementally, later.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is yet another command, ported to C. It builds nicely on the
support functions introduced for other commands, with the notable
difference that only names are displayed for untracked files, no
file type or diff summary.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a relatively straight-forward port from the Perl version, with
the notable exception that we imitate `git reset -- <paths>` in the C
version rather than the convoluted `git ls-tree HEAD -- <paths> | git
update-index --index-info` followed by `git update-index --force-remove
-- <paths>` for the missed ones.
While at it, we fix the pretty obvious bug where the `revert` command
offers to unstage files that do not have staged changes.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After `status` and `help`, it is now time to port the `update` command
to C, the second command that is shown in the main loop menu of `git add
-i`.
This `git add -i` command is the first one which lets the user choose a
subset of a list of files, and as such, this patch lays the groundwork
for the other commands of that category:
- It teaches the `print_file_item()` function to show a unique prefix
if we found any (the code to find it had been added already in the
previous patch where we colored the unique prefixes of the main loop
commands, but that patch uses the `print_command_item()` function to
display the menu items).
- This patch also adds the help text that is shown when the user input
to select items from the shown list could not be parsed.
- As `get_modified_files()` clears the list of files, it now has to take
care of clearing the _full_ `prefix_item_list` lest the `sorted` and
`selected` fields go stale and inconsistent.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `update`, `revert` and `add-untracked` commands allow selecting
multiple entries. Let's extend the `list_and_choose()` function to
accommodate those use cases.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the `update` command of `git add -i`, we are primarily interested in the
list of modified files that have worktree (i.e. unstaged) changes.
At the same time, we need to determine _also_ the staged changes, to be
able to produce the full added/deleted information.
The Perl script version of `git add -i` has a parameter of the
`list_modified()` function for that matter. In C, we can be a lot more
precise, using an `enum`.
The C implementation of the filter also has an easier time to avoid
unnecessary work, simply by using an adaptive order of the `diff-index`
and `diff-files` phases, and then skipping files in the second phase
when they have not been seen in the first phase.
Seeing as we change the meaning of the `phase` field, we rename it to
`mode` to reflect that the order depends on the exact invocation of the
`git add -i` command.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During a review, Junio Hamano pointed out that the `rev.prune_data` was
copied from another pathspec but never cleaned up.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 9a780a384d (mingw: spawned processes need to inherit only standard
handles, 2019-11-22), we taught the Windows-specific part to restrict
which file handles are passed on to the spawned processes.
Since this logic seemed to be a bit fragile across Windows versions (we
_still_ support Windows Vista in Git for Windows, for example), a
fall-back was added to try spawning the process again, this time without
restricting which file handles are to be inherited by the spawned
process.
In the common case (i.e. when the process could not be spawned for
reasons _other_ than the file handle inheritance), the fall-back attempt
would still fail, of course.
Crucially, one thing we missed in that code path was to set `errno`
appropriately.
This should have been caught by t0061.2 which expected `errno` to be
`ENOENT` after trying to start a process for a non-existing executable,
but `errno` was set to `ENOENT` prior to the `CreateProcessW()` call:
while looking for the config settings for trace2, Git tries to access
`xdg_config` and `user_config` via `access_or_die()`, and as neither of
those config files exists when running the test case (because in Git's
test suite, `HOME` points to the test directory), the `errno` has the
expected value, but for the wrong reasons.
Let's fix that by making sure that `errno` is set correctly. It even
appears that `errno` was set in the _wrong_ case previously:
`CreateProcessW()` returns non-zero upon success, but `errno` was set
only in the non-zero case.
It would be nice if we could somehow fix t0061 to make sure that this
does not regress again. One approach that seemed like it should work,
but did not, was to set `errno` to 0 in the test helper that is used by
t0061.2.
However, when `mingw_spawnvpe()` wants to see whether the file in
question is a script, it calls `parse_interpreter()`, which in turn
tries to `open()` the file. Obviously, this call fails, and sets `errno`
to `ENOENT`, deep inside the call chain started from that test helper.
Instead, we force re-set `errno` at the beginning of the function
`mingw_spawnve_fd()`, which _should_ be safe given that callers of that
function will want to look at `errno` if -1 was returned. And if that
`errno` is 0 ("No error"), regression tests like t0061.2 will kick in.
Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, the void pcre2_free() function in grep.c returned free().
While free() itself is void, afaict it's still an expression as per
section A.2.3, subsection 6.8.6 (jump-statement) in both C99 [1] and C11
[2]:
> return expression
Section 6.8.6.4 in C99 [1] and C11 [2] says that:
> A return statement with an expression shall not appear in a function
> whose return type is void.
The consequence of the old behavior was that developer builds with
pedantic errors enabled broke Git if PCRE2 was enabled and a
smart-enough compiler to detect these errors was used. This commit
fixes pedantic builds of Git that enables --with-libpcre.
[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
[2] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this test, we have a command substitution whose output starts with a
NUL byte. bash and dash strip out any NUL bytes from the output; zsh
does not. As a consequence, zsh fails this test, since the command line
argument we use the variable in is truncated by the NUL byte.
POSIX says of a command substitution that if "the output contains any
null bytes, the behavior is unspecified," so all of the shells are in
compliance with POSIX. To make our code more portable, let's avoid
prefacing our variables with NUL bytes and instead leave only the
trailing one behind.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit refactors the use of verify_signed_buffer() outside of
gpg-interface.c to use check_signature() instead. It also turns
verify_signed_buffer() into a file-local function since it's now only
invoked internally by check_signature().
There were previously two globally scoped functions used in different
parts of Git to perform GPG signature verification:
verify_signed_buffer() and check_signature(). Now only
check_signature() is used.
The verify_signed_buffer() function doesn't guard against duplicate
signatures as described by Michał Górny [1]. Instead it only ensures a
non-erroneous exit code from GPG and the presence of at least one
GOODSIG status field. This stands in contrast with check_signature()
that returns an error if more than one signature is encountered.
The lower degree of verification makes the use of verify_signed_buffer()
problematic if callers don't parse and validate the various parts of the
GPG status message themselves. And processing these messages seems like
a task that should be reserved to gpg-interface.c with the function
check_signature().
Furthermore, the use of verify_signed_buffer() makes it difficult to
introduce new functionality that relies on the content of the GPG status
lines.
Now all operations that does signature verification share a single entry
point to gpg-interface.c. This makes it easier to propagate changed or
additional functionality in GPG signature verification to all parts of
Git, without having odd edge-cases that don't perform the same degree of
verification.
[1] https://dev.gentoo.org/~mgorny/articles/attack-on-git-signature-verification.html
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A number of t4210-log-i18n tests added in 4e2443b181 set LC_ALL to a UTF-8
locale (is_IS.UTF-8) but then pass an invalid UTF-8 string to --grep.
FreeBSD's regcomp() fails in this case with REG_ILLSEQ, "illegal byte
sequence," which git then passes to die():
fatal: command line: '�': illegal byte sequence
When these tests were added the commit message stated:
| It's possible that this
| test breaks the "basic" and "extended" backends on some systems that
| are more anal than glibc about the encoding of locale issues with
| POSIX functions that I can remember
which seems to be the case here.
Extend test-lib.sh to add a REGEX_ILLSEQ prereq, set it on FreeBSD, and
add !REGEX_ILLSEQ to the two affected tests.
Signed-off-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, git was intended to be single-thread executable.
`localtime(3)' can be used in such codebase for cleaner code.
Overtime, we're employing multithread in our code base.
Let's phase out `gmtime(3)' in favour of `localtime_r(3)'.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, git was intended to be single-thread executable.
`gmtime(3)' and `localtime(3)' can be used in such codebase
for cleaner code.
Overtime, we're employing multithread in our code base.
Let's phase out `gmtime(3)' and `localtime(3)' in favour of
`gmtime_r(3)' and `localtime_r(3)'.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The file "empty" became unused with 1c5e94f459 (tests: use
'test_must_be_empty' instead of 'test_cmp <empty> <out>', 2018-08-19);
get rid of it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The file "frontend" became unused with 4de0bbd898 (t9300: use perl
"head -c" clone in place of "dd bs=1 count=16000" kluge, 2010-12-13);
get rid of it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When continuing an interactive rebase after a merge conflict was solved,
if the resolution could not be committed, sequencer_continue() would
return early without releasing its todo list, resulting in a memory
leak. This plugs this leak by jumping to the end of the function, where
the todo list is deallocated.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we're now recommending lore.kernel.org (and because the
public-inbox.org domain might eventually go away), let's update our
internal references to use it, too. That future-proofs our references,
and sets the example we want people to follow.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing, we feed pack-objects a list of both positive and negative
objects. The positive objects are what we want to send, and the negative
objects are what the other side told us they have, which we can use to
limit the size of the push.
Before passing along a negative object, send_pack() will make sure we
actually have it (since we only know about it because the remote
mentioned it, not because it's one of our refs). So it's expected that
some of these objects will be missing on the local side. But looking for
a missing object is more expensive than one that we have: it triggers
reprepare_packed_git() to handle a racy repack, plus it has to explore
every alternate's loose object tree (which can be slow if you have a lot
of them, or have a high-latency filesystem).
This isn't usually a big problem, since repositories you're pushing to
don't generally have a large number of refs that are unrelated to what
the client has. But there's no reason such a setup is wrong, and it
currently performs poorly.
We can fix this by using OBJECT_INFO_QUICK, which tells the lookup
code that we expect objects to be missing. Notably, it will not re-scan
the packs, and it will use the loose cache from 61c7711cfe (sha1-file:
use loose object cache for quick existence check, 2018-11-12).
The downside is that in the rare case that we race with a local repack,
we might fail to feed some objects to pack-objects, making the resulting
push larger. But we'd never produce an invalid or incorrect push, just a
less optimal one. That seems like a reasonable tradeoff, and we already
do similar things on the fetch side (e.g., when marking COMPLETE
commits).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we have debugging-friendly alternatives to `test -f`, replace
instances of `test -f` with `test_path_is_file` so that if a command
ever fails, we get better debugging information.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code style for tests is to have statements on their own line if
possible. Move keywords onto their own line so that they conform with
the test style.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For shell scripts, the usual convention is for there to be no space
after redirection operators, (e.g. `>file`, not `> file`). Remove these
spaces wherever they appear.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since output is silenced when running without `-v` and debugging output
is useful with `-v`, remove redirections to /dev/null as it is not
useful.
In one case where the output of stdout is consumed, redirect the output
of test_commit to stderr.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a pipe, only the return code of the last command is used. Thus, all
other commands will have their return codes masked. Rewrite pipes so
that there are no git commands upstream so that we will know if a
command fails.
In the 'interactive add' test case, we prepend a `test_must_fail` to
`git commit --interactive`. When there are no changes to commit,
`git commit` will exit with status code 1. Following along with the rest
of the file, we use `test_must_fail` to test for this case.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For shell scripts, the usual convention is for there to be no space
after redirection operators, (e.g. `>file`, not `> file`). Remove these
spaces wherever they appear.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, there are two ways where the return codes of git commands are
lost. The first way is when a command is in the upstream of a pipe. In a
pipe, only the return code of the last command is used. Thus, all other
commands will have their return codes masked. Rewrite pipes so that
there are no git commands upstream.
The other way is when a command is in a non-assignment command
substitution. The return code will be lost in favour of the surrounding
command's. Rewrite instances of this such that git commands are in an
assignment-only command substitution.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In inconsistency(), we had two `git rev-parse` invocations in the
upstream of a pipe within a command substitution. In case this
invocation ever failed, its exit code would be swallowed up and we would
not know about it.
Pull the command substitutions out into variable assignments so that
their return codes are not lost.
Drop the pipe into `tr` because the $(...) substitution already takes
care of stripping out newlines, so the `tr` invocations in the code are
superfluous.
Finally, given the way the tests actually employ "one-time-sed" via
$(cat one-time-sed) in t/lib-httpd/apply-one-time-sed.sh, convert the
`printf` into an `echo`. This makes it consistent with the final "server
loses a ref - ref in want" test, which does use `echo` rather than
`printf`.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several times in t5317, we would use `wc -l` to ensure that a grep
result is empty. However, grep already has a way to do that... Its
return code! Use `! grep` in the cases where we are ensuring that there
are no matching lines.
While at it, drop unnecessary invocations of `awk` and `sort` in each
affected test since those commands do not influence the outcome. It's
not clear why that extra work was being done in the first place, and the
code's history doesn't shed any light on the matter since these tests
were simply born this way[1], likely due to copy-paste programming. The
unnecessary work wasn't noticed even when the code was later touched for
various cleanups[2][3].
[1]: 9535ce7337 (pack-objects: add list-objects filtering, 2017-11-21)
[2]: bdbc17e86a (tests: standardize pipe placement, 2018-10-05)
[3]: 61de0ff695 (tests: don't swallow Git errors upstream of pipes, 2018-10-05)
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, there are two ways where the return codes of git commands are
lost. The first way is when a command is in the upstream of a pipe. In a
pipe, only the return code of the last command is used. Thus, all other
commands will have their return codes masked. Rewrite pipes so that
there are no git commands upstream.
The other way is when a command is in a non-assignment command
substitution. The return code will be lost in favour of the surrounding
command's. Rewrite instances of this such that git commands output to a
file and surrounding commands only call command substitutions with
non-git commands.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a pipe, only the return code of the last command is used. Thus, all
other commands will have their return codes masked. Rewrite pipes so
that there are no git commands upstream so that we will know if a
command fails.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of rolling our own method to write out some lines into a file,
use the existing test_write_lines().
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, there are two ways where the return codes of git commands are
lost. The first way is when a command is in the upstream of a pipe. In a
pipe, only the return code of the last command is used. Thus, all other
commands will have their return codes masked. Rewrite pipes so that
there are no git commands upstream.
The other way is when a command is in a non-assignment command
substitution. The return code will be lost in favour of the surrounding
command's. Rewrite instances of this so that git commands are either run
on their own or in an assignment-only command substitution.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a comment about intentionally inducing SIGPIPE since this is unusual
and future developers should be aware. Also, even though we are trying
to refactor git commands out of the upstream of pipes, we cannot do it
here since we rely on it being upstream to induce SIGPIPE. Comment on
that as well so that future developers do not try to change it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a command is in a non-assignment command substitution, the return
code will be lost in favour of the surrounding command's. As a result,
if a git command fails, we won't know about it. Rewrite instances of
this so that git commands are either run in an assignment-only command
substitution so that their return codes aren't lost.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we have a helper function that can test the number of lines in a
file that gives better debugging information on failure, use
test_line_count() to test the number of lines.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, there are two ways where the return codes of git commands are
lost. The first way is when a command is in the upstream of a pipe. In a
pipe, only the return code of the last command is used. Thus, all other
commands will have their return codes masked. Rewrite pipes so that
there are no git commands upstream.
The other way is when a command is in a non-assignment command
substitution. The return code will be lost in favour of the surrounding
command's. Rewrite instances of this so that git commands are either run
on their own or in an assignment-only command substitution.
This patch fixes a real buggy test: in 'copy note with "git notes
copy"', `git notes` was mistyped as `git note`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In generate_expected_cache_tree_rec(), there are currently two instances
of `git ls-files` in the upstream of a pipe. In the case where the
upstream git command fails, its return code will be lost. Extract the
`git ls-files` into its own call so that if it ever fails, its return
code is not lost.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, the `git frotz` command would fail but its return code was
hidden since it was in the upstream of a pipe. Break the pipeline into
two commands so that the return code is no longer lost. Also, mark
`git frotz` with test_must_fail since it's supposed to fail.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert `[ ... ]` to use `test` and test for the existence of a regular
file (`-f`) instead of any file (`-e`).
Move the `then`s onto their own lines so that it conforms with the
general test style.
Instead of redirecting input into sed, allow it to open its own input.
Use `cmp -s` instead of `diff` since we only care about whether the two
files are equal and `diff` is overkill for this.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our 'osx-gcc' build job on Travis CI relied on GCC 8 being installed
(but not linked) in the image we use [1]. Alas, since the last update
of this image a few days ago this is not the case anymore, and now it
contains GCC 9 (installed and linked) instead of GCC 8. The results
are failed 'osx-gcc' jobs, because they can't find the 'gcc-8' command
[2].
Let's move on to use GCC 9, with hopefully better error reporting and
improved -Wfoo flags and what not. On Travis CI this has the benefit
that we can spare a few seconds while installing dependencies, because
it already comes pre-installed, at least for now. The Azure Pipelines
OSX image doesn't include GCC, so we have to install it ourselves
anyway, and then we might as well install the newer version.
In a vain attempt I tried to future-proof this a bit:
- Install 'gcc@9' specifically, so we'll still get what we want even
after GCC 10 comes out, and the "plain" 'gcc' package starts to
refer to 'gcc@10'.
- Run both 'brew install gcc@9' and 'brew link gcc@9'. If 'gcc@9'
is already installed and linked, then both commands are noop and
exit with success. But as we saw in the past, sometimes the image
contains the expected GCC package installed but not linked, so
maybe it will happen again in the future as well. In that case
'brew install' is still a noop, and instructs the user to run
'brew link' instead, so that's what we'll do. And if 'gcc@9' is
not installed, then 'brew install' will install it, and the
subsequent 'brew link' becomes a noop.
An additional benefit of this patch is that from now on we won't
unnecessarily install GCC and its dependencies in the 'osx-clang' jobs
on Azure Pipelines.
[1] 7d4733c501 (ci: fix GCC install in the Travis CI GCC OSX job,
2019-10-24)
[2] https://travis-ci.org/git/git/jobs/615442297#L333
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use test_must_be_empty instead of comparing it to an empty file. That's
more efficient, as the function only needs built-in meta-data only check
in the usual case, and provides nicer debug output otherwise.
Helped-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two tests in t7812, added in 8a599983 ("grep: stess test PCRE v2 on
invalid UTF-8 data", 2019-07-26), were missing redirects, failing to
actually test the produced output.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use test_must_be_empty instead of reading the file and comparing its
contents to an empty string. That's more efficient, as the function
only needs built-in meta-data only check in the usual case, and provides
nicer debug output otherwise.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use test_must_be_empty instead of reading the file and comparing its
contents to an empty string. That's more efficient, as the function
only needs built-in meta-data only check in the usual case, and provides
nicer debug output otherwise.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use test_line_count to check if the number of lines matches
expectations, for improved consistency and nicer debug output.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use test_line_count to check if the number of lines matches
expectations, for improved consistency and nicer debug output.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let skip_prefix() advance refname to get rid of two magic numbers.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of a magic number by using skip_prefix().
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of a magic number by using skip_prefix() instead of
starts_with().
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of two magic numbers by using skip_prefix().
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of magic numbers by letting skip_prefix() set the pointer
"what".
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-request-pull.sh script invokes Perl, so it requires Perl to be
available, but the associated test t5150 does not skip itself when Perl
has been disabled, which then makes subtest 4 through 10 fail. Subtest 3
still passes, but for the wrong reasons (it expects git-request-pull to
fail, and it does fail when Perl is not available). The initial two
subtests that do pass are only doing setup.
To prevent t5150 from failing the build when NO_PERL=1, add a check that
sets skip_all when "! test_have_prereq PERL", just like how for example
t3701-add-interactive skips itself when Perl has been disabled.
Signed-off-by: Ruud van Asseldonk <dev@veniogames.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a commit-graph, we show progress along several commit
walks. When we use start_delayed_progress(), the progress line will
only appear if that step takes a decent amount of time.
However, one place was missed: computing generation numbers. This is
normally a very fast operation as all commits have been parsed in a
previous step. But, this is showing up for all users no matter how few
commits are being added.
The tests that check for the progress output have already been updated
to use GIT_PROGRESS_DELAY=0 to force the expected output.
Helped-by: Jeff King <peff@peff.net>
Reported-by: ryenus <ryenus@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The start_delayed_progress() method is a preferred way to show
optional progress to users as it ignores steps that take less
than two seconds. However, this makes testing unreliable as tests
expect to be very fast.
In addition, users may want to decrease or increase this time
interval depending on their preferences for terminal noise.
Create the GIT_PROGRESS_DELAY environment variable to control
the delay set during start_delayed_progress(). Set the value
in some tests to guarantee their output remains consistent.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The perf suite's aggregate.perl depends on Git.pm, which is a mild
annoyance if you've built git with NO_PERL. It turns out that the only
thing we use it for is a single call of the command_oneline() helper.
We can just replace this with backticks or similar.
Annoyingly, perl has no backtick equivalent that avoids a shell eval,
which means our $arg would require quoting. This probably doesn't matter
for our purposes, but it's better to be safe and model good style. So
we'll just provide a short helper around open(), which takes its
arguments as a list.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The perf tests write files recording the results of tests. These
results are later aggregated by 'aggregate.perl'. If the tests are run
multiple times, those results are overwritten by the new results. This
works just fine as long as there are only perf tests measuring the
times, whose results are stored in "$base".times files.
However 22bec79d1a ("t/perf: add infrastructure for measuring sizes",
2018-08-17) introduced a new type of test for measuring the size of
something. The results of this are written to "$base".size files.
"$base" is essentially made up of the basename of the script plus the
test number. So if test numbers shift because a new test was
introduced earlier in the script we might end up with both a ".times"
and a ".size" file for the same test. In the aggregation script the
".times" file is preferred over the ".size" file, so some size tests
might end with performance numbers from a previous run of the test.
This is mainly relevant when writing perf tests that check both
performance and sizes, and can get quite confusing during
developement.
We could fix this by doing a more thorough job of cleaning out old
".times" and ".size" files before running each test. However, an even
easier solution is to just use the same filename for both types of
measurement, meaning we'll always overwrite the previous result. We
don't even need to change the file format to distinguish the two;
aggregate.perl already decides which is which based on a regex of the
content (this may become ambiguous if we add new types in the future,
but we could easily add a header field to the file at that point).
Based on an initial patch from Thomas Gummerer, who discovered the
problem and did all of the analysis (which I stole for the commit
message above):
https://public-inbox.org/git/20191119185047.8550-1-t.gummerer@gmail.com/
Helped-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'test_commit_bulk' is invoked in an empty test repository, it
prints a "fatal: Needed a single revision" error, but still does what
it's supposed to do. A test helper function displaying a fatal error
and still succeeding is always suspect to be buggy, but luckily that's
not the case here: that error comes from a 'git rev-parse --verify
HEAD' command invoked in a condition, which doesn't have anything to
verify in an empty repository.
Use the '--quiet' option to suppress that error message.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify the way the `--reset-author-date` option is described,
and mark its usage string translatable.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling `git submodule status` while in a subdirectory, we are
incorrectly not detecting modified submodules and
thus reporting that all of the submodules are unchanged.
This is because the submodule helper is calling `diff-index` with the
submodule path assuming the path is relative to the current prefix
directory, however the submodule path used is actually relative to the root.
Always pass NULL as the `prefix` when running diff-files on the
submodule, to make sure the submodule's path is interpreted as relative
to the superproject's repository root.
Signed-off-by: Manish Goregaokar <manishsmail@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, complete_action(), used by builtin/rebase.c to start a new
rebase, calls sequencer_continue() to do it. Before the former calls
pick_commits(), it
- calls read_and_refresh_cache() -- this is unnecessary here as we've
just called require_clean_work_tree() in complete_action()
- calls read_populate_opts() -- this is unnecessary as we're starting a
new rebase, so `opts' is fully populated
- loads the todo list -- this is unnecessary as we've just populated
the todo list in complete_action()
- commits any staged changes -- this is unnecessary as we're starting a
new rebase, so there are no staged changes
- calls record_in_rewritten() -- this is unnecessary as we're starting
a new rebase.
This changes complete_action() to directly call pick_commits() to avoid
these unnecessary steps.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When sequencer_continue() is called by complete_action(), `opts' has
been filled by get_replay_opts(). Currently, it does not initialise the
`squash_onto' field (used by the `--root' mode), only
read_populate_opts() does. It’s not a problem yet since
sequencer_continue() calls it before pick_commits(), but it would lead
to incorrect results once complete_action() is modified to call
pick_commits() directly.
Let’s change that.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The total number of commands can be used to show the progression of the
rebasing in a shell. It is written to the disk by read_populate_todo()
when the todo list is loaded from sequencer_continue() or
pick_commits(), but not by complete_action().
This moves the part writing total_nr to a new function so it can be
called from complete_action().
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a todo list, `done_nr' is the number of commands that were executed
or skipped, but skip_unnecessary_picks() did not update it.
This variable is mostly used by command prompts (ie. git-prompt.sh and
the like). As in the previous commit, this inconsistent behaviour is
not a problem yet, but it would start to matter at the end of this
series the same reason.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`total_nr' is the total number of items, counting both done and todo,
that are in a todo list. But unlike `nr', it was not updated when an
item was appended to the list.
This variable is mostly used by command prompts (ie. git-prompt.sh and
the like). By forgetting to update it, the original code made it not
reflect the reality, but this flaw was masked by the code calling
unnecessarily read_populate_todo() again to update the variable to its
correct value. At the end of this series, the unnecessary call will be
removed, and the inconsistency addressed by this patch would start to
matter.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
init_topo_walk doesn't reuse an existing topo_walk_info, and currently
leaks the one that might exist on the current rev_info if it was already
used for a topo walk beforehand.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Not doing so can lead to wrong topo-walks when using the revision walk API
consecutively.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's code base already seems to be using `PRIdMAX` without any such
fallback definition for quite a while (75459410ed (json_writer: new
routines to create JSON data, 2018-07-13), to be precise, and the
first Git version to include that commit was v2.19.0). Having a
fallback definition only for `PRIuMAX` is a bit inconsistent.
We do sometimes get portability reports more than a year after the
problem was introduced. This one should be fairly safe. PRIuMAX is
in C99 (for that matter, SCNuMAX, PRIu32 and others also are), and
we've been picking up other C99-isms without complaint.
The PRIuMAX fallback definition was originally added in 3efb1f343a
(Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c.,
2007-02-20). But it was replacing a construct that was introduced in
an even earlier commit, 579d1fbfaf (Add NO_C99_FORMAT to support
older compilers., 2006-07-30), which talks about gcc 2.95.
That's pretty ancient at this point.
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Jeff King <peff@peff.net>
[jc: tweaked both message and code, taking what peff wrote]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 2f328c3d ("reset $sha1 $pathspec: require $sha1 only to be
treeish", 2013-01-14), we allowed "git reset $object -- $path" to reset
individual paths that match the pathspec to take the blob from a tree
object, not necessarily a commit, while the form to reset the tip of the
current branch to some other commit still must be given a commit.
Like resetting with paths, "git reset --patch" does not update HEAD, and
need not require a commit. The path-filtered form, "git reset --patch
$object -- $pathspec", has accepted a tree-ish since 2f328c3d.
"git reset --patch" is documented as accepting a <tree-ish> since
bf44142f ("reset: update documentation to require only tree-ish with
paths", 2013-01-16). Documentation changes are not required.
Loosen the restriction that requires a commit for the unfiltered "git
reset --patch $object".
Signed-off-by: Nika Layzell <nika@thelayzells.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git submodule update' will fetch new commits from the submodule remote
if the SHA-1 recorded in the superproject is not found. This was not
mentioned in the documentation.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git submodule status --cached' reports the SHAs recorded in the
index of the superproject, instead of the SHAs that are checked out
in the submodule.
Signed-off-by: Manish Goregaokar <manishsmail@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git revert' or 'git cherry-pick --edit' is invoked with multiple
commits, then after editing the first commit message is finished both
these commands should continue with processing the second commit and
launch another editor for its commit message, assuming there are
no conflicts, of course.
Alas, this inadvertently changed with commit a47ba3c777 (rebase -i:
check for updated todo after squash and reword, 2019-08-19): after
editing the first commit message is finished, both 'git revert' and
'git cherry-pick --edit' exit with error, claiming that "nothing to
commit, working tree clean".
The reason for the changed behaviour is twofold:
- Prior to a47ba3c777 the up-to-dateness of the todo list file was
only checked after 'exec' instructions, and that commit moved
those checks to the common code path. The intention was that this
check should be performed after instructions spawning an editor
('squash' and 'reword') as well, so the ongoing 'rebase -i'
notices when the user runs a 'git rebase --edit-todo' while
squashing/rewording a commit message.
However, as it happened that check is now performed even after
'revert' and 'pick' instructions when they involved editing the
commit message. And 'revert' by default while 'pick' optionally
(with 'git cherry-pick --edit') involves editing the commit
message.
- When invoking 'git revert' or 'git cherry-pick --edit' with
multiple commits they don't read a todo list file but assemble the
todo list in memory, thus the associated stat data used to check
whether the file has been updated is all zeroed out initially.
Then the sequencer writes all instructions (including the very
first) to the todo file, executes the first 'revert/pick'
instruction, and after the user finished editing the commit
message the changes of a47ba3c777 kick in, and it checks whether
the todo file has been modified. The initial all-zero stat data
obviously differs from the todo file's current stat data, so the
sequencer concludes that the file has been modified. Technically
it is not wrong, of course, because the file just has been written
indeed by the sequencer itself, though the file's contents still
match what the sequencer was invoked with in the beginning.
Consequently, after re-reading the todo file the sequencer
executes the same first instruction _again_, thus ending up in
that "nothing to commit" situation.
The todo list was never meant to be edited during multi-commit 'git
revert' or 'cherry-pick' operations, so perform that "has the todo
file been modified" check only when the sequencer was invoked as part
of an interactive rebase.
Reported-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Turns out that it don't work so well on Vista, see
https://github.com/git-for-windows/git/issues/1742 for details.
According to https://devblogs.microsoft.com/oldnewthing/?p=8873, it
*should* work on Windows Vista and later.
But apparently there are issues on Windows Vista when pipes are
involved. Given that Windows Vista is past its end of life (official
support ended on April 11th, 2017), let's not spend *too* much time on
this issue and just disable the file handle inheritance restriction on
any Windows version earlier than Windows 7.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, CreateProcess() does not inherit any open file handles,
unless the bInheritHandles parameter is set to TRUE. Which we do need to
set because we need to pass in stdin/stdout/stderr to talk to the child
processes. Sadly, this means that all file handles (unless marked via
O_NOINHERIT) are inherited.
This lead to problems in VFS for Git, where a long-running read-object
hook is used to hydrate missing objects, and depending on the
circumstances, might only be called *after* Git opened a file handle.
Ideally, we would not open files without O_NOINHERIT unless *really*
necessary (i.e. when we want to pass the opened file handle as standard
handle into a child process), but apparently it is all-too-easy to
introduce incorrect open() calls: this happened, and prevented updating
a file after the read-object hook was started because the hook still
held a handle on said file.
Happily, there is a solution: as described in the "Old New Thing"
https://blogs.msdn.microsoft.com/oldnewthing/20111216-00/?p=8873 there
is a way, starting with Windows Vista, that lets us define precisely
which handles should be inherited by the child process.
And since we bumped the minimum Windows version for use with Git for
Windows to Vista with v2.10.1 (i.e. a *long* time ago), we can use this
method. So let's do exactly that.
We need to make sure that the list of handles to inherit does not
contain duplicates; Otherwise CreateProcessW() would fail with
ERROR_INVALID_ARGUMENT.
While at it, stop setting errno to ENOENT unless it really is the
correct value.
Also, fall back to not limiting handle inheritance under certain error
conditions (e.g. on Windows 7, which is a lot stricter in what handles
you can specify to limit to).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For some reason, when being called via TortoiseGit the standard handles,
or at least what is returned by _get_osfhandle(0) for standard input,
can take on the value (HANDLE)-2 (which is not a legal value, according
to the documentation).
Even if this value is not documented anywhere, CreateProcess() seems to
work fine without complaints if hStdInput set to this value.
In contrast, the upcoming code to restrict which file handles get
inherited by spawned processes would result in `ERROR_INVALID_PARAMETER`
when including such handle values in the list.
To help this, special-case the value (HANDLE)-2 returned by
_get_osfhandle() and replace it with INVALID_HANDLE_VALUE, which will
hopefully let the handle inheritance restriction work even when called
from TortoiseGit.
This fixes https://github.com/git-for-windows/git/issues/1481
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When spawning child processes, we really should be careful which file
handles we let them inherit.
This is doubly important on Windows, where we cannot rename, delete, or
modify files if there is still a file handle open.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_TEST_CLONE_2GB environment variable is only ever checked with
'test -z' in 't5608-clone-2gb.sh', so any non-empty value is
interpreted as "yes, run these expensive tests", even
'GIT_TEST_CLONE_2GB=NoThanks'.
Similar GIT_TEST_* environment variables have already been turned into
bools in 3b072c577b (tests: replace test_tristate with "git
env--helper", 2019-06-21), so let's turn GIT_TEST_CLONE_2GB into a
bool as well, to follow suit.
Our CI builds set GIT_TEST_CLONE_2GB=YesPlease, so adjust them
accordingly, thus removing the last 'YesPlease' from our CI scripts.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 3b072c577b (tests: replace test_tristate with "git env--helper",
2019-06-21) we get the normalized bool values of various GIT_TEST_*
environment variables via 'git env--helper'. Now, while the 'git
env--helper' command itself does catch invalid values in the
environment variable or in the given --default and exits with error
(exit code 128 or 129, respectively), it's invoked in conditions like
'if ! git env--helper ...', which means that all invalid bool values
are interpreted the same as the ordinary 'false' (exit code 1). This
has led to inadvertently skipped httpd tests in our CI builds for a
couple of weeks, see 3960290675 (ci: restore running httpd tests,
2019-09-06).
Let's be more careful about what the test suite accepts as bool values
in GIT_TEST_* environment variables, and error out loud and clear on
invalid values instead of simply skipping tests. Add the
'test_bool_env' helper function to encapsulate the invocation of 'git
env--helper' and the verification of its exit code, and replace all
invocations of that command in our test framework and test suite with
a call to this new helper (except in 't0017-env-helper.sh', of
course).
$ GIT_TEST_GIT_DAEMON=YesPlease ./t5570-git-daemon.sh
fatal: bad numeric config value 'YesPlease' for 'GIT_TEST_GIT_DAEMON': invalid unit
error: test_bool_env requires bool values both for $GIT_TEST_GIT_DAEMON and for the default fallback
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes a regression introduced in 356ee4659b ("sequencer: try to
commit without forking 'git commit'", 2017-11-24). When amending a
commit try_to_commit() was using the wrong parent when checking if the
commit would be empty. When amending we need to check against HEAD^ not
HEAD.
t3403 may not seem like the natural home for the new tests but a further
patch series will improve the advice printed by `git commit`. That
series will mutate these tests to check that the advice includes
suggesting `rebase --skip` to skip the fixup that would empty the
commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We deprecated `--preserve-merges` in favor of `--rebase-merges`; Let's
reflect that in `git svn`.
Note: Even when the user asks for `--preserve-merges`, we now silently
pass `--rebase-merges` to `git rebase` instead. Technically, this is a
change of behavior. But practically, `git svn` only ever asks for a
non-interactive rebase, and `--preserve-merges` and `--rebase-merges`
are on par with regard to that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The VALIDSIG status line from GnuPG with --status-fd is documented to
have 9 required and 1 optional fields [1]. The final, and optional,
field is used to specify the fingerprint of the primary key that made
the signature in case it was made by a subkey. However, this field is
only available for OpenPGP signatures; not for CMS/X.509.
If the VALIDSIG status line does not have the optional 10th field, the
current code will continue reading onto the next status line. And this
is the case for non-OpenPGP signatures [1].
The consequence is that a subsequent status line may be considered as
the "primary key" for signatures that does not have an actual primary
key.
Limit the search of these 9 or 10 fields to the single line to avoid
this problem. If the 10th field is missing, report that there is no
primary key fingerprint.
[Reference]
[1] GnuPG Details, General status codes
https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=blob;f=doc/DETAILS;h=6ce340e8c04794add995e84308bb3091450bd28f;hb=HEAD#l483
The documentation says:
VALIDSIG <args>
The args are:
- <fingerprint_in_hex>
- <sig_creation_date>
- <sig-timestamp>
- <expire-timestamp>
- <sig-version>
- <reserved>
- <pubkey-algo>
- <hash-algo>
- <sig-class>
- [ <primary-key-fpr> ]
This status indicates that the signature is cryptographically
valid. [...] PRIMARY-KEY-FPR is the fingerprint of the primary key
or identical to the first argument.
The primary-key-fpr parameter is used for OpenPGP and not available
for CMS signatures. [...]
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a static replace_cstring() function to simplify repeated
pattern of free-and-xmemdupz() for GPG status line parsing.
This also helps us avoid potential memleaks if parsing of new status
lines are introduced in the future.
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index-merge performed by 'git sparse-checkout' will erase any staged
changes, which can lead to data loss. Prevent these attempts by requiring
a clean 'git status' output.
Helped-by: Szeder Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git sparse-checkout init' subcommand previously wrote directly
to the sparse-checkout file and then updated the working directory.
This may fail if there are modified files not included in the initial
pattern set. However, that left a populated sparse-checkout file.
Use the in-process working directory update to guarantee that the
init subcommand only changes the sparse-checkout file if the working
directory update succeeds.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During the development of the sparse-checkout "cone mode" feature,
an incorrect placement of the initializer for "use_cone_patterns = 1"
caused warnings to show up when a .gitignore file was present with
non-cone-mode patterns. This was fixed in the original commit
introducing the cone mode, but now we should add a test to avoid
hitting this problem again in the future.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If two 'git sparse-checkout set' subcommands are launched at the
same time, the behavior can be unexpected as they compete to write
the sparse-checkout file and update the working directory.
Take a lockfile around the writes to the sparse-checkout file. In
addition, acquire this lock around the working directory update
to avoid two commands updating the working directory in different
ways.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git sparse-checkout disable' subcommand returns a user to a
full working directory. The old process for doing this required
updating the sparse-checkout file with the "/*" pattern and then
updating the working directory with core.sparseCheckout enabled.
Finally, the sparse-checkout file could be removed and the config
setting disabled.
However, it is valuable to keep a user's sparse-checkout file
intact so they can re-enable the sparse-checkout they previously
used with 'git sparse-checkout init'. This is now possible with
the in-process mechanism for updating the working directory.
Reported-by: Szeder Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout builtin used 'git read-tree -mu HEAD' to update the
skip-worktree bits in the index and to update the working directory.
This extra process is overly complex, and prone to failure. It also
requires that we write our changes to the sparse-checkout file before
trying to update the index.
Remove this extra process call by creating a direct call to
unpack_trees() in the same way 'git read-tree -mu HEAD' does. In
addition, provide an in-memory list of patterns so we can avoid
reading from the sparse-checkout file. This allows us to test a
proposed change to the file before writing to it.
An earlier version of this patch included a bug when the 'set' command
failed due to the "Sparse checkout leaves no entry on working directory"
error. It would not rollback the index.lock file, so the replay of the
old sparse-checkout specification would fail. A test in t1091 now
covers that scenario.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a user provides folders A/ and A/B/ for inclusion in a cone-mode
sparse-checkout file, the parsing logic will notice that A/ appears
both as a "parent" type pattern and as a "recursive" type pattern.
This is unexpected and hence will complain via a warning and revert
to the old logic for checking sparse-checkout patterns.
Prevent this from happening accidentally by sanitizing the folders
for this type of inclusion in the 'git sparse-checkout' builtin.
This happens in two ways:
1. Do not include any parent patterns that also appear as recursive
patterns.
2. Do not include any recursive patterns deeper than other recursive
patterns.
In order to minimize duplicate code for scanning parents, create
hashmap_contains_parent() method. It takes a strbuf buffer to
avoid reallocating a buffer when calling in a tight loop.
Helped-by: Eric Wong <e@80x24.org>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a large repository has many sparse-checkout patterns, the
process for updating the skip-worktree bits can take long enough
that a user gets confused why nothing is happening. Update the
clear_ce_flags() method to write progress.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout feature in "cone mode" can use the fact that
the recursive patterns are "connected" to the root via parent
patterns to decide if a directory is entirely contained in the
sparse-checkout or entirely removed.
In these cases, we can skip hashing the paths within those
directories and simply set the skipworktree bit to the correct
value.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To make the cone pattern set easy to use, update the behavior of
'git sparse-checkout (init|set)'.
Add '--cone' flag to 'git sparse-checkout init' to set the config
option 'core.sparseCheckoutCone=true'.
When running 'git sparse-checkout set' in cone mode, a user only
needs to supply a list of recursive folder matches. Git will
automatically add the necessary parent matches for the leading
directories.
When testing 'git sparse-checkout set' in cone mode, check the
error stream to ensure we do not see any errors. Specifically,
we want to avoid the warning that the patterns do not match
the cone-mode patterns.
Helped-by: Eric Wong <e@80x24.org>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The parent and recursive patterns allowed by the "cone mode"
option in sparse-checkout are restrictive enough that we
can avoid using the regex parsing. Everything is based on
prefix matches, so we can use hashsets to store the prefixes
from the sparse-checkout file. When checking a path, we can
strip path entries from the path and check the hashset for
an exact match.
As a test, I created a cone-mode sparse-checkout file for the
Linux repository that actually includes every file. This was
constructed by taking every folder in the Linux repo and creating
the pattern pairs here:
/$folder/
!/$folder/*/
This resulted in a sparse-checkout file sith 8,296 patterns.
Running 'git read-tree -mu HEAD' on this file had the following
performance:
core.sparseCheckout=false: 0.21 s (0.00 s)
core.sparseCheckout=true: 3.75 s (3.50 s)
core.sparseCheckoutCone=true: 0.23 s (0.01 s)
The times in parentheses above correspond to the time spent
in the first clear_ce_flags() call, according to the trace2
performance traces.
While this example is contrived, it demonstrates how these
patterns can slow the sparse-checkout feature.
Helped-by: Eric Wong <e@80x24.org>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout feature can have quadratic performance as
the number of patterns and number of entries in the index grow.
If there are 1,000 patterns and 1,000,000 entries, this time can
be very significant.
Create a new Boolean config option, core.sparseCheckoutCone, to
indicate that we expect the sparse-checkout file to contain a
more limited set of patterns. This is a separate config setting
from core.sparseCheckout to avoid breaking older clients by
introducing a tri-state option.
The config option does nothing right now, but will be expanded
upon in a later commit.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git updates the working directory with the sparse-checkout
feature enabled, the unpack_trees() method calls clear_ce_flags()
to update the skip-wortree bits on the cache entries. This
check can be expensive, depending on the patterns used.
Add trace2 regions around the method, including some flag
information, so we can get granular performance data during
experiments. This data will be used to measure improvements
to the pattern-matching algorithms for sparse-checkout.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The instructions for disabling a sparse-checkout to a full
working directory are complicated and non-intuitive. Add a
subcommand, 'git sparse-checkout disable', to perform those
steps for the user.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git sparse-checkout set' subcommand takes a list of patterns
and places them in the sparse-checkout file. Then, it updates the
working directory to match those patterns. For a large list of
patterns, the command-line call can get very cumbersome.
Add a '--stdin' option to instead read patterns over standard in.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git sparse-checkout set' subcommand takes a list of patterns
as arguments and writes them to the sparse-checkout file. Then, it
updates the working directory using 'git read-tree -mu HEAD'.
The 'set' subcommand will replace the entire contents of the
sparse-checkout file. The write_patterns_and_update() method is
extracted from cmd_sparse_checkout() to make it easier to implement
'add' and/or 'remove' subcommands in the future.
If the core.sparseCheckout config setting is disabled, then enable
the config setting in the worktree config. If we set the config
this way and the sparse-checkout fails, then re-disable the config
setting.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When someone wants to clone a large repository, but plans to work
using a sparse-checkout file, they either need to do a full
checkout first and then reduce the patterns they included, or
clone with --no-checkout, set up their patterns, and then run
a checkout manually. This requires knowing a lot about the repo
shape and how sparse-checkout works.
Add a new '--sparse' option to 'git clone' that initializes the
sparse-checkout file to include the following patterns:
/*
!/*/
These patterns include every file in the root directory, but
no directories. This allows a repo to include files like a
README or a bootstrapping script to grow enlistments from that
point.
During the 'git sparse-checkout init' call, we must first look
to see if HEAD is valid, since 'git clone' does not have a valid
HEAD at the point where it initializes the sparse-checkout. The
following checkout within the clone command will create the HEAD
ref and update the working directory correctly.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Getting started with a sparse-checkout file can be daunting. Help
users start their sparse enlistment using 'git sparse-checkout init'.
This will set 'core.sparseCheckout=true' in their config, write
an initial set of patterns to the sparse-checkout file, and update
their working directory.
Make sure to use the `extensions.worktreeConfig` setting and write
the sparse checkout config to the worktree-specific config file.
This avoids confusing interactions with other worktrees.
The use of running another process for 'git read-tree' is sub-
optimal. This will be removed in a later change.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout feature is mostly hidden to users, as its
only documentation is supplementary information in the docs for
'git read-tree'. In addition, users need to know how to edit the
.git/info/sparse-checkout file with the right patterns, then run
the appropriate 'git read-tree -mu HEAD' command. Keeping the
working directory in sync with the sparse-checkout file requires
care.
Begin an effort to make the sparse-checkout feature a porcelain
feature by creating a new 'git sparse-checkout' builtin. This
builtin will be the preferred mechanism for manipulating the
sparse-checkout file and syncing the working directory.
The documentation provided is adapted from the "git read-tree"
documentation with a few edits for clarity in the new context.
Extra sections are added to hint toward a future change to
a more restricted pattern set.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index might be aware that a file hasn't modified via fsmonitor, but
unpack-trees did not pay attention to it and checked via ie_match_stat
which can be inefficient on certain filesystems. This significantly slows
down commands that run oneway_merge, like checkout and reset --hard.
This patch makes oneway_merge check whether a file is considered
unchanged through fsmonitor and skips ie_match_stat on it. unpack-trees
also now correctly copies over fsmonitor validity state from the source
index. Finally, for correctness, we force a refresh of fsmonitor state in
tweak_fsmonitor.
After this change, commands like stash (that use reset --hard
internally) go from 8s or more to ~2s on a 250k file repository on a
mac.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Utsav Shah <utsav@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code style for tests is to have statements on their own line if
possible. Move the `then` onto its own line so that it conforms with the
test style.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, if a git command fails in an unexpected way, such as a
segfault, it will be masked and ignored. Replace the ! with
test_must_fail so that only expected failures pass.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous patches, the mechanical application of changes left some
duplicate statements in the test case which were not strictly incorrect
but were redundant and possibly misleading. Remove these duplicate
statements so that it is clear that the intent behind the tests are that
the content of the file stays the same throughout the whole test case.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently have many instances of `test <line> = $(cat <file>)` and
`test $(cat <file>) = <line>`. In the case where this fails, it will be
difficult for a developer to debug since the output will be masked.
Replace these instances with invocations of test_cmp().
This change was done with the following GNU sed expressions:
s/\(\s*\)test \([^=]*\)= "$(cat \([^)]*\))"/\1echo \2>expect \&\&\n\1test_cmp expect \3/
s/\(\s*\)test "$(cat \([^)]*\))" = \([^&]*\)\( &&\)\?$/\1echo \3 >expect \&\&\n\1test_cmp expect \2\4/
A future patch will clean up situations where we have multiple duplicate
statements within a test case. This is done to keep this patch purely
mechanical.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, if the invocation of git failed, it would be masked by the pipe
since only the return code of the last element of a pipe is used.
Rewrite the test to put the git command on its own line so its return
code is not masked.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In case an invocation of a git command fails within the command
substitution, the failure will be masked. Replace the command
substitution with a file-redirection and a call to test_cmp.
This change was done with the following GNU sed expressions:
s/\(\s*\)test \([^ ]*\) = "$(\(git [^)]*\))"/\1echo \2 >expect \&\&\n\1\3 >actual \&\&\n\1test_cmp expect actual/
s/\(\s*\)test "$(\(git [^)]*\))" = \([^ ]*\)/\1echo \3 >expect \&\&\n\1\2 >actual \&\&\n\1test_cmp expect actual/
A future patch will clean up situations where we have multiple duplicate
statements within a test case. This is done to keep this patch purely
mechanical.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In case an invocation of `git rev-list` fails within the command
substitution, the failure will be masked. Remove the command
substitution and use test_cmp_rev() so that failures can be discovered.
This change was done with the following sed expressions:
s/test "$(git rev-parse.* \([^)]*\))" = "$(git rev-parse \([^)]*\))"/test_cmp_rev \1 \2/
s/test \([^ ]*\) = "$(git rev-parse.* \([^)]*\))"/test_cmp_rev \1 \2/
s/test "$(git rev-parse.* \([^)]*\))" != "$(git rev-parse.* \([^)]*\))"/test_cmp_rev ! \1 \2/
s/test \([^ ]*\) != "$(git rev-parse.* \([^)]*\))"/test_cmp_rev ! \1 \2/
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When wrapping a git command in a command substitution within another
command, we throw away the git command's exit code. In case the git
command fails, we would like to know about it rather than the failure
being silent. Extract git commands so that their exit codes are not
lost.
Instead of using `test -n` or `test -z`, replace them respectively with
invocations of test_file_not_empty() and test_must_be_empty() so that we
get better debugging information in the case of a failure.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of rolling our own functionality to test the number of lines a
command outputs, use test_line_count() which provides better debugging
information in the case of a failure.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The style for tests in Git is to have the redirect operator attached to
the filename with no spaces. Fix test cases where this is not the case.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although `test -f` has the same functionality as test_path_is_file(), in
the case where test_path_is_file() fails, we get much better debugging
information.
Replace `test -f` with test_path_is_file() so that future developers
will have a better experience debugging these test cases.
Also, in the case of `! test -f`, not only should that path not be a
file, it shouldn't exist at all so replace it with
test_path_is_missing().
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We were using a redirection operator to feed input into sed. However,
since sed is capable of opening its own files, make sed open its own
files instead of redirecting input into it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The usual convention is for test case names to be written between
single-quotes. Change all double-quoted test case names to single-quotes
except for two test case names that use variables within.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the test style by removing leading and trailing empty lines
within test cases. Also, reformat multi-line subshells to conform to the
existing style.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the case where we are using test_cmp_rev() to report not-equals, we
write `! test_cmp_rev`. However, since test_cmp_rev() contains
r1=$(git rev-parse --verify "$1") &&
r2=$(git rev-parse --verify "$2") &&
`! test_cmp_rev` will succeed if any of the rev-parses fail. This
behavior is not desired. We want the rev-parses to _always_ be
successful.
Rewrite test_cmp_rev() to optionally accept "!" as the first argument to
do a not-equals comparison. Rewrite `! test_cmp_rev` to `test_cmp_rev !`
in all tests to take advantage of this new functionality.
Also, rewrite the rev-parse logic to end with a `|| return 1` instead of
&&-chaining into the rev-comparison logic. This makes it obvious to
future readers that we explicitly intend on returning early if either of
the rev-parses fail.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to POSIX enhancement request '0000767: Add built-in
"local"'[1],
dash only allows one variable in a local definition; it permits
assignment though it doesn't document that clearly.
however, this isn't true since t0000 still passes with this patch
applied on dash 0.5.10.2. Needless to say, since `local` isn't POSIX
standardized, it is not exactly clear what `local` entails on different
versions of different shells.
We currently already have many instances of multiple local assignments
in our codebase. Ensure that this is actually supported by explicitly
testing that it is sane.
[1]: http://austingroupbugs.net/bug_view_page.php?bug_id=767
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since format-patch accepts `--[no-]notes`, one would expect the
range-diff generated to also respect the setting. Unfortunately, the
range-diff we currently generate only uses the default option (which
always outputs default notes, even when notes are not being used
elsewhere).
Pass the notes configuration to range-diff so that it can honor it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a commit being range-diff'd has a note attached to it, the note
will be compared as well. However, if a user has multiple notes refs or
if they want to suppress notes from being printed, there is currently no
way to do this.
Pass through `--[no-]notes[=<ref>]` to the `git log` call so that this
option is customizable.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When notes were included in the output of range-diff, they were just
mashed together with the rest of the commit message. As a result, users
wouldn't be able to clearly distinguish where the commit message ended
and where the notes started.
Output a `## Notes ##` header when notes are detected so that notes can
be compared more clearly.
Note that we handle case of `Notes (<ref>): -> ## Notes (<ref>) ##` with
this code as well. We can't test this in this patch, however, since
there is currently no way to pass along different notes refs to `git
log`. This will be fixed in a future patch.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test suite had a blindspot where it did not check the behavior of
range-diff and format-patch when notes were present. Cover this
blindspot.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For test cases, the usual convention is to name expected output files
"expect", not "expected". Replace all instances of "expected" with
"expect".
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the first heredoc, parameter substitution is not used so prevent it
from happening in the future (perhaps by accident) by escaping the limit
EOF.
The remaining heredocs use parameter substitution so they cannot be
changed.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For shell scripts, the usual convention is for there to be no space
after redirection operators, (e.g. `>file`, not `> file`). Remove the
one instance of this.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although `--notes=` accepts and handles a tree-ish just fine, it isn't a
common use-case for users to pass in bare trees. By having "treeish", it
makes it harder to click in users' minds that the argument here is the
same type as the `notes.displayRef` configuration variable, for example.
Change `treeish` to `ref` so that it will be easier for users to make
this connection.
Note that we don't completely lose the notion that `--notes=` can accept
a tree-ish. In git-notes.txt, we have
It is also permitted for a notes ref to point directly to a tree
object, in which case the history of the notes can be read with
`git log -p -g <refname>`.
which means that a hardcore user who wants to take advantage of this
obscure use-case will be able to infer the connection and not be
completely left in the dark.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python's async functions (declared with "async def" rather than "def")
were not being displayed in hunk headers. This commit teaches git about
the async function syntax, and adds tests for the Python userdiff regex.
Signed-off-by: Josh Holland <anowlcalledjosh@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since Git was taught the `--pretty=reference` option, it is no longer
necessary to manually specify the format string to get the commit
reference. Teach users to use the new option while keeping the old
invocation around in case they have an older version of Git.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The standard format for referencing other commits within some projects
(such as git.git) is the reference format. This is described in
Documentation/SubmittingPatches as
If you want to reference a previous commit in the history of a stable
branch, use the format "abbreviated hash (subject, date)", like this:
....
Commit f86a374 (pack-bitmap.c: fix a memleak, 2015-03-30)
noticed that ...
....
Since this format is so commonly used, standardize it as a pretty
format.
The tests that are implemented essentially show that the format-string
does not change in response to various log options. This is useful
because, for future developers, it shows that we've considered the
limitations of the "canned format-string" approach and we are fine with
them.
Based-on-a-patch-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we plan on having a pretty format which will use a
default date format unless otherwise overidden. Add support for this by
adding a `default_date_mode_type` member in `struct cmt_fmt_map`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the placeholders %as and %cs to format author date and committer
date, respectively, without the time part, like --date=short does, i.e.
like YYYY-MM-DD.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test suite does not include any tests where `--reflog` and `-z` are
used together in `git log`. Cover this blindspot. Note that the
`--pretty=oneline` case is written separately because it follows a
slightly different codepath.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of memsetting and then initializing the fields in the struct,
move the initialization of `format_context` to its assignment.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_revision_mark() used to return a `char *`, even though all of the
strings it was returning were string literals. Make get_revision_mark()
return a `const char *` so that callers won't be tempted to modify the
returned string.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Quoting SZEDER Gábor[1],
SubmittingPatches is simply wrong: our de-facto standard format for
referencing other commits does not enclose the subject in a pair of
double-quotes:
$ git log v2.24.0 |grep -E '[0-9a-f]{7} \("' |wc -l
785
$ git log v2.24.0 |grep -E '[0-9a-f]{7} \([^"]' |wc -l
2276
Those double-quotes don't add any value to the references, but they
result in weird looking references for 1083 of our commits whose
subject lines happen to end with double-quotes, e.g.:
f23a465132 ("hashmap_get{,_from_hash} return "struct hashmap_entry *"", 2019-10-06)
and without those unnecessary pair of double-quotes we would have
~3000 more commits whose summary would fit on a single line.
Remove references to the enclosing double-quotes from SubmittingPatches
since our de-facto standard for referencing commits does not actually
use them.
[1]: cf. <20191114011048.GS4348@szeder.dev>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since Git is planning on upgrading from SHA-1 to be more hash-agnostic,
replace specific references to SHA-1 with more generic terminology.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since Git is planning on upgrading from SHA-1 to be more hash-agnostic,
replace specific references to SHA-1 with more generic terminology.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In ab18b2c0df ("log/pretty-options: Document --[no-]notes and deprecate
old notes options", 2011-03-30), the `--show-notes` option was
deprecated. However, this reference to it still remains. Change it to
reference the replacement option: `--notes`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Decisions taken for simplicity:
1) For now, `--pathspec-from-file` is declared incompatible with
`--interactive/--patch`, even when <file> is not `stdin`. Such use
case it not really expected. Also, it would require changes to
`interactive_add()`.
2) It is not allowed to pass pathspec in both args and file.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git add` shows an example of good writing, follow it.
This also better disambiguates <file>... header.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Decisions taken for simplicity:
1) For now, `--pathspec-from-file` is declared incompatible with
`--patch`, even when <file> is not `stdin`. Such use case it not
really expected. Also, it is harder to support in `git commit`, so
I decided to make it incompatible in all places.
2) It is not allowed to pass pathspec in both args and file.
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will be used to support the new option '--pathspec-from-file' in
`git add`, `git-commit`, `git reset` etc.
Note also that we specifically handle CR/LF line endings to support
Windows better.
To simplify code, file is first parsed into `argv_array`. This allows
to avoid refactoring `parse_pathspec()`.
I considered adding `nul_term_line` to `flags` instead, but decided
that it doesn't fit there.
The new code is mostly taken from `cmd_update_index()` and
`split_mail_conv()`.
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The OSX CI build procedure updates the homebrew-cask repository
before attempting to install perforce again, after seeing an
installation failure. This involves a "git pull" that by default
computes and outputs diffstat, which would only grow as the time
goes by and the repository cast in stone in the CI build image
becomes more and more stale relative to the upstream repository in
the outside world.
Suppress the diffstat to both save cycles to generate it, and strain
on the eyeballs to skip it.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to work correctly, git-rebase --rebase-merges needs to make
initial todo list with unique labels.
Those unique labels is being handled by employing a hashmap and
appending an unique number if any duplicate is found.
But, we forget that beside those labels for side branches,
we also have a special label `onto' for our so-called new-base.
In a special case that any of those labels for side branches named
`onto', git will run into trouble.
Correct it.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git unpack-objects' shows a progress line only counting the number of
unpacked objects, so if some of the received objects are unusually
large, then that progress might appear to be frozen while processing
such a larger object. I just stared at a seemingly stuck progress
line for over half a minute, while 'git fetch' was busy receiving a
pack with only a couple of objects (i.e. fewer than
'fetch.unpackLimit'), with one of them being over 80MB.
Display throughput in 'git unpack-objects' progress line, so we show
that something is going on even when receiving and processing a large
object.
Counting the consumed bytes is far away from the place that
counts objects and displays progress, and to pass around the 'struct
progress' instance we would have to modify the signature of five
functions and 14 of their callsites: this is just too much churn, so
let's rather make it file-scope static.
'git index-pack', i.e. the non-unpacking cousin of 'git
unpack-objects' already includes throughput in its progress line, and
it uses a file-scope static 'struct progress' instance as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since it was introduced in 7cceca5ccc (Add 'git rev-parse
--show-toplevel' option., 2010-01-12), the --show-toplevel option has
treated a missing working tree as a quiet success: it neither prints a
toplevel path, but nor does it report any kind of error.
While a caller could distinguish this case by looking for an empty
response, the behavior is rather confusing. We're better off complaining
that there is no working tree, as other internal commands would do in
similar cases (e.g., "git status" or any builtin with NEED_WORK_TREE set
would just die()). So let's do the same here.
While we're at it, let's clarify the documentation and add some tests,
both for the new behavior and for the more mundane case (which was not
covered).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add trace2 regions to fetch-pack.c to better track time spent in the various
phases of a fetch:
* parsing remote refs and finding a cutoff
* marking local refs as complete
* marking complete remote refs as common
All stages could potentially be slow for repositories with many refs.
Signed-off-by: Erik Chen <erikchen@chromium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-run-command.txt
to run-command.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Documentation/technical/api-run-command.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the functions documentation from
Documentation/technical/api-trace2.txt to trace2.h as it's easier for the
developers to find the usage information beside the code instead of looking
for it in another doc file.
Only the functions documentation section is removed from
Documentation/technical/api-trace2.txt as the file is full of
details that seemed more appropriate to be in a separate doc file
as it is, with a link to the doc file added in the trace2.h.
Also the functions doc is removed to avoid having redundandt info which
will be hard to keep syncronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a link to Documentation/technical/api-parse-options.txt in parse-options.h
So the developers would know where to find more info about the API.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-submodule-config.txt
to submodule-config.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Documentation/technical/api-submodule-config.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
The documentation of parse_submodule_config_option() is discarded as the
function was removed 2 years ago.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-credentials.txt
to credential.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Documentation/technical/api-credentials.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Documentation/git-credential.txt and Documentation/gitcredentials.txt now link
to credential.h instead of Documentation/technical/api-credentials.txt for
details about the credetials API.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-tree-walking.txt
to tree-walk.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Documentation/technical/api-tree-walking.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-argv-array.txt
to argv-array.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Also documentation/technical/api-argv-array.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-trace.txt
to trace.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Documentation/technical/api-trace.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-allocation-growing.txt
to cache.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Also documentation/technical/api-allocation-growing.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-sigchain.txt
to sigchain.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Also documentation/technical/api-sigchain.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-setup.txt
to pathspec.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Also documentation/technical/api-setup.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-revision-walking.txt
to revision.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Also documentation/technical/api-revision-walking.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-gitattributes.txt
to attr.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Also documentation/technical/api-gitattributes.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-ref-iteration.txt
to refs.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Also documentation/technical/api-ref-iteration.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-remote.txt to
remote.h and refspec.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
N.B. The doc for both push and fetch members of the remote struct aren't
moved because they are out of date, as the members were changed from arrays
of rspecs to struct refspec 2 years ago.
Also documentation/technical/api-remote.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-oid-array.txt to
sha1-array.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Also documentation/technical/api-oid-array.txt is removed because the
information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the related documentation from Documentation/technical/api-merge.txt
to ll-merge.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Only the ll-merge related doc is removed from
documentation/technical/api-merge.txt because this information will be
redundant and it'll be hard to keep it up to date and synchronized with
the documentation in ll-merge.h.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-history-graph.txt to
graph.h and graph.c as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
The graph library was already well documented, so few comments were added to
both graph.h and graph.c
Also documentation/technical/api-history-graph.txt is removed because
the information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header file.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-directory-listing.txt
to dir.h as it's easier for the developers to find the usage information
beside the code instead of looking for it in another doc file.
Also documentation/technical/api-directory-listing.txt is removed because
the information it has is now redundant and it'll be hard to keep it up to
date and synchronized with the documentation in the header files.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-diff.txt to both
diff.h and diffcore.h as it's easier for the developers to find the usage
information beside the code instead of looking for it in another doc file.
Also documentation/technical/api-diff.txt is removed because the information
it has is now redundant and it'll be hard to keep it up to date and
synchronized with the documentation in the header files.
There are three members documented in the doc file that weren't found in
the header files, assuming the doc wasn't up to date and the members
no longer exist:
touched_flags, COLOR_DIFF_WORDS and QUIET.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `label` todo command in interactive rebases creates temporary refs
in the `refs/rewritten/` namespace. These refs are stored as loose refs,
i.e. as files in `.git/refs/rewritten/`, therefore they have to conform
with file name limitations on the current filesystem in addition to the
accepted ref format.
This poses a problem in particular on NTFS/FAT, where e.g. the colon,
double-quote and pipe characters are disallowed as part of a file name.
Let's safeguard against this by replacing not only white-space
characters by dashes, but all non-alpha-numeric ones.
However, we exempt non-ASCII UTF-8 characters from that, as it should be
quite possible to reflect branch names such as `↯↯↯` in refs/file names.
Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the trickier aspects of the design of `git rebase
--rebase-merges` is the way labels are generated for the initial todo
list: those labels are supposed to be intuitive and first and foremost
unique.
To that end, `label_oid()` appends a unique suffix when necessary.
Those labels not only need to be unique, but they also need to be valid
refs. To make sure of that, `make_script_with_merges()` replaces
whitespace by dashes.
That would appear to be the wrong layer for that sanitizing step,
though: all callers of `label_oid()` should get that same benefit.
Even if it does not make a difference currently (the only called of
`label_oid()` that passes a label that might need to be sanitized _is_
`make_script_with_merges()`), let's move the responsibility for
sanitizing labels into the `label_oid()` function.
This commit is best viewed with `-w` because it unfortunately needs to
change the indentation of a large block of code in `label_oid()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This imitates the code to show the help text from the Perl script
`git-add--interactive.perl` in the built-in version.
To make sure that it renders exactly like the Perl version of `git add
-i`, we also add a test case for that to `t3701-add-interactive.sh`.
Signed-off-by: Slavica Đukić <slawica92@hotmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error messages as well as the unique prefixes are colored in `git
add -i` by default; We need to do the same in the built-in version.
Signed-off-by: Slavica Đukić <slawica92@hotmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With this change, we print out the same colored help text that the
Perl-based `git add -i` prints in the main loop when question mark is
entered.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like in the Perl script `git-add--interactive.perl`, for each
command a unique prefix is determined (if there exists any within the
given parameters), and shown in the list, and accepted as a shortcut for
the command.
To determine the unique prefixes, as well as to look up the command in
question, we use a copy of the list and sort it.
While this might seem like overkill for a single command, it will make
much more sense when all the commands are implemented, and when we reuse
the same logic to present a list of files to edit, with convenient
unique prefixes.
At the start of the development of this patch series, a dedicated data
structure was introduced that imitated the Trie that the Perl version
implements. However, this was deemed overkill, and we now simply sort
the list before determining the length of the unique prefixes by looking
at each item's neighbor. As a bonus, we now use the same sorted list to
perform a binary search using the user-provided prefix as search key.
Original-patch-by: Slavica Đukić <slawica92@hotmail.com>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reason why we did not start with the main loop to begin with is that
it is the first user of `list_and_choose()`, which uses the `list()`
function that we conveniently introduced for use by the `status`
command.
In contrast to the Perl version, in the built-in interactive `add`, we
will keep the `list()` function (which only displays items) and the
`list_and_choose()` function (which uses `list()` to display the items,
and only takes care of the "and choose" part) separate.
The `list_and_choose()` function, as implemented in
`git-add--interactive.perl` knows a few more tricks than the function we
introduce in this patch:
- There is a flag to let the user select multiple items.
- In multi-select mode, the list of items is prefixed with a marker
indicating what items have been selected.
- Initially, for each item a unique prefix is determined (if there
exists any within the given parameters), and shown in the list, and
accepted as a shortcut for the selection.
These features will be implemented in the C version later.
This patch does not add any new main loop command, of course, the
built-in `git add -i` still only supports the `status` command. The
remaining commands to follow over the course of the next commits.
To accommodate for listing the commands in columns, preparing for the
commands that will be implemented over the course of the next
patches/patch series, we teach the `list()` function to do precisely
that.
Note that we only have a prompt ending in a single ">" at this stage;
later commits will add commands that display a double ">>" to indicate
that the user is in a different loop than the main one.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's a cross-site scripting problem in gitweb, where it will print
URLs generated by its href() helper without further quoting. This allows
an attacker to point a victim to a specially crafted gitweb URL and
inject arbitrary HTML into the resulting page (which the victim sees as
coming from gitweb).
The base of the URL comes from evaluate_uri(), which pulls the value of
$REQUEST_URI via the CGI module. It tries to strip off $PATH_INFO, but
fails to do so in some cases (including ones that contain special
characters, like "+"). Most of the uses of the URL end up being passed
to "$cgi->a(-href = href())", which will get quoted properly by the CGI
module. But in a few places, we output them ourselves as part of
manually-generated HTML, and whatever was in the original URL will
appear unquoted in the output.
Given that all of the nearby variables placed into this manual HTML
_are_ quoted, it seems like the authors assumed that these URLs would
not need quoting. So it's possible that the bug is actually in
evaluate_uri(), which should be doing a more careful job of stripping
$PATH_INFO. There's some discussion in a comment in that function, as
well as the commit message in 81d3fe9f48 (gitweb: fix wrong base URL
when non-root DirectoryIndex, 2009-02-15). But I'm not sure I understand
it.
Regardless, it's a good idea to quote these values at the point of
insertion into the HTML output:
1. Even if there is a bug in evaluate_uri(), this would give us
belt-and-suspenders protection.
2. evaluate_uri() is only handling the base. Some generated URLs will
also mention arbitrary refs or filenames in the repositories, and
these should be quoted anyway.
3. It should never _hurt_ to quote (and that's what all of the
$cgi->a() calls are doing already).
So there may be further work here, but this patch at least prevents the
XSS vulnerability, and shouldn't make anything worse.
The test here covers the calls in print_feed_meta(), but I manually
audited every call to href() to see how its output was used, and quoted
appropriately. Most of them are esc_attr(), as they're used in tag
attributes, but I used esc_html() when the URLs were printed bare. The
distinction is largely academic, as one is implemented as a wrapper for
the other.
Reported-by: NAKAYAMA DAISUKE <nakyamad@icloud.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a real webserver's CGI call, gitweb.cgi would typically see
$REQUEST_URI set. This variable does impact how we display our URL in
the resulting page, so let's try to make our test as realistic as
possible (we can just use the $PATH_INFO our caller passed in, if any).
This doesn't change the outcome of any tests, but it will help us add
some new tests in a future patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some variables assignments in gitweb_run() look like this:
FOO=""$1""
The extra quotes aren't doing anything. Each set opens and closes an
empty string, and $1 is actually outside of any double-quotes (which is
OK, because variable assignment does not do whitespace splitting on the
expanded value).
Let's drop them, as they're simply confusing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is just a thin wrapper around gitweb_run(), which takes
multiple arguments. But we only pass along "$1". Let's pass everything
we get, which will let a future patch add an XSS test that affects
PATH_INFO (which gitweb_run() takes as $2).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Script git-pull.sh has been removed in commit [1]. Use command
"request-pull" as an example of a shell script instead. Recently, many
of shell script commands have been re-written in C, so tweak the wording
of the sentence, while we're here.
[1]: b1456605c2 (pull: remove redirection to git-pull.sh, 2015-06-18)
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We try to delete the non-existing tag "anothertag", but for the
verifications, we check that the tag "myhead" doesn't exist. "myhead"
isn't used in this test except for this checking. Comparing to the test
two tests earlier, it looks like a copy-paste mistake.
Perhaps it's overkill to check that `git tag -d` didn't decide to
*create* a tag. But since we're trying to be this careful, let's
actually check the correct tag. While we're doing this, let's use a more
descriptive tag name instead -- "nonexistingtag" should be obvious.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For simplicity, we only implemented the `status` command without colors.
This patch starts adding color, matching what the Perl script
`git-add--interactive.perl` does.
Original-Patch-By: Daniel Ferreira <bnmvco@gmail.com>
Signed-off-by: Slavica Đukić <slawica92@hotmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This implements the `status` command of `git add -i`. The data
structures introduced in this commit will be extended later, as needed.
At this point, we re-implement only part of the `list_and_choose()`
function of the Perl script `git-add--interactive.perl` and call it
`list()`. It does not yet color anything, or do columns, or allow user
input.
Over the course of the next commits, we will introduce a
`list_and_choose()` function that uses `list()` to display the list of
options and let the user choose one or more of the displayed items. This
will be used to implement the main loop of the built-in `git add -i`, at
which point the new `status` command can actually be used.
Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
Signed-off-by: Slavica Đukić <slawica92@hotmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the diffstat interface (namely, the diffstat_t struct and
compute_diffstat) no longer be internal to diff.c and allow it to be used
by other parts of git.
This is helpful for code that may want to easily extract information
from files using the diff machinery, while flushing it differently from
how the show_* functions used by diff_flush() do it. One example is the
builtin implementation of git-add--interactive's status.
Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
Signed-off-by: Slavica Đukić <slawica92@hotmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unlike previous conversions to C, where we started with a built-in
helper, we start this conversion by adding an interception in the
`run_add_interactive()` function when the new opt-in
`add.interactive.useBuiltin` config knob is turned on (or the
corresponding environment variable `GIT_TEST_ADD_I_USE_BUILTIN`), and
calling the new internal API function `run_add_i()` that is implemented
directly in libgit.a.
At this point, the built-in version of `git add -i` only states that it
cannot do anything yet. In subsequent patches/patch series, the
`run_add_i()` function will gain more and more functionality, until it
is feature complete. The whole arc of the conversion can be found in the
PRs #170-175 at https://github.com/gitgitgadget/git.
The "--helper approach" can unfortunately not be used here: on Windows
we face the very specific problem that a `system()` call in
Perl seems to close `stdin` in the parent process when the spawned
process consumes even one character from `stdin`. Which prevents us from
implementing the main loop in C and still trying to hand off to the Perl
script.
The very real downside of the approach we have to take here is that the
test suite won't pass with `GIT_TEST_ADD_I_USE_BUILTIN=true` until the
conversion is complete (the `--helper` approach would have let it pass,
even at each of the incremental conversion steps).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 'do_apply_stash()' we refresh the index in the end. Since
34933d0eff ("stash: make sure to write refreshed cache", 2019-09-11),
we also write that refreshed index when --quiet is given to 'git stash
apply'.
However if '--index' is not given to 'git stash apply', we also
discard the index in the else clause just before. We need to do so
because we use an external 'git update-index --add --stdin', which
leads to an out of date in-core index.
Later we call 'refresh_and_write_cache', which now leads to writing
the discarded index, which means we essentially write an empty index
file. This is obviously not correct, or the behaviour the user
wanted. We should not modify the users index without being asked to
do so.
Make sure to re-read the index after discarding the current in-core
index, to avoid dealing with outdated information. Instead we could
also drop the 'discard_cache()' + 'read_cache()', however that would
make it easy to fall into the same trap as 34933d0eff did, so it's
better to avoid that.
We can also drop the 'refresh_and_write_cache' completely in the quiet
case. Previously in legacy stash we relied on 'git status' to refresh
the index after calling 'git read-tree' when '--index' was passed to
'git apply'. However the 'reset_tree()' call that replaced 'git
read-tree' always passes options that are equivalent to '-m', making
the refresh of the index unnecessary.
Reported-by: Grzegorz Rajchman <rayman17@gmail.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're pushing a pack and our local pack-objects fails, we enter an
error code path that does a few things:
1. Set the status of every ref to REF_STATUS_NONE
2. Call receive_unpack_status() to try to get an error report from
the other side
3. Return an error to the caller
If pack-objects failed because the connection to the server dropped,
there's not much more we can do than report the hangup. And indeed, step
2 will try to read a packet from the other side, which will die() in the
packet-reading code with "the remote end hung up unexpectedly".
But if the connection _didn't_ die, then the most common issue is that
the remote index-pack or unpack-objects complained about our pack (we
could also have a local pack-objects error, but this ends up being the
same; we'd send an incomplete pack and the remote side would complain).
In that case we do report the error from the other side (because of step
2), but we fail to say anything further about the refs. The issue is
two-fold:
- in step 1, the "NONE" status is not "we saw an error, so we have no
status". It generally means "this ref did not match our refspecs, so
we didn't try to push it". So when we print out the push status
table, we won't mention any refs at all!
But even if we had a status enum for "pack-objects error", we
wouldn't want to blindly set it for every ref. For example, in a
non-atomic push we might have rejected some refs already on the
client side (e.g., REF_STATUS_REJECT_NODELETE) and we'd want to
report that.
- in step 2, we read just the unpack status. But receive-pack will
also tell us about each ref (usually that it rejected them because
of the unpacker error).
So a much better strategy is to leave the ref status fields as they are
(usually EXPECTING_REPORT) and then actually receive (and print) the
full per-ref status.
This case is actually covered in the test suite, as t5504.8, which
writes a pack that will be rejected by the remote unpack-objects. But
it's racy. Because our pack is small, most of the time pack-objects
manages to write the whole thing before the remote rejects it, and so it
returns success and we print out the errors from the remote. But very
occasionally (or when run under --stress) it goes slow enough to see a
failure in writing, and git-push reports nothing for the refs.
With this patch, the test should behave consistently.
There shouldn't be any downside to this approach. If we really did see
the connection drop, we'd already die in receive_unpack_status(), and
we'll continue to do so. If the connection drops _after_ we get the
unpack status but before we see any ref status, we'll still print the
remote unpacker error, but will now say "remote end hung up" instead of
returning the error up the call-stack. But as discussed, we weren't
showing anything more useful than that with the current code. And
anyway, that case is quite unlikely (the connection dropping at that
point would have to be unrelated to the pack-objects error, because of
the ordering of events).
In the future we might want to handle packet-read errors ourself instead
of dying, which would print a full ref status table even for hangups.
But in the meantime, this patch should be a strict improvement.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the top of 't6120-describe.sh' an ASCII graph illustrates the
repository's history used in this test script. This graph is a bit
misleading, because it swapped the second merge commit's first and
second parents.
When describing/naming a commit it does make a difference which parent
is the first and which is the second/Nth, so update this graph to
accurately represent that second merge.
While at it, move this history graph from the 'test_description'
variable to a regular comment.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6462d5eb9a ("fetch: remove fetch_if_missing=0", 2019-11-08)
strove to remove the need for fetch_if_missing=0 from the fetching
mechanism, so it is plausible to attempt removing fetch_if_missing=0
from the lazy-fetching mechanism in promisor-remote as well.
But doing so reveals a bug - when the server does not send an object
pointed to by a tag object, an infinite loop occurs: Git attempts to
fetch the missing object, which causes a deferencing of all refs (for
negotiation), which causes a lazy fetch of that missing object, and so
on. This bug is because of unnecessary use of the fetch negotiator
during lazy fetching - it is not used after initialization, but it is
still initialized (which causes the dereferencing of all refs).
Thus, when the negotiator is not used during fetching, refrain from
initializing it. Then, remove fetch_if_missing from promisor-remote.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6462d5eb9a ("fetch: remove fetch_if_missing=0", 2019-11-08)
strove to remove the need for fetch_if_missing=0 from the fetching
mechanism, so it is plausible to attempt removing fetch_if_missing=0
from clone as well. But doing so reveals a bug - when the server does
not send an object directly pointed to by a ref, this should be an
error, not a trigger for a lazy fetch. (This case in the fetching
mechanism was covered by a test using "git clone", not "git fetch",
which is why the aforementioned commit didn't uncover the bug.)
The bug can be fixed by suppressing lazy-fetching during the
connectivity check. Fix this bug, and remove fetch_if_missing from
clone.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_options_dup() counts the number of elements in the given array
without the end marker, allocates enough memory to hold all of them plus
an end marker, then copies them and terminates the new array. The
counting part is done by advancing a pointer through the array, and the
original pointer is reconstructed using pointer subtraction before the
copy operation.
The function is also prepared to handle a NULL pointer passed to it.
None of its callers do that currently, but this feature was used by
46e91b663b ("checkout: split part of it to new command 'restore'",
2019-04-25); it seems worth keeping.
It ends up doing arithmetic on that NULL pointer, though, which is
undefined in standard C, when it tries to calculate "NULL - 0". Better
avoid doing that by remembering the originally given pointer value.
There is another issue, though. memcpy(3) does not support NULL
pointers, even for empty arrays. Use COPY_ARRAY instead, which does
support such empty arrays. Its call is also shorter and safer by
inferring the element type automatically.
Coccinelle and contrib/coccinelle/array.cocci did not propose to use
COPY_ARRAY because of the pointer subtraction and because the source is
const -- the semantic patch cautiously only considers pointers and array
references of the same type.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the macro COPY_ARRAY to copy array elements. The result is shorter
and safer, as it infers the element type automatically and does a (very)
basic type compatibility check for its first two arguments.
Coccinelle and contrib/coccinelle/array.cocci did not generate this
conversion due to the offset of 1 at both source and destination and
because the source is a const pointer; the semantic patch cautiously
handles only pure pointers and array references of the same type.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git commit-graph read' subcommand is used in test scripts to check
that the commit-graph contents match the expected data. Mostly, this
helps check the header information and the list of chunks. Users do not
need this information, so move the functionality to a test helper.
Reported-by: Bryan Turner <bturner@atlassian.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With './t1234-foo.sh -r 5,6' we can run only specific test cases in a
test script, but our test framwork still evaluates all lazy prereqs
that the excluded test cases might depend on. This is unnecessary and
produces verbose and trace output that can be distracting. This has
been an issue ever since the '-r|--run=' options were introduced in
0445e6f0a1 (test-lib: '--run' to run only specific tests, 2014-04-30),
because that commit added the check of the list of test cases
specified with '-r' after evaluating the prereqs.
Avoid this unnecessary prereq evaluation by checking the list of test
cases specified with '-r' before looking at the prereqs.
Note that GIT_SKIP_TESTS has always been checked before the prereqs,
so prereqs necessary for tests skipped that way were not evaluated.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git commands are placed in the upstream of a pipe, their return
codes are lost. In this particular case, it is especially bad since we
are testing the intricacies of `git log --graph` behavior and if we hit
an unexpected failure or segfault, we want to know this.
Extract the common output checking logic into check_graph() where we
redirect the output of git commands upstream of pipe into a file and
have sed read from that file so that git failures are detected.
This patch is best viewed with `--color-moved`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
3444ec2e ("fsmonitor: don't fill bitmap with entries to be removed",
2019-10-11) added a handful of sanity checks that make sure that a
bit position in fsmonitor bitmap does not go beyond the end of the
index. As each bit in the bitmap corresponds to a path in the
index, this is the right check most of the time.
Except for the case when we are in the split-index mode and looking
at a delta index that is to be overlayed on the base index but
before the base index has actually been merged in, namely in read_
and write_fsmonitor_extension(). In these codepaths, the entries in
the split/delta index is typically a small subset of the entire set
of paths (otherwise why would we be using split-index?), so the
bitmap used by the fsmonitor is almost always larger than the number
of entries in the partial index, and the incorrect comparison would
trigger the BUG().
Reported-by: Utsav Shah <ukshah2@illinois.edu>
Helped-by: Kevin Willford <Kevin.Willford@microsoft.com>
Helped-by: William Baker <William.Baker@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's only a single caller left of sha1_to_hex(), since everybody
that has an object name in "unsigned char[]" now uses hash_to_hex()
instead.
This case is in the sha1dc wrapper, where we print a hex sha1 when
we find a collision. This one will always be sha1, regardless of the
current hash algorithm, so we can't use hash_to_hex() here. In
practice we'd probably not be running sha1 at all if it isn't the
current algorithm, but it's possible we might still occasionally
need to compute a sha1 in a post-sha256 world.
Since sha1_to_hex() is just a wrapper for hash_to_hex_algop(), let's
call that ourselves. There's value in getting rid of the sha1-specific
wrapper to de-clutter the global namespace, and to make sure nobody uses
it (and as with sha1_to_hex_r() in the previous patch, we'll drop the
coccinelle transformations, too).
The sha1_to_hex() function is mentioned in a comment; we can easily
swap that out for oid_to_hex() to give a better example. Also
update the comment that was left stale when we added "struct
object_id *" as a way to name an object and added functions to
convert it to hex.
The function is also mentioned in some test vectors in t4100, but
that's not runnable code, so there's no point in trying to clean it
up.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2b9bd488ae ("completion: teach rebase to use __gitcomp_builtin",
2019-09-12), the completion script learned to complete rebase using
__gitcomp_builtin(). However, this resulted in `--onto=` being suggested
instead of `--onto `.
Before, when there was a space, we'd start a new word and, as a result,
fallback to __git_complete_refs() and `--onto` would be completed this
way. However, now we match the `--*` case which does not know how to
offer completions for refs.
Teach _git_rebase() to complete refs in the `--onto=` case so that we
fix this regression.
Reported-by: Paul Jolly <paul@myitcv.io>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch fixes an extreme slowdown in pack-objects when you have more
than 1023 packs. See below for numbers.
Since 43fa44fa3b (pack-objects: move in_pack out of struct object_entry,
2018-04-14), we use a complicated system to save some per-object memory.
Each object_entry structs gets a 10-bit field to store the index of the
pack it's in. We map those indices into pointers using
packing_data->in_pack_by_idx, which we initialize at the start of the
program. If we have 2^10 or more packs, then we instead create an array
of pack pointers, one per object. This is packing_data->in_pack.
So far so good. But there's one other tricky case: if a new pack arrives
after we've initialized in_pack_by_idx, it won't have an index yet. We
solve that by calling oe_map_new_pack(), which just switches on the fly
to the less-optimal in_pack mechanism, allocating the array and
back-filling it for already-seen objects.
But that logic kicks in even when we've switched to it already (whether
because we really did see a new pack, or because we had too many packs
in the first place). The result doesn't produce a wrong outcome, but
it's very slow. What happens is this:
- imagine you have a repo with 500k objects and 2000 packs that you
want to repack.
- before looking at any objects, we call prepare_in_pack_by_idx(). It
starts allocating an index for each pack. On the 1024th pack, it
sees there are too many, so it bails, leaving in_pack_by_idx as
NULL.
- while actually adding objects to the packing list, we call
oe_set_in_pack(), which checks whether the pack already has an
index. If it's one of the packs after the first 1023, then it
doesn't have one, and we'll call oe_map_new_pack().
But there's no useful work for that function to do. We're already
using in_pack, so it just uselessly walks over the complete list of
objects, trying to backfill in_pack.
And we end up doing this for almost 1000 packs (each of which may be
triggered by more than one object). And each time it triggers, we
may iterate over up to 500k objects. So in the absolute worst case,
this is quadratic in the number of objects.
The solution is simple: we don't need to bother checking whether the
pack has an index if we've already converted to using in_pack, since by
definition we're not going to use it. So we can just push the "does the
pack have a valid index" check down into that half of the conditional,
where we know we're going to use it.
The current test in p5303 sadly doesn't notice this problem, since it
maxes out at 1000 packs. If we add a new test to it at 2000 packs, it
does show the improvement:
Test HEAD^ HEAD
----------------------------------------------------------------------
5303.12: repack (2000) 26.72(39.68+0.67) 15.70(28.70+0.66) -41.2%
However, these many-pack test cases are rather expensive to run, so
adding larger and larger numbers isn't appealing. Instead, we can show
it off more easily by using GIT_TEST_FULL_IN_PACK_ARRAY, which forces us
into the absolute worst case: no pack has an index, so we'll trigger
oe_map_new_pack() pointlessly for every single object, making it truly
quadratic.
Here are the numbers (on git.git) with the included change to p5303:
Test HEAD^ HEAD
----------------------------------------------------------------------
5303.3: rev-list (1) 2.05(1.98+0.06) 2.06(1.99+0.06) +0.5%
5303.4: repack (1) 33.45(33.46+0.19) 2.75(2.73+0.22) -91.8%
5303.6: rev-list (50) 2.07(2.01+0.06) 2.06(2.01+0.05) -0.5%
5303.7: repack (50) 34.21(35.18+0.16) 3.49(4.50+0.12) -89.8%
5303.9: rev-list (1000) 2.87(2.78+0.08) 2.88(2.80+0.07) +0.3%
5303.10: repack (1000) 41.26(51.30+0.47) 10.75(20.75+0.44) -73.9%
Again, those improvements aren't realistic for the 1-pack case (because
in the real world, the full-array solution doesn't kick in), but it's
more useful to be testing the more-complicated code path.
While we're looking at this issue, we'll tweak one more thing: in
oe_map_new_pack(), we call REALLOC_ARRAY(pack->in_pack). But we'd never
expect to get here unless we're back-filling it for the first time, in
which case it would be NULL. So let's switch that to ALLOC_ARRAY() for
clarity, and add a BUG() to document the expectation. Unfortunately this
code isn't well-covered in the test suite because it's inherently racy
(it only kicks in if somebody else adds a new pack while we're in the
middle of repacking).
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When --preserve-merges was deprecated in 427c3bd28a a sentence
was introduced describing the difference between --rebase-merges and
--preserve-merges which is a little unclear and difficult to parse.
This patch improves readability while retaining original meaning.
Signed-off-by: Naveen Nathan <naveen@lastninja.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no callers left; everybody uses oid_to_hex_r() or
hash_to_hex_algop_r(). This used to actually be the underlying
implementation for oid_to_hex_r(), but that's no longer the case since
47edb64997 (hex: introduce functions to print arbitrary hashes,
2018-11-14).
Let's get rid of it to de-clutter and to make sure nobody uses it.
Likewise we can drop the coccinelle rules that mention it, since the
compiler will make it quite clear that the code does not work.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The message file will be used as commit message for the
git-{am,rebase} --continue.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During rebasing, old merge's message (encoded in old encoding)
will be used as message for new merge commit (created by rebase).
In case of the value of i18n.commitencoding has been changed after the
old merge time. We will receive an unusable message for this new merge.
Correct it.
This change also notice a breakage with git-rebase label system.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it possible for any of the git-bundle subcommands to include
options:
- before the sub-command
- after the sub-command, before the bundle filename
There is an immediate gain in support for help with all of the
sub-commands, where 'git bundle list-heads -h' previously returned an
error.
Downside here is an increase in code duplication that cannot be
trivially avoided short of shared global static options.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On fixup/squash-ing rebase, git will create new commit in
i18n.commitencoding, reencode the commit message to that said encode.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Keep revert/cherry-pick's todo list in line with rebase todo list.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On musl libc, ISO-2022-JP encoder is too eager to switch back to
1 byte encoding, musl's iconv always switch back after every combining
character. Comparing glibc and musl's output for this command
$ sed q t/t3900/ISO-2022-JP.txt| iconv -f ISO-2022-JP -t utf-8 |
iconv -f utf-8 -t ISO-2022-JP | xxd
glibc:
00000000: 1b24 4224 4f24 6c24 5224 5b24 551b 2842 .$B$O$l$R$[$U.(B
00000010: 0a .
musl:
00000000: 1b24 4224 4f1b 2842 1b24 4224 6c1b 2842 .$B$O.(B.$B$l.(B
00000010: 1b24 4224 521b 2842 1b24 4224 5b1b 2842 .$B$R.(B.$B$[.(B
00000020: 1b24 4224 551b 2842 0a .$B$U.(B.
Although musl iconv's output isn't optimal, it's still correct.
From commit 7d509878b8, ("pretty.c: format string with truncate respects
logOutputEncoding", 2014-05-21), we're encoding the message to utf-8
first, then format it and convert the message to the actual output
encoding on git commit --squash.
Thus, t3900::test_commit_autosquash_flags is failing on musl libc.
Reencode to utf-8 before arranging rebase's todo list.
By doing this, we also remove a breakage noticed by a test added in the
previous commit.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're using fixup!/squash! <subject> to mark if current commit will be
used to be fixed up or squashed to a previous commit.
However, if we're changing i18n.commitencoding after making the
original commit but before making the fixing up, we couldn't find the
original commit to do the fixup/squash.
Add a test to demonstrate that problem.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From commit 79444c9294, ("utf8: handle systems that don't write BOM for
UTF-16", 2019-02-12), we're supporting those systems with iconv that
omits BOM with:
make ICONV_OMITS_BOM=Yes
However, configure script wasn't taught to detect those systems.
Teach configure to do so.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git stash save" in a working tree that is sparsely checked out
mistakenly removed paths that are outside the area of interest.
* js/update-index-ignore-removal-for-skip-worktree:
stash: handle staged changes in skip-worktree files correctly
update-index: optionally leave skip-worktree entries alone
The custom format for "git log --format=<format>" learned the l/L
placeholder that is similar to e/E that fills in the e-mail
address, but only the local part on the left side of '@'.
* pb/pretty-email-without-domain-part:
pretty: add "%aL" etc. to show local-part of email addresses
t4203: use test-lib.sh definitions
t6006: use test-lib.sh definitions
Code clean-up and a bugfix in the logic used to tell worktree local
and repository global refs apart.
* sg/dir-trie-fixes:
path.c: don't call the match function without value in trie_find()
path.c: clarify two field names in 'struct common_dir'
path.c: mark 'logs/HEAD' in 'common_list' as file
path.c: clarify trie_find()'s in-code comment
Documentation: mention more worktree-specific exceptions
The code to generate multi-pack index learned to show (or not to
show) progress indicators.
* wb/midx-progress:
multi-pack-index: add [--[no-]progress] option.
midx: honor the MIDX_PROGRESS flag in midx_repack
midx: honor the MIDX_PROGRESS flag in verify_midx_file
midx: add progress to expire_midx_packs
midx: add progress to write_midx_file
midx: add MIDX_PROGRESS flag
When all files from some subdirectory were renamed to the root
directory, the directory rename heuristics would fail to detect that
as a rename/merge of the subdirectory to the root directory, which has
been corrected.
* en/merge-recursive-directory-rename-fixes:
t604[236]: do not run setup in separate tests
merge-recursive: fix merging a subdirectory into the root directory
merge-recursive: clean up get_renamed_dir_portion()
"git rebase --preserve-merges" has been marked as deprecated; this
release stops advertising it in the "git rebase -h" output.
* js/rebase-deprecate-preserve-merges:
rebase: hide --preserve-merges option
Move the definition of a set of bitmask constants from 0ctal
literal to (1U<<count) notation.
* hv/bitshift-constants-in-blame:
builtin/blame.c: constants into bit shift format
"git notes copy $original" ought to copy the notes attached to the
original object to HEAD, but a mistaken tightening to command line
parameter validation made earlier disabled that feature by mistake.
* dd/notes-copy-default-dst-to-head:
notes: fix minimum number of parameters to "copy" subcommand
t3301: test diagnose messages for too few/many paramters
"rebase -i" ceased to run post-commit hook by mistake in an earlier
update, which has been corrected.
* pw/post-commit-from-sequencer:
sequencer: run post-commit hook
move run_commit_hook() to libgit and use it there
sequencer.h fix placement of #endif
t3404: remove uneeded calls to set_fake_editor
t3404: set $EDITOR in subshell
t3404: remove unnecessary subshell
The branch description ("git branch --edit-description") has been
used to fill the body of the cover letters by the format-patch
command; this has been enhanced so that the subject can also be
filled.
* dl/format-patch-cover-from-desc:
format-patch: teach --cover-from-description option
format-patch: use enum variables
format-patch: replace erroneous and condition
Debugging support for lazy cloning has been a bit improved.
* jt/fetch-pack-record-refs-in-the-dot-promisor:
fetch-pack: write fetched refs to .promisor
Use skip_iprefix() to parse "UTF" case-insensitively instead of checking
with istarts_with(), building an upper-case version and then using
skip_prefix() on it. This gets rid of duplicate code and of a small
allocation.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of magic numbers by using skip_iprefix() and skip_prefix() for
parsing the leading "[uU][tT][fF]-?" of both strings instead of checking
with istarts_with() and an explicit comparison.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have several modules originally taken from some upstream source,
and which as far as I can tell we no longer update from the upstream
anymore. As such, I have not submitted these spelling fixes to any
external projects but just include them directly here.
Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply several spelling fixes that technically change what the tests are
executing, but do so in a way that is not tested and does not affect results
(e.g. modify the commit message to remove a typo, remove spelling mistakes
from refnames, etc.)
Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-shortlog, like git-log, supports options to filter what commits are
used to generate the log. These options come from git-rev-list, and are
documented in Documentation/rev-list-options.txt. Include those options
in shortlog's documentation.
But since rev-list-options.txt contains some other options that don't
really apply in the context of shortlog (like diff formatting, commit
ordering, etc), add a switch in rev-list-options.txt that excludes those
sections from the shortlog documentation. To be more specific, include
only the "Commit Limiting" and "History Simplification" sections.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In fetch_pack() (and all functions it calls), pass
OBJECT_INFO_SKIP_FETCH_OBJECT whenever we query an object that could be
a tree or blob that we do not want to be lazy-fetched even if it is
absent. Thus, the only lazy-fetches occurring for trees and blobs are
when resolving deltas.
Thus, we can remove fetch_if_missing=0 from builtin/fetch.c. Remove
this, and also add a test ensuring that such objects are not
lazy-fetched. (We might be able to remove fetch_if_missing=0 from other
places too, but I have limited myself to builtin/fetch.c in this commit
because I have not written tests for the other commands yet.)
Note that commits and tags may still be lazy-fetched. I limited myself
to objects that could be trees or blobs here because Git does not
support creating such commit- and tag-excluding clones yet, and even if
such a clone were manually created, Git does not have good support for
fetching a single commit (when fetching a commit, it and all its
ancestors would be sent).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add missing headers to prevent ill-effects from multiple inclusion.
Found by the LGTM source code analyzer.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
man 1p printf:
In addition to the escape sequences shown in the Base Definitions
volume of POSIX.1‐2008, Chapter 5, File Format Notation ('\\',
'\a', '\b', '\f', '\n', '\r', '\t', '\v'), "\ddd", where ddd is a
one, two, or three-digit octal number, shall be written as a byte
with the numeric value specified by the octal number.
printf '\xfe\xff' is an extension of some shell.
Dash, a popular yet simple shell, do not implement this extension.
This wasn't caught by most people running the tests, even though
common shells like dash don't handle hex escapes, because their
systems don't trigger the NO_UTF16_BOM prereq. But systems with musl
libc do; when combined with dash, the test fails.
Correct it.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Support for various porcelain commands will arrive via additional
patches.
`--pathspec-from-file` solves the problem of commandline length limit
for UIs built on top of git. Plumbing commands are not always a good
fit, for two major reasons:
1) Some UIs show executed commands to user. In this case, porcelain
commands are expected. One reason for that is letting user learn git
commands by clicking UI buttons. The other reason is letting user
study the history of commands in case of any unexpected results. Both
of these will lose most of their value if UI uses combinations of
arcane plumbing commands.
2) Some UIs have started and grown with porcelain commands. Replacing
existing logic with plumbing commands could be cumbersome and prone
to various new problems.
`--pathspec-from-file` will behave very close to pathspec passed in
commandline args, so that switching from one to another is simple.
`--pathspec-from-file` will read either a specified file or `stdin`
(when file is exactly "-"). Reading from file is a good way to avoid
competing for `stdin`, and also gives some extra flexibility.
`--pathspec-file-nul` switch mirrors `-z` already used in various
places. Some porcelain commands, such as `git commit`, already use
`-z`, therefore it needed a new unambiguous name.
New options do not have shorthands to avoid shorthand conflicts. It is
not expected that they will be typed in console.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix two test descriptions which stated "git -ls-files" when the actual
command being tested was "git ls-files".
Signed-off-by: Nathan Stocks <cleancut@github.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No substantive changes, just a few cosmetic changes:
* Indent steps of an individual test
* Don't have logic between the "test_expect_success" blocks that
the next block will depend upon, move it into the
test_expect_success section itself
* Fix spacing around redirection operators to match git style
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 93b980e58f (http: use xmalloc with cURL, 2019-08-15), we started to
ask cURL to use `xmalloc()`, and if compiled with nedmalloc, that means
implicitly a different allocator than the system one.
Which means that all of cURL's allocations and releases now _need_ to
use that allocator.
However, the `http_options()` function used `slist_append()` to add any
configured extra HTTP header(s) _before_ asking cURL to use `xmalloc()`,
and `http_cleanup()` would release them _afterwards_, i.e. in the
presence of custom allocators, cURL would attempt to use the wrong
allocator to release the memory.
A naïve attempt at fixing this would move the call to
`curl_global_init()` _before_ the config is parsed (i.e. before that
call to `slist_append()`).
However, that does not work, as we _also_ parse the config setting
`http.sslbackend` and if found, call `curl_global_sslset()` which *must*
be called before `curl_global_init()`, for details see:
https://curl.haxx.se/libcurl/c/curl_global_sslset.html
So let's instead make the config parsing entirely independent from
cURL's data structures. Incidentally, this deletes two more lines than
it introduces, which is nice.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running Git commands quickly -- such as in a shell script or the
test suite -- the Git commands frequently complete and start again
during the same second. The example fsmonitor hooks to integrate with
Watchman truncate the nanosecond times to seconds. In principle, this is
fine, as Watchman claims to use inclusive comparisons [1]. The result
should only be an over-representation of the changed paths since the
last Git command.
However, Watchman's own documentation claims "Using a timestamp is prone
to race conditions in understanding the complete state of the file tree"
[2]. All of their documented examples use a "clockspec" that looks like
'c:123:234'. Git should eventually learn how to store this type of
string to provide a stronger integration, but that will be a more
invasive change.
When using GIT_TEST_FSMONITOR="$(pwd)/t7519/fsmonitor-watchman", scripts
such as t7519-wtstatus.sh fail due to these race conditions. In fact,
running any test script with GIT_TEST_FSMONITOR pointing at
t/t7519/fsmonitor-wathcman will cause failures in the test_commit
function. The 'git add "$indir$file"' command fails due to not enough
time between the creation of '$file' and the 'git add' command.
For now, subtract one second from the timestamp we pass to Watchman.
This will make our window large enough to avoid these race conditions.
Increasing the window causes tests like t7519-wtstatus.sh to pass.
When the integration was introduced in def437671 (fsmonitor: add a
sample integration script for Watchman, 2018-09-22), the query included
an expression that would ignore files created and deleted in that
window. The performance reason for this change was to ignore temporary
files created by a build between Git commands. However, this causes
failures in script scenarios where Git is creating or deleting files
quickly.
When using GIT_TEST_FSMONITOR as before, t2203-add-intent.sh fails
due to this add-and-delete race condition.
By removing the "expression" from the Watchman query, we remove this
race condition. It will lead to some performance degradation in the case
of users creating and deleting temporary files inside their working
directory between Git commands. However, that is a cost we need to pay
to be correct.
[1] https://github.com/facebook/watchman/blob/master/query/since.cpp#L35-L39
[2] https://facebook.github.io/watchman/docs/clockspec.html
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The initialization function of the Trace2 performance format target sets
aside a stash of dots for indenting output. Get rid of it and use
strbuf_addchars() to provide dots on demand instead. This shortens the
code, gets rid of a small heap allocation and is a bit more efficient.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When both `fetch.jobs` and `fetch.writeCommitGraph` is set, we currently
try to write the commit graph in each of the concurrent fetch jobs,
which frequently leads to error messages like this one:
fatal: Unable to create '.../.git/objects/info/commit-graphs/commit-graph-chain.lock': File exists.
Let's avoid this by holding off from writing the commit graph until all
fetch jobs are done.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option overrides the config setting `fetch.writeCommitGraph`, if
both are set.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Finishing touches to the recent update to the build procedure for
the documentation.
* bc/doc-use-docbook-5:
manpage-bold-literal.xsl: match for namespaced "d:literal" in template
* https://github.com/prati0100/git-gui:
git-gui: improve Japanese translation
git-gui: add a readme
git-gui: support for diff3 conflict style
git-gui: use existing interface to query a path's attribute
git-gui (Windows): use git-bash.exe if it is available
treewide: correct several "up-to-date" to "up to date"
Fix build with core.autocrlf=true
The previous commit introduced --ignore-date flag to interactive
rebase, but the name is actually very vague in context of rebase -i
since there are two dates we can work with. Add an alias to convey
the precise purpose.
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rebase am already has this flag to "lie" about the author date
by changing it to the committer (current) date. Let's add the same
for interactive machinery.
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The purpose of amend_author was to free() the malloc()'d string
obtained from get_author() while amending a commit. But we can
also use the variable to free() the author at our convenience.
Rename it to convey this meaning.
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rebase am already has this flag to "lie" about the committer date
by changing it to the author date. Let's add the same for
interactive machinery.
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current callers of the read_author_script() function read name,
email and date from the author script. Allow callers to signal that
they are not interested in some among these three fields by passing
NULL.
Note that fields that are ignored still must exist and be formatted
correctly in the author script.
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two backends available for rebasing, viz, the am and the
interactive. Naturally, there shall be some features that are
implemented in one but not in the other. One such flag is
--ignore-whitespace which indicates merge mechanism to treat lines
with only whitespace changes as unchanged. Wire the interactive
rebase to also understand the --ignore-whitespace flag by
translating it to -Xignore-space-change.
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GitGitGadget, a handy tool for converting pull requests against Git into
Git-mailing-list-friendly-patch-emails, requires as anti-spam that all
new users be "/allow"ed by an existing user once before it will do
anything for that new user. While this tutorial explained that
mechanism, it did not give much hint on how to go about finding someone
to allow your new pull request. So, teach our new GitGitGadget user
where to look for someone who can add their name to the list.
The advice in this patch is based on the advice proposed for
GitGitGadget: https://github.com/gitgitgadget/gitgitgadget/pull/138
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Indicate that the user needs some dependencies before the build will run
happily on their machine; this dependency list doesn't seem to be made
clear anywhere else in the project documentation. Then, so the user can
be certain any build failures are due to their code and not their
environment, perform a build on a clean checkout of 'master'. Also, move
the note about build parallelization up here, so that it appears next to
the very first build invocation in the tutorial.
Reported-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users can discover commands and their brief usage by running 'git help
git' or 'git help -a'; both of these pages list all available commands
based on the contents of 'command-list.txt'. That means adding a new
command there is an important part of the new command process, and
therefore belongs in the new command tutorial.
Teach new users how to add their command, and include a brief overview
of how to discover which attributes to place on the command in the list.
Since 'git psuh' prints some workspace info, doesn't modify anything,
and is targeted as a user-facing porcelain command, list it as a
'mainporcelain' and 'info' command.
As the usage string is required to generate this documentation, don't
add the command to the list until after the usage string is added to the
tutorial.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling `git stash` while changes were staged for files that are
marked with the `skip-worktree` bit (e.g. files that are excluded in a
sparse checkout), the files are recorded as _deleted_ instead.
The reason is that `git stash` tries to construct the tree reflecting
the worktree essentially by copying the index to a temporary one and
then updating the files from the worktree. Crucially, it calls `git
diff-index` to update also those files that are in the HEAD but have
been unstaged in the index.
However, when the temporary index is updated via `git update-index --add
--remove`, skip-worktree entries mark the files as deleted by mistake.
Let's use the newly-introduced `--ignore-skip-worktree-entries` option
of `git update-index` to prevent exactly this from happening.
Note that the regression test case deliberately avoids replicating the
scenario described above and instead tries to recreate just the symptom.
Reported by Dan Thompson.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While `git update-index` mostly ignores paths referring to index entries
whose skip-worktree bit is set, in b4d1690df1 (Teach Git to respect
skip-worktree bit (reading part), 2009-08-20), for reasons that are not
entirely obvious, the `--remove` option was made special: it _does_
remove index entries even if their skip-worktree bit is set.
Seeing as this behavior has been in place for a decade now, it does not
make sense to change it.
However, in preparation for fixing a bug in `git stash` where it
pretends that skip-worktree entries have actually been removed, we need
a mode where `git update-index` leaves all skip-worktree entries alone,
even if the `--remove` option was passed.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The MSVC runtime behavior differs from glibc's with respect to
`fprintf(stderr, ...)` in that the former writes out the message
character by character.
In t5516, this leads to a funny problem where a `git fetch` process as
well as the `git upload-pack` process spawned by it _both_ call `die()`
at the same time. The output can look like this:
fatal: git uploadfata-lp: raemcokte :error: upload-pnot our arcef k6: n4ot our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9
8e86a771eda009872d6ab2886
Let's avoid this predicament altogether by rendering the entire message,
including the prefix and the trailing newline, into the buffer we
already have (and which is still fixed size) and then write it out via
`write_in_full()`.
We still clip the message to at most 4095 characters.
The history of `vreportf()` with regard to this issue includes the
following commits:
d048a96e (2007-11-09) - 'char msg[256]' is introduced to avoid interleaving
389d1767 (2009-03-25) - Buffer size increased to 1024 to avoid truncation
625a860c (2009-11-22) - Buffer size increased to 4096 to avoid truncation
f4c3edc0 (2015-08-11) - Buffer removed to avoid truncation
b5a9e435 (2017-01-11) - Reverts f4c3edc0 to be able to replace control
chars before sending to stderr
9ac13ec9 (2006-10-11) - Another attempt to solve interleaving.
This is seemingly related to d048a96e.
137a0d0e (2007-11-19) - Addresses out-of-order for display()
34df8aba (2009-03-10) - Switches xwrite() to fprintf() in recv_sideband()
to support UTF-8 emulation
eac14f89 (2012-01-14) - Removes the need for fprintf() for UTF-8 emulation,
so it's safe to use xwrite() again
5e5be9e2 (2016-06-28) - recv_sideband() uses xwrite() again
Note that we print nothing if the `vsnprintf()` call failed to render
the error message; There is little we can do in that case, and it should
not happen anyway.
The process may have written to `stderr` and there may be something left
in the buffer kept in the stdio layer. Call `fflush(stderr)` before
writing the message we prepare in this function.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As per Wikipedia, "In current technical usage, for one to state that a
feature is deprecated is merely a recommendation against using it." It
is thus contradictory to claim that something is not "officially
deprecated" and then to immediately state that we are both discouraging
its use and pointing people elsewhere.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We recently regressed our rendering with Asciidoctor of "literal"
elements in our manpages, i.e, stuff we have placed within `backticks`
in order to render as monospace. In particular, we lost the bold
rendering of such literal text.
The culprit is f6461b82b9 ("Documentation: fix build with Asciidoctor 2",
2019-09-15), where we switched from DocBook 4.5 to DocBook 5 with
Asciidoctor. As part of the switch, we started using the namespaced
DocBook XSLT stylesheets rather than the non-namespaced ones. (See
f6461b82b9 for more details on why we changed to the namespaced ones.)
The bold literals are implemented as an XSLT snippet <xsl:template
match="literal">...</xsl:template>. Now that we use namespaces, this
doesn't pick up our literals like it used to.
Match for "d:literal" in addition to just "literal", after defining the
d namespace. ("d" is what
http://docbook.sourceforge.net/release/xsl-ns/current/manpages/docbook.xsl
uses.) Note that we need to keep matching without the namespace for
AsciiDoc.
This boldness was introduced by 5121a6d993 ("Documentation: option to
render literal text as bold for manpages", 2009-03-27) and made the
default in 5945717009 ("Documentation: bold literals in man",
2016-05-31).
One reason this was not caught in review is that our doc-diff tool diffs
without any boldness, i.e., it "only" compares text. As pointed out by
Peff in review of this patch, one can use `MAN_KEEP_FORMATTING=1
./doc-diff <...>`
This has been optically tested with AsciiDoc 8.6.10, Asciidoctor 1.5.5
and Asciidoctor 2.0.10. I've also verified that doc-diff produces the
empty output for all three programs, as expected, and that with the
MAN_KEEP_FORMATTING trick, AsciiDoc yields no diff, whereas with
Asciidoctor, we get bold literals, just like we want.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Within diff_no_index(), we have the following:
revs->diffopt.flags.exit_with_status = 1;
...
/*
* The return code for --no-index imitates diff(1):
* 0 = no changes, 1 = changes, else error
*/
return diff_result_code(&revs->diffopt, 0);
Which means when `git diff` is run in `--no-index` mode, `--exit-code`
is implied. However, the documentation for this is missing in
git-diff.txt.
Add a note about how `--exit-code` is implied in the `--no-index`
documentation to cover this documentation blindspot.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a rather old bug in gitweb---incremental blame output in
javascript actions mode never worked.
* rl/gitweb-blame-prev-fix:
gitweb: correctly store previous rev in javascript-actions mode
Currently, in the event that a submodule's upstream URL changes, users
have to manually alter the URL in the .gitmodules file then run
`git submodule sync`. Let's make that process easier.
Teach submodule the set-url subcommand which will automatically change
the `submodule.$name.url` property in the .gitmodules file and then run
`git submodule sync` to complete the process.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comments for the staging/unstaging test did not accurately
describe the scenario being tested. It is not essential that
the test files being staged/unstaged appear at the end of the
index. All that is required is that the test files are not
flagged with CE_FSMONITOR_VALID and have a position in the
index greater than the number of entries in the index after
unstaging.
The comment for this test has been updated to be more
accurate with respect to the scenario that's being tested.
Signed-off-by: William Baker <William.Baker@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In many projects the number of contributors is low enough that users know
each other and the full email address doesn't need to be displayed.
Displaying only the author's username saves a lot of columns on the screen.
Existing 'e/E' (as in "%ae" and "%aE") placeholders would show the
author's address as "prarit@redhat.com", which would waste columns to show
the same domain-part for all contributors when used in a project internal
to redhat. Introduce 'l/L' placeholders that strip '@' and domain part from
the e-mail address.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"worktree add" internally calls "reset --hard", but if
submodule.recurse is set, reset tries to recurse into
initialized submodules, which makes start_command try to
cd into non-existing submodule paths and die.
Fix that by making sure that the call to reset in "worktree add"
does not recurse.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The guide "gitsubmodules" was added in d480345 (submodules: overhaul
documentation, 2017-06-22), but it was not added to
command-list.txt when commit 1b81d8c (help: use command-list.txt
for the source of guides, 2018-05-20) taught "git help" to obtain the
guide list from this file.
Add it now, and capitalize the first word of the description of
gitsubmodules, as was done in 1b81d8c (help: use command-list.txt
for the source of guides, 2018-05-20) for the other guides.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since worktrees were introduced, the `git_path()` function _really_
needed to be called e.g. to get at the path to `logs/HEAD` (`HEAD` is
specific to the worktree, and therefore so is its reflog). However, the
wrong path is returned for `logs/HEAD.lock`.
This does not matter as long as the Git executable is doing the asking,
as the path for that `logs/HEAD.lock` file is constructed from
`git_path("logs/HEAD")` by appending the `.lock` suffix.
However, Git GUI just learned to use `--git-path` instead of appending
relative paths to what `git rev-parse --git-dir` returns (and as a
consequence not only using the correct hooks directory, but also using
the correct paths in worktrees other than the main one). While it does
not seem as if Git GUI in particular is asking for `logs/HEAD.lock`,
let's be safe rather than sorry.
Side note: Git GUI _does_ ask for `index.lock`, but that is already
resolved correctly, due to `update_common_dir()` preferring to leave
unknown paths in the (worktree-specific) git directory.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this, you cannot use `--run=<...>` to skip that part, and a run
with `--run=0` (which is a common way to determine the test case number
corresponding to a given test case title).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The indent heuristic is our default diff heuristic since 33de716387
(diff: enable indent heuristic by default, 2017-05-08), but the usage
string of 'git blame' still mentions it as "experimental heuristic".
We could simply update the short help associated with the option, but
according to the comment above the option's declaration it was "only
included here to get included in the "-h" output". That made sense
while the feature was still experimental and we wanted to give it more
exposure, but nowadays it's unnecessary.
So let's rather remove the '--indent-heuristic' option from 'git
blame's usage string. Note that 'git blame' will still accept this
option, as it is parsed in parse_revision_opt().
Astute readers may notice that this patch removes a comment mentioning
"the following two options", but it only removes one option. The
reason is that the comment is outdated: that other options was
'--compaction-heuristic', and it has already been removed in
3cde4e02ee (diff: retire "compaction" heuristics, 2016-12-23), but
that commit forgot to update this comment.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin/clone.c has a static function dir_exists() that
checks if a given path exists on the filesystem. It returns
true (and it is correct for it to return true) when the
given path exists as a non-directory (e.g. a regular file).
This is confusing. What the caller wants to check, and what
this function wants to return, is if the path exists, so
rename it to path_exists().
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hotfix application example uses `git merge --no-commit` to apply
temporary changes to the working tree during a bisect operation. In some
situations this can be a fast-forward and `merge` will apply the hotfix
branch's commits regardless of `--no-commit` (as documented in the `git
merge` manual).
In the pathological case this will make a `git bisect run` invocation
loop indefinitely between the first bisect step and the fast-forwarded
post-merge HEAD.
Add `--no-ff` to the merge command to avoid this issue.
Signed-off-by: Mihail Atanassov <m.atanassov92@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git 2.24-rc1
* tag 'v2.24.0-rc1' of github.com:git/git:
Git 2.24-rc1
repo-settings: read an int for index.version
ci: fix GCC install in the Travis CI GCC OSX job
Eleventh batch
ci(osx): use new location of the `perforce` cask
t7419: change test_must_fail to ! for grep
t4014: make output-directory tests self-contained
ci(visual-studio): actually run the tests in parallel
ci(visual-studio): use strict compile flags, and optimization
userdiff: fix some corner cases in dts regex
test-progress: fix test failures on big-endian systems
completion: clarify installation instruction for zsh
grep: avoid leak of chartables in PCRE2
grep: make PCRE2 aware of custom allocator
grep: make PCRE1 aware of custom allocator
remote-curl: pass on atomic capability to remote side
diff-highlight: fix a whitespace nit
fsmonitor: don't fill bitmap with entries to be removed
We don't actually look at the tree struct in fsck_tree() beyond its oid
and type (which is obviously OBJ_TREE). Just taking an oid gives our
callers more flexibility to avoid creating a struct, and makes it clear
that we are fscking just what is in the buffer, not any pre-parsed bits
from the struct.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't actually look at the commit struct in fsck_commit() beyond its
oid and type (which is obviously OBJ_COMMIT). Just taking an oid gives
our callers more flexibility to avoid creating or parsing a struct, and
makes it clear that we are fscking just what is in the buffer, not any
pre-parsed bits from the struct.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't actually look at the tag struct in fsck_tag() beyond its oid
and type (which is obviously OBJ_TAG). Just taking an oid gives our
callers more flexibility to avoid creating or parsing a struct, and
makes it clear that we are fscking just what is in the buffer, not any
pre-parsed bits from the struct.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In fsck_commit() and fsck_tag(), we have local "oid" variables used for
parsing parent and tagged-object oids. Let's give these more specific
names in preparation for the functions taking an "oid" parameter for the
object we're checking.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We only need the oid and type to pass on to report(). Let's accept the
broken-out parameters to give our callers more flexibility.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only thing we do with the struct is pass its oid and type to
report(). We can just take those explicitly, which gives our callers
more flexibility.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since fsck_blob() no longer requires us to have a "struct blob", we
don't need to create one. Which also means we don't need to worry about
handling the case that lookup_blob() returns NULL (we'll still catch
wrongly-identified blobs when we read the actual object contents and
type from disk).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't actually need any information from the object struct except its
oid (and the type, of course, but that's implicitly OBJ_BLOB). This
gives our callers more flexibility to drop the object structs, too.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The report() function really only cares about the oid and type of the
object, not the full object struct. Let's convert it to take those two
items separately, which gives our callers more flexibility.
This makes some already-long lines even longer. I've mostly left them,
as our eventual goal is to shrink these down as we continue refactoring
(e.g., "&item->object" becomes "&item->object.oid, item->object.type",
but will eventually shrink down to "oid, type").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The skiplist is inherently an oidset, so we don't need a full object
struct. Let's take just the oid to give our callers more flexibility.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
None of the callbacks actually care about having a "struct object";
they're happy with just the oid and type information. So let's give
ourselves more flexibility to avoid having a "struct object" by just
passing the broken-down fields.
Note that the callback already takes a "type" field for the fsck message
type. We'll rename that to "msg_type" (and use "object_type" for the
object type) to make the distinction explicit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our printable_type() and describe_object() functions take whole object
structs, but they really only care about the oid and type. Let's take
those individually in order to give our callers more flexibility.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't actually care about having object structs; we only need to look
up decorations by oid. Let's accept this more limited form, which will
give our callers more flexibility.
Note that the decoration API we rely on uses object structs itself (even
though it only looks at their oids). We can solve this by switching to
a kh_oid_map (we could also use the hashmap oidmap, but it's more
awkward for the simple case of just storing a void pointer).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This isolates the implementation detail of using the decoration code to
our put/get functions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 90cf590f53 (fsck: optionally show more helpful info for broken
links, 2016-07-17) added a system for decorating objects with names. The
code is split across builtin/fsck.c (which gives the initial names) and
fsck.c (which adds to the names as it traverses the object graph). This
leads to some duplication, where both sites have near-identical
describe_object() functions (the difference being that the one in
builtin/fsck.c uses a circular array of buffers to allow multiple calls
in a single printf).
Let's provide a unified object_name API for fsck. That lets us drop the
duplication, as well as making the interface boundaries more clear
(which will let us refactor the implementation more in a future patch).
We'll leave describe_object() in builtin/fsck.c as a thin wrapper around
the new API, as it relies on a static global to make its many callers a
bit shorter.
We'll also convert the bare add_decoration() calls in builtin/fsck.c to
put_object_name(). This fixes two minor bugs:
1. We leak many small strings. add_decoration() has a last-one-wins
approach: it updates the decoration to the new string and returns
the old one. But we ignore the return value, leaking the old
string. This is quite common to trigger, since we look at reflogs:
the tip of any ref will be described both by looking at the actual
ref, as well as the latest reflog entry. So we'd always end up
leaking one of those strings.
2. The last-one-wins approach gives us lousy names. For instance, we
first look at all of the refs, and then all of the reflogs. So
rather than seeing "refs/heads/master", we're likely to overwrite
it with "HEAD@{12345678}". We're generally better off using the
first name we find.
And indeed, the test in t1450 expects this ugly HEAD@{} name. After
this patch, we've switched to using fsck_put_object_name()'s
first-one-wins semantics, and we output the more human-friendly
"refs/tags/julius" (and the test is updated accordingly).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fsck_object() function takes in a buffer, but also a "struct
object". The rules for using these vary between types:
- for a commit, we'll use the provided buffer; if it's NULL, we'll
fall back to get_commit_buffer(), which loads from either an
in-memory cache or from disk. If the latter fails, we'd die(), which
is non-ideal for fsck.
- for a tag, a NULL buffer will fall back to loading the object from
disk (and failure would lead to an fsck error)
- for a tree, we _never_ look at the provided buffer, and always use
tree->buffer
- for a blob, we usually don't look at the buffer at all, unless it
has been marked as a .gitmodule file. In that case we check the
buffer given to us, or assume a NULL buffer is a very large blob
(and complain about it)
This is much more complex than it needs to be. It turns out that nobody
ever feeds a NULL buffer that isn't a blob:
- git-fsck calls fsck_object() only from fsck_obj(). That in turn is
called by one of:
- fsck_obj_buffer(), which is a callback to verify_pack(), which
unpacks everything except large blobs into a buffer (see
pack-check.c, lines 131-141).
- fsck_loose(), which hits a BUG() on non-blobs with a NULL buffer
(builtin/fsck.c, lines 639-640)
And in either case, we'll have just called parse_object_buffer()
anyway, which would segfault on a NULL buffer for commits or tags
(not for trees, but it would install a NULL tree->buffer which would
later cause a segfault)
- git-index-pack asserts that the buffer is non-NULL unless the object
is a blob (see builtin/index-pack.c, line 832)
- git-unpack-objects always writes a non-NULL buffer into its
obj_buffer hash, which is then fed to fsck_object(). (There is
actually a funny thing here where it does not store blob buffers at
all, nor does it call fsck on them; it does check any needed blobs
via fsck_finish() though).
Let's make the rules simpler, which reduces the amount of code and gives
us more flexibility in refactoring the fsck code. The new rules are:
- only blobs are allowed to pass a NULL buffer
- we always use the provided buffer, never pulling information from
the object struct
We don't have to adjust any callers, because they were already adhering
to these. Note that we do drop a few fsck identifiers for missing tags,
but that was all dead code (because nobody passed a NULL tag buffer).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Way back in 92d4c85d24 (fsck-cache: fix SIGSEGV on bad tag object,
2005-05-03), we added an fsck check that the "tagged" field of a tag
struct isn't NULL. But that was mainly protecting the printing code for
"--tags", and that code wasn't moved along with the check as part of
ba002f3b28 (builtin-fsck: move common object checking code to fsck.c,
2008-02-25).
It could also serve to detect type mismatch problems (where a tag points
to object X as a commit, but really X is a blob), but it couldn't do so
reliably (we'd call lookup_commit(X), but it will only notice the
problem if we happen to have previously called lookup_blob(X) in the
same process). And as of a commit earlier in this series, we'd consider
that a parse error and complain about the object even before getting to
this point anyway.
So let's drop this "tag->tagged" check. It's not helping anything, and
getting rid of it makes the function conceptually cleaner, as it really
is just checking the buffer we feed it. In fact, we can get rid of our
one-line wrapper and just unify fsck_tag() and fsck_tag_buffer().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 4516338243 (builtin-fsck: reports missing parent commits,
2008-02-25), we added code to check that fsck found the same number of
parents from parsing the commit itself as we see in the commit struct we
got from parse_commit_buffer(). Back then the rationale was that the
normal commit parser might skip some bad parents.
But earlier in this series, we started treating that reliably as a
parsing error, meaning that we'd complain about it before we even hit
the code in fsck.c.
Let's drop this code, which now makes fsck_commit_buffer() completely
independent of any parsed values in the commit struct (that's
conceptually cleaner, and also opens up more refactoring options).
Note that we can also drop the MISSING_PARENT and MISSING_GRAFT fsck
identifiers. This is no loss, as these would not trigger reliably
anyway. We'd hit them only when lookup_commit() failed, which occurs
only if we happen to have seen the object with another type already in
the same process. In most cases, we'd actually run into the problem
during the connectivity walk, not here.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check in fsck_commit_buffer() that commit->tree isn't NULL, which in
turn generally comes from a previous parse by parse_commit(). But this
isn't really accomplishing anything. The two things we might care about
are:
- was there a syntactically valid "tree <oid>" line in the object? But
we've just done our own parse in fsck_commit_buffer() to check this.
- does it point to a valid tree object? But checking the "tree"
pointer here doesn't actually accomplish that; it just shows that
lookup_tree() didn't return NULL, which only means that we haven't
yet seen that oid as a non-tree in this process.
A real connectivity check would exhaustively walk all graph links,
and we do that already in a separate function.
So this code isn't helping anything. And it makes the fsck code slightly
more confusing and rigid (e.g., it requires that any commit structs have
already been parsed). Let's drop it.
As a bit of history, the presence of this code looks like a leftover
from early fsck code (which did rely on parse_commit() to do most of the
parsing). The check comes from ff5ebe39b0 (Port fsck-cache to use
parsing functions, 2005-04-18), but we later added an explicit walk in
355885d531 (add generic, type aware object chain walker, 2008-02-25).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we can't parse a commit, then parse_commit() will return an error
code. But it _also_ sets the "parsed" flag, which tells us not to bother
trying to re-parse the object. That means that subsequent parses have no
idea that the information in the struct may be bogus. I.e., doing this:
parse_commit(commit);
...
if (parse_commit(commit) < 0)
die("commit is broken");
will never trigger the die(). The second parse_commit() will see the
"parsed" flag and quietly return success.
There are two obvious ways to fix this:
1. Stop setting "parsed" until we've successfully parsed.
2. Keep a second "corrupt" flag to indicate that we saw an error (and
when the parsed flag is set, return 0/-1 depending on the corrupt
flag).
This patch does option 1. The obvious downside versus option 2 is that
we might continually re-parse a broken object. But in practice,
corruption like this is rare, and we typically die() or return an error
in the caller. So it's OK not to worry about optimizing for corruption.
And it's much simpler: we don't need to use an extra bit in the object
struct, and callers which check the "parsed" flag don't need to learn
about the corrupt bit, too.
There's no new test here, because this case is already covered in t5318.
Note that we do need to update the expected message there, because we
now detect the problem in the return from "parse_commit()", and not with
a separate check for a NULL tree. In fact, we can now ditch that
explicit tree check entirely, as we're covered robustly by this change
(and the previous recent change to treat a NULL tree as a parse error).
We'll also give tags the same treatment. I don't know offhand of any
cases where the problem can be triggered (it implies somebody ignoring a
parse error earlier in the process), but consistently returning an error
should cause the least surprise.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first line in 'git commit-graph's usage string indicates that this
command can be invoked without specifying a subcommand. However, this
is not the case:
$ git commit-graph
usage: git commit-graph [--object-dir <objdir>]
or: git commit-graph read [--object-dir <objdir>]
[...]
$ echo $?
129
Remove this line from the usage string.
The synopsis in the manpage doesn't contain this line.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test produces pseudo-collisions and tests git diff's behavior with
them, and is therefore sensitive to the hash in use. Update the test to
compute the collisions for both SHA-1 and SHA-256 using appropriate
constants. Move the heredocs inside the setup block so that all of the
setup code can be tested for failure.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compute several object IDs that exist in expected output, since we don't
care about the specific object IDs, only that the format of the output
is syntactically correct.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes. Move some expected result heredocs around so
that they can use computed variables.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding the length of an object ID, look this value up
using the translation tables. Similarly, compute input data for invalid
submodule entries using the tables as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test passes successfully with SHA-256, so remove the annotation
which limits it to SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A repository using a hash other than SHA-1 will need to have an
extension in the config file. Ignore any extensions when comparing
config files, since they don't usefully contribute to the goal of the
test.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an option to print the object format used for input, output, or
storage. This allows shell scripts to discover the hash algorithm in
use.
Since the transition plan allows for multiple input algorithms, document
that we may provide multiple results for input, and the format that the
results may take. While we don't support this now, documenting it early
means that script authors can future-proof their scripts for when we do.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this change, the setting
$feature{'javascript-actions'}{'default'} = [1];
in gitweb.conf breaks gitweb's blame page: clicking on line numbers
displayed in the second column on the page has no effect.
For comparison, with javascript-actions disabled, clicking on line
numbers loads the previous version of the line.
Addresses https://bugs.debian.org/741883.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Robert Luberda <robert@debian.org>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit includes a failing test for an issue around
fetch.writeCommitGraph and fetching in a repo with a submodule. Here, we
fix that bug and set the test to "test_expect_success".
The problem arises with this set of commands when the remote repo at
<url> has a submodule. Note that --recurse-submodules is not needed to
demonstrate the bug.
$ git clone <url> test
$ cd test
$ git -c fetch.writeCommitGraph=true fetch origin
Computing commit graph generation numbers: 100% (12/12), done.
BUG: commit-graph.c:886: missing parent <hash1> for commit <hash2>
Aborted (core dumped)
As an initial fix, I converted the code in builtin/fetch.c that calls
write_commit_graph_reachable() to instead launch a "git commit-graph
write --reachable --split" process. That code worked, but is not how we
want the feature to work long-term.
That test did demonstrate that the issue must be something to do with
internal state of the 'git fetch' process.
The write_commit_graph() method in commit-graph.c ensures the commits we
plan to write are "closed under reachability" using close_reachable().
This method walks from the input commits, and uses the UNINTERESTING
flag to mark which commits have already been visited. This allows the
walk to take O(N) time, where N is the number of commits, instead of
O(P) time, where P is the number of paths. (The number of paths can be
exponential in the number of commits.)
However, the UNINTERESTING flag is used in lots of places in the
codebase. This flag usually means some barrier to stop a commit walk,
such as in revision-walking to compare histories. It is not often
cleared after the walk completes because the starting points of those
walks do not have the UNINTERESTING flag, and clear_commit_marks() would
stop immediately.
This is happening during a 'git fetch' call with a remote. The fetch
negotiation is comparing the remote refs with the local refs and marking
some commits as UNINTERESTING.
I tested running clear_commit_marks_many() to clear the UNINTERESTING
flag inside close_reachable(), but the tips did not have the flag, so
that did nothing.
It turns out that the calculate_changed_submodule_paths() method is at
fault. Thanks, Peff, for pointing out this detail! More specifically,
for each submodule, the collect_changed_submodules() runs a revision
walk to essentially do file-history on the list of submodules. That
revision walk marks commits UNININTERESTING if they are simplified away
by not changing the submodule.
Instead, I finally arrived on the conclusion that I should use a flag
that is not used in any other part of the code. In commit-reach.c, a
number of flags were defined for commit walk algorithms. The REACHABLE
flag seemed like it made the most sense, and it seems it was not
actually used in the file. The REACHABLE flag was used in early versions
of commit-reach.c, but was removed by 4fbcca4 (commit-reach: make
can_all_from_reach... linear, 2018-07-20).
Add the REACHABLE flag to commit-graph.c and use it instead of
UNINTERESTING in close_reachable(). This fixes the bug in manual
testing.
Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Szeder Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While dogfooding, Johannes found a bug in the fetch.writeCommitGraph
config behavior. His example initially happened during a clone with
--recurse-submodules, we found that this happens with the first fetch
after cloning a repository that contains a submodule:
$ git clone <url> test
$ cd test
$ git -c fetch.writeCommitGraph=true fetch origin
Computing commit graph generation numbers: 100% (12/12), done.
BUG: commit-graph.c:886: missing parent <hash1> for commit <hash2>
Aborted (core dumped)
In the repo I had cloned, there were really 60 commits to scan, but
only 12 were in the list to write when calling
compute_generation_numbers(). A commit in the list expects to see a
parent, but that parent is not in the list.
A follow-up will fix the bug, but first we create a test that
demonstrates the problem. This test must be careful about an existing
commit-graph file, since GIT_TEST_COMMIT_GRAPH=1 will cause the repo we
are cloning to already have one. This then prevents the incremtnal
commit-graph write during the first 'git fetch'.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Szeder Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove empty and redundant documentation files from the
Documentation/technical/ directory.
The empty doc files included only TODO messages with no documentation for
years. Instead an approach is being taken to keep all doc beside the code
in the relevant header files.
Having empty doc files is confusing and disappointing to anybody looking
for information, besides having the documentation in header files makes it
easier for developers to find the information they are looking for.
Some of the content which could have gone here already exists elsewhere:
- api-object-access.txt -> sha1-file.c and object.h have some details.
- api-quote.txt -> quote.h has some details.
- api-xdiff-interface.txt -> xdiff-interface.h has some details.
- api-grep.txt -> grep.h does not have enough documentation at the moment.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The codepath that reads the index.version configuration was broken
with a recent update, which has been corrected.
* ds/feature-macros:
repo-settings: read an int for index.version
Update installation procedure for Perforce on MacOS in the CI jobs
running on Azure pipelines, which was failing.
* js/azure-ci-osx-fix:
ci(osx): use new location of the `perforce` cask
Suppose, from a repository that has ".gitmodules", we clone with
--filter=blob:none:
git clone --filter=blob:none --no-checkout \
https://kernel.googlesource.com/pub/scm/git/git
Then we fetch:
git -C git fetch
This will cause a "unable to load config blob object", because the
fetch_config_from_gitmodules() invocation in cmd_fetch() will attempt to
load ".gitmodules" (which Git knows to exist because the client has the
tree of HEAD) while fetch_if_missing is set to 0.
fetch_if_missing is set to 0 too early - ".gitmodules" here should be
lazily fetched. Git must set fetch_if_missing to 0 before the fetch
because as part of the fetch, packfile negotiation happens (and we do
not want to fetch any missing objects when checking existence of
objects), but we do not need to set it so early. Move the setting of
fetch_if_missing to the earliest possible point in cmd_fetch(), right
before any fetching happens.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several config options were combined into a repo_settings struct in
ds/feature-macros, including a move of the "index.version" config
setting in 7211b9e (repo-settings: consolidate some config settings,
2019-08-13).
Unfortunately, that file looked like a lot of boilerplate and what is
clearly a factor of copy-paste overload, the config setting is parsed
with repo_config_ge_bool() instead of repo_config_get_int(). This means
that a setting "index.version=4" would not register correctly and would
revert to the default version of 3.
I caught this while incorporating v2.24.0-rc0 into the VFS for Git
codebase, where we really care that the index is in version 4.
This was not caught by the codebase because the version checks placed
in t1600-index.sh did not test the "basic" scenario enough. Here, we
modify the test to include these normal settings to not be overridden by
features.manyFiles or GIT_INDEX_VERSION. While the "default" version is
3, this is demoted to version 2 in do_write_index() when not necessary.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, when doing a 3-way merge, the merge.conflictStyle option was not
respected and the "merge" style was always used, even if "diff3" was
specified.
Call git_xmerge_config() at the end of git_apply_config() so that the
merge.conflictStyle config is read.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, apply does not respect the merge.conflictStyle setting.
Demonstrate this by making the 'apply with --3way' test case generic and
extending it to show that the configuration of
merge.conflictStyle = diff3 causes a breakage.
Change print_sanitized_conflicted_diff() to also sanitize `|||||||`
conflict markers.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since `git config` leaves the configurations set even after the test
case completes, use `test_config` instead so that the configurations are
reset once the test case finishes.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, the output of `git diff HEAD` would always be piped to
sanitize_conflicted_diff(). However, since the Git command was upstream
of the pipe, in case the Git command fails, the return code would be
lost. Rewrite into separate statements so that the return code is no
longer lost.
Since only the command `git diff HEAD` was being piped to
sanitize_conflicted_diff(), move the command into the function and rename
it to print_sanitized_conflicted_diff().
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the locally defined create_file() duplicates the functionality of
the test_write_lines() helper function, remove create_file() and replace
all instances with test_write_lines(). While we're at it, move
redirection operators to the end of the command which is the more
conventional place to put it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few days ago Travis CI updated their existing OSX images, including
the Homebrew database in the xcode10.1 OSX image that we use. Since
then installing dependencies in the 'osx-gcc' job fails when it tries
to link gcc@8:
+ brew link gcc@8
Error: No such keg: /usr/local/Cellar/gcc@8
GCC8 is still installed but not linked to '/usr/local' in the updated
image, as it was before this update, but now we have to link it by
running 'brew link gcc'. So let's do that then, and fall back to
linking gcc@8 if it doesn't, just to be sure.
Our builds on Azure Pipelines are unaffected by this issue. The OSX
image over there doesn't contain the gcc@8 package, so we have to
'brew install' it, which already takes care of linking it to
'/usr/local'. After that the 'brew link gcc' command added by this
patch fails, but the ||-chained fallback 'brew link gcc@8' command
succeeds with an "already linked" warning.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation from Documentation/technical/api-config.txt into
config.h as it's easier for the developers to find the usage information
beside the code instead of looking for it in another doc file, also
documentation/technical/api-config.txt is removed because the information
it has is now redundant and it'll be hard to keep it up to date and
syncronized with the documentation in config.h
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leakfix.
* cb/pcre2-chartables-leakfix:
grep: avoid leak of chartables in PCRE2
grep: make PCRE2 aware of custom allocator
grep: make PCRE1 aware of custom allocator
The atomic push over smart HTTP transport did not work, which has
been corrected.
* bc/smart-http-atomic-push:
remote-curl: pass on atomic capability to remote side
The installation instruction for zsh completion script (in
contrib/) has been a bit improved.
* mb/clarify-zsh-completion-doc:
completion: clarify installation instruction for zsh
'logs/refs' is not a working tree-specific path, but since commit
b9317d55a3 (Make sure refs/rewritten/ is per-worktree, 2019-03-07)
'git rev-parse --git-path' has been returning a bogus path if a
trailing '/' is present:
$ git -C WT/ rev-parse --git-path logs/refs --git-path logs/refs/
/home/szeder/src/git/.git/logs/refs
/home/szeder/src/git/.git/worktrees/WT/logs/refs/
We use a trie data structure to efficiently decide whether a path
belongs to the common dir or is working tree-specific. As it happens
b9317d55a3 triggered a bug that is as old as the trie implementation
itself, added in 4e09cf2acf (path: optimize common dir checking,
2015-08-31).
- According to the comment describing trie_find(), it should only
call the given match function 'fn' for a "/-or-\0-terminated
prefix of the key for which the trie contains a value". This is
not true: there are three places where trie_find() calls the match
function, but one of them is missing the check for value's
existence.
- b9317d55a3 added two new keys to the trie: 'logs/refs/rewritten'
and 'logs/refs/worktree', next to the already existing
'logs/refs/bisect'. This resulted in a trie node with the path
'logs/refs/', which didn't exist before, and which doesn't have a
value attached. A query for 'logs/refs/' finds this node and then
hits that one callsite of the match function which doesn't check
for the value's existence, and thus invokes the match function
with NULL as value.
- When the match function check_common() is invoked with a NULL
value, it returns 0, which indicates that the queried path doesn't
belong to the common directory, ultimately resulting the bogus
path shown above.
Add the missing condition to trie_find() so it will never invoke the
match function with a non-existing value. check_common() will then no
longer have to check that it got a non-NULL value, so remove that
condition.
I believe that there are no other paths that could cause similar bogus
output. AFAICT the only other key resulting in the match function
being called with a NULL value is 'co' (because of the keys 'common'
and 'config'). However, as they are not in a directory that belongs
to the common directory the resulting working tree-specific path is
expected.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An array of 'struct common_dir' instances is used to specify whether
various paths in $GIT_DIR are specific to a worktree, or are common,
i.e. belong to main worktree. The names of two fields in this
struct are somewhat confusing or ambigious:
- The path is recorded in the struct's 'dirname' field, even though
several entries are regular files e.g. 'gc.pid', 'packed-refs',
etc.
Rename this field to 'path' to reduce confusion.
- The field 'exclude' tells whether the path is excluded... from
where? Excluded from the common dir or from the worktree? It
means the former, but it's ambigious.
Rename this field to 'is_common' to make it unambigious what it
means. This, however, means the exact opposite of what 'exclude'
meant, so we have to negate the field's value in all entries as
well.
The diff is best viewed with '--color-words'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'logs/HEAD', i.e. HEAD's reflog, is a file, but its entry in
'common_list' has the 'is_dir' bit set.
Unset that bit to make it consistent with what 'logs/HEAD' is supposed
to be.
This doesn't make a difference in behavior: check_common() is the only
function that looks at the 'is_dir' bit, and that function either
returns 0, or '!exclude', which for 'logs/HEAD' results in 0 as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A fairly long comment describes trie_find()'s behavior and shows
examples, but it's slightly incomplete/inaccurate. Update this
comment to specify how trie_find() handles a negative return value
from the given match function.
Furthermore, update the list of examples to include not only two but
three levels of path components. This makes the examples slightly
more complicated, but it can illustrate the behavior in more corner
cases.
Finally, basically everything refers to the data stored for a key as
"value", with two confusing exceptions:
- The type definition of the match function calls its corresponding
parameter 'data'.
Rename that parameter to 'value'. (check_common(), the only
function of this type already calls it 'value').
- The table of examples above trie_find() has a "val from node"
column, which has nothing to do with the value stored in the trie:
it's a "prefix of the key for which the trie contains a value"
that led to that node.
Rename that column header to "prefix to node".
Note that neither the original nor the updated description and
examples correspond 100% to the current implementation, because the
implementation is a bit buggy, but the comment describes the desired
behavior. The bug will be fixed in the last patch of this series.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a directory in $GIT_DIR is overridden when $GIT_COMMON_DIR is set,
then usually all paths within that directory are overridden as well.
There are a couple of exceptions, though, and two of them, namely
'refs/rewritten' and 'logs/HEAD' are not mentioned in
'gitrepository-layout'. Document them as well.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the --[no-]progress option to git multi-pack-index.
Pass the MIDX_PROGRESS flag to the subcommand functions
when progress should be displayed by multi-pack-index.
The progress feature was added to 'verify' in 144d703
("multi-pack-index: report progress during 'verify'", 2018-09-13)
but some subcommands were not updated to display progress, and
the ability to opt-out was overlooked.
Signed-off-by: William Baker <William.Baker@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update midx_repack to only display progress when
the MIDX_PROGRESS flag is set.
Signed-off-by: William Baker <William.Baker@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update verify_midx_file to only display progress when
the MIDX_PROGRESS flag is set.
Signed-off-by: William Baker <William.Baker@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add progress to expire_midx_packs. Progress is
displayed when the MIDX_PROGRESS flag is set.
Signed-off-by: William Baker <William.Baker@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add progress to write_midx_file. Progress is displayed
when the MIDX_PROGRESS flag is set.
Signed-off-by: William Baker <William.Baker@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the MIDX_PROGRESS flag and update the
write|verify|expire|repack functions in midx.h
to accept a flags parameter. The MIDX_PROGRESS
flag indicates whether the caller of the function
would like progress information to be displayed.
This patch only changes the method prototypes
and does not change the functionality. The
functionality change will be handled by a later patch.
Signed-off-by: William Baker <William.Baker@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Azure Pipelines builds are failing for macOS due to a change in the
location of the perforce cask. The command outputs the following error:
+ brew install caskroom/cask/perforce
Error: caskroom/cask was moved. Tap homebrew/cask-cask instead.
So let's try to call `brew cask install perforce` first (which is what
that error message suggests, in a most round-about way).
Prior to 672f51cb we used to install the 'perforce' package with 'brew
install perforce' (note: no 'cask' in there). The justification for
672f51cb was that the command 'brew install perforce' simply stopped
working, after Homebrew folks decided that it's better to move the
'perforce' package to a "cask". Their justification for this move was
that 'brew install perforce' "can fail due to a checksum mismatch ...",
and casks can be installed without checksum verification. And indeed,
both 'brew cask install perforce' and 'brew install
caskroom/cask/perforce' printed something along the lines of:
==> No checksum defined for Cask perforce, skipping verification
It is unclear why 672f51cb used 'brew install caskroom/cask/perforce'
instead of 'brew cask install perforce'. It appears (by running both
commands on old Travis CI macOS images) that both commands worked all
the same already back then.
In any case, as the error message at the top of this commit message
shows, 'brew install caskroom/cask/perforce' has stopped working
recently, but 'brew cask install perforce' still does, so let's use
that.
CI servers are typically fresh virtual machines, but not always. To
accommodate for that, let's try harder if `brew cask install perforce`
fails, by specifically pulling the latest `master` of the
`homebrew-cask` repository.
This will still fail, of course, when `homebrew-cask` falls behind
Perforce's release schedule. But once it is updated, we can now simply
re-run the failed jobs and they will pick up that update.
As for updating `homebrew-cask`: the beginnings of automating this in
https://dev.azure.com/gitgitgadget/git/_build?definitionId=11&_a=summary
will be finished once the next Perforce upgrade comes around.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Transform the setup "tests" to setup functions, and have the actual
tests call the setup functions. Advantages:
* Should make life easier for people working with webby CI/PR builds
who have to abuse mice (and their own index finger as well) in
order to switch from viewing one testcase to another. Sounds
awful; hopefully this will improve things for them.
* Improves re-runnability: any failed test in any of these three
files can now be re-run in isolation, e.g.
./t6042* --ver --imm -x --run=21
whereas before it would require two tests to be specified to the
--run argument, the other needing to be picked out as the relevant
setup test from one or two tests before.
* Importantly, this still keeps the "setup" and "test" sections
somewhat separate to make it easier for readers to discern what is
just ancillary setup and what the intent of the test is.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We allow renaming all entries in e.g. a directory named z/ into a
directory named y/ to be detected as a z/ -> y/ rename, so that if the
other side of history adds any files to the directory z/ in the mean
time, we can provide the hint that they should be moved to y/.
There is no reason to not allow 'y/' to be the root directory, but the
code did not handle that case correctly. Add a testcase and the
necessary special checks to support this case.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dscho noted a few things making this function hard to follow.
Restructure it a bit and add comments to make it easier to follow. The
restructurings include:
* There was a special case if-check at the end of the function
checking whether someone just renamed a file within its original
directory, meaning that there could be no directory rename involved.
That check was slightly convoluted; it could be done in a more
straightforward fashion earlier in the function, and can be done
more cheaply too (no call to strncmp).
* The conditions for advancing end_of_old and end_of_new before
calling strchr were both confusing and unnecessary. If either
points at a '/', then they need to be advanced in order to find the
next '/'. If either doesn't point at a '/', then advancing them one
char before calling strchr() doesn't hurt. So, just rip out the
if conditions and advance both before calling strchr().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing wording gives an impression that it only gives the
contents of the $GIT_DIR/rebase-apply/patch file, i.e. the patch
proper, but the option actually emits the entire e-mail message
being processed (iow, one of the output files from "git mailsplit").
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to t/README, test_must_fail() should only be used to test for
failure in Git commands. Replace the invocations of
`test_must_fail grep` with `! grep`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, the CI/PR builds that build and test using Visual Studio
were implemented imitating `linux-clang`, i.e. still using the
`Makefile`-based build infrastructure.
Later (but still before the patches made their way into git.git's
`master`), however, this was changed to generate Visual Studio project
files and build the binaries using `MSBuild`, as this reflects more
accurately how Visual Studio users would want to build Git (internally,
Visual Studio uses `MSBuild`, or at least something very similar).
During that transition, we needed to implement a new way to run the test
suite in parallel, as Visual Studio users typically will only have a Git
Bash available (which does not ship with `make` nor with support for
`prove`): we simply implemented a new test helper to run the test suite.
This helper even knows how to run the tests in parallel, but due to a
mistake on this developer's part, it was never turned on in the CI/PR
builds. This results in 2x-3x longer run times of the test phase.
Let's use the `--jobs=10` option to fix this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To make full use of the work that went into the Visual Studio build &
test jobs in our CI/PR builds, let's turn on strict compiler flags. This
will give us the benefit of Visual C's compiler warnings (which, at
times, seem to catch things that GCC does not catch, and vice versa).
While at it, also turn on optimization; It does not make sense to
produce binaries with debug information, and we can use any ounce of
speed that we get (because the test suite is particularly slow on
Windows, thanks to the need to run inside a Unix shell, which
requires us to use the POSIX emulation layer provided by MSYS2).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While reviewing some dts diffs recently I noticed that the hunk header
logic was failing to find the containing node. This is because the regex
doesn't consider properties that may span multiple lines, i.e.
property = <something>,
<something_else>;
and it got hung up on comments inside nodes that look like the root node
because they start with '/*'. Add tests for these cases and update the
regex to find them. Maybe detecting the root node is too complicated but
forcing it to be a backslash with any amount of whitespace up to an open
bracket seemed OK. I tried to detect that a comment is in-between the
two parts but I wasn't happy so I just dropped it.
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are looking at bitfield constants, and elsewhere in the Git source
code, such cases are handled via bit shift operators rather than octal
numbers, which also makes it easier to spot holes in the range
(if, say, 1<<5 was missing, it is easier to spot it between 1<<4
and 1<<6 than it is to spot a missing 040 between a 020 and a 0100).
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since --preserve-merges has been deprecated in favour of
--rebase-merges, mark this option as hidden so it no longer shows up in
the usage and completions.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the command description, including paragraph spacing.
Git URLs can accept bundle files for fetch, pull and clone, include
in that section. Include git clone in the bundle usage description.
Correct the quoting of <git-rev-list-args>.
Detail the <git-rev-list-args> options for cloning a complete repo.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing a tag, we may end up with a NULL "tagged" field when
there's a type mismatch (e.g., the tag claims to point to object X as a
commit, but we previously saw X as a blob in the same process), but we
do not otherwise indicate a parse failure to the caller.
This is similar to the case discussed in the previous commit, where a
commit could end up with a NULL tree field: while slightly convenient
for callers who want to overlook a corrupt object, it means that normal
callers have to explicitly deal with this case (rather than just relying
on the return code from parsing). And most don't, leading to segfault
fixes like the one in c77722b3ea (use get_tagged_oid(), 2019-09-05).
Let's address this more centrally, by returning an error code from the
parse itself, which most callers would already notice (adventurous
callers are free to ignore the error and continue looking at the
struct).
This also covers the case where the tag contains a nonsensical "type"
field (there we produced a user-visible error but still returned success
to the caller; now we'll produce a slightly better message and return an
error).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If parsing a commit yields a valid tree oid, but we've seen that same
oid as a non-tree in the same process, the resulting commit struct will
end up with a NULL tree pointer, but not otherwise report a parsing
failure.
That's perhaps convenient for callers which want to operate on even
partially corrupt commits (e.g., by still looking at the parents). But
it leaves a potential trap for most callers, who now have to manually
check for a NULL tree. Most do not, and it's likely that there are
possible segfaults lurking. I say "possible" because there are many
candidates, and I don't think it's worth following through on
reproducing them when we can just fix them all in one spot. And
certainly we _have_ seen real-world cases, such as the one fixed by
806278dead (commit-graph.c: handle corrupt/missing trees, 2019-09-05).
Note that we can't quite drop the check in the caller added by that
commit yet, as there's some subtlety with repeated parsings (which will
be addressed in a future commit).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While parsing the parents of a commit, if we are able to parse an actual
oid but lookup_commit() fails on it (because we previously saw it in
this process as a different object type), we silently omit the parent
and do not report any error to the caller.
The caller has no way of knowing this happened, because even an empty
parent list is a valid parse result. As a result, it's possible to fool
our "rev-list" connectivity check into accepting a corrupted set of
objects.
There's a test for this case already in t6102, but unfortunately it has
a slight error. It creates a broken commit with a parent line pointing
to a blob, and then checks that rev-list notices the problem in two
cases:
1. the "lone" case: we traverse the broken commit by itself (here we
try to actually load the blob from disk and find out that it's not
a commit)
2. the "seen" case: we parse the blob earlier in the process, and then
when calling lookup_commit() we realize immediately that it's not a
commit
The "seen" variant for this test mistakenly parsed another commit
instead of the blob, meaning that we were actually just testing the
"lone" case again. Changing that reveals the breakage (and shows that
this fixes it).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 't0500-progress-display.sh' all tests running 'test-tool progress
--total=<N>' fail on big-endian systems, e.g. like this:
+ test-tool progress --total=3 Working hard
[...]
+ test_i18ncmp expect out
--- expect 2019-10-18 23:07:54.765523916 +0000
+++ out 2019-10-18 23:07:54.773523916 +0000
@@ -1,4 +1,2 @@
-Working hard: 33% (1/3)<CR>
-Working hard: 66% (2/3)<CR>
-Working hard: 100% (3/3)<CR>
-Working hard: 100% (3/3), done.
+Working hard: 0% (1/12884901888)<CR>
+Working hard: 0% (3/12884901888), done.
The reason for that bogus value is that '--total's parameter is parsed
via parse-options's OPT_INTEGER into a uint64_t variable [1], so the
two bits of 3 end up in the "wrong" bytes on big-endian systems
(12884901888 = 0x300000000).
Change the type of that variable from uint64_t to int, to match what
parse-options expects; in the tests of the progress output we won't
use values that don't fit into an int anyway.
[1] start_progress() expects the total number as an uint64_t, that's
why I chose the same type when declaring the variable holding the
value given on the command line.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
[jpag: Debian unstable/ppc64 (big-endian)]
Tested-By: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
[tz: Fedora s390x (big-endian)]
Tested-By: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original comment does not describe type of ~/.zsh/_git explicitly
and zsh does not warn or fail if a user create it as a dictionary.
So unexperienced users could be misled by the original comment.
There is a small update to clarify it.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Maxim Belsky <public.belsky@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git stash save" lost local changes to submodules, which has been
corrected.
* jj/stash-reset-only-toplevel:
stash: avoid recursive hard reset on submodules
"git format-patch -o <outdir>" did an equivalent of "mkdir <outdir>"
not "mkdir -p <outdir>", which is being corrected.
* bw/format-patch-o-create-leading-dirs:
format-patch: create leading components of output directory
94da9193a6 ("grep: add support for PCRE v2", 2017-06-01) introduced
a small memory leak visible with valgrind in t7813.
Complete the creation of a PCRE2 specific variable that was missing from
the original change and free the generated table just like it is done
for PCRE1.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
94da9193a6 (grep: add support for PCRE v2, 2017-06-01) didn't include
a way to override the system allocator, and so it is incompatible with
custom allocators (e.g. nedmalloc). This problem became obvious when we
tried to plug a memory leak by `free()`ing a data structure allocated by
PCRE2, triggering a segfault in Windows (where we use nedmalloc by
default).
PCRE2 requires the use of a general context to override the allocator
and therefore, there is a lot more code needed than in PCRE1, including
a couple of wrapper functions.
Extend the grep API with a "destructor" that could be called to cleanup
any objects that were created and used globally.
Update `builtin/grep.c` to use that new API, but any other future users
should make sure to have matching `grep_init()`/`grep_destroy()` calls
if they are using the pattern matching functionality.
Move some of the logic that was before done per thread (in the workers)
into an earlier phase to avoid degrading performance, but as the use of
PCRE2 with custom allocators is better understood it is expected more of
its functions will be instructed to use the custom allocator as well as
was done in the original code[1] this work was based on.
[1] https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) didn't include a way
to override the system alocator, and so it is incompatible with
USE_NED_ALLOCATOR as reported by Dscho[1] (in similar code from PCRE2)
Note that nedmalloc, as well as other custom allocators like jemalloc
and mi-malloc, can be configured at runtime (via `LD_PRELOAD`),
therefore we cannot know at compile time whether a custom allocator is
used or not.
Make the minimum change possible to ensure this combination is supported
by extending `grep_init()` to set the PCRE1 specific functions to Git's
idea of `malloc()` and `free()` and therefore making sure all
allocations are done inside PCRE1 with the same allocator than the rest
of Git.
This change has negligible performance impact: PCRE needs to allocate
memory once per program run for the character table and for each pattern
compilation. These are both rare events compared to matching patterns
against lines. Actual measurements[2] show that the impact is lost in
the noise.
[1] https://public-inbox.org/git/pull.306.git.gitgitgadget@gmail.com
[2] https://public-inbox.org/git/7f42007f-911b-c570-17f6-1c6af0429586@web.de
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The builtin/notes.c::copy() function is prepared to handle either
one or two arguments given from the command line; when one argument
is given, to-obj defaults to HEAD.
bbb1b8a3 ("notes: check number of parameters to "git notes copy"",
2010-06-28) tried to make sure "git notes copy" (with *no* other
argument) does not dereference NULL by checking the number of
parameters, but it incorrectly insisted that we need two arguments,
instead of either one or two. This disabled the defaulting to-obj
to HEAD.
Correct it.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit bbb1b8a35a ("notes: check number of parameters to "git notes
copy"", 2010-06-28) added a test for too many or too few of
parameters provided to `git notes copy'.
However, the test only ensures that the command will fail but it
doesn't really check if it fails because of number of parameters.
If we accidentally lifted the check inside our code base, the test
may still have failed because the provided parameter is not a valid
ref.
Correct it.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing more than one reference with the --atomic option, the
server is supposed to perform a single atomic transaction to update the
references, leaving them either all to succeed or all to fail. This
works fine when pushing locally or over SSH, but when pushing over HTTP,
we fail to pass the atomic capability to the remote side. In fact, we
have not reported this capability to any remote helpers during the life
of the feature.
Now normally, things happen to work nevertheless, since we actually
check for most types of failures, such as non-fast-forward updates, on
the client side, and just abort the entire attempt. However, if the
server side reports a problem, such as the inability to lock a ref, the
transaction isn't atomic, because we haven't passed the appropriate
capability over and the remote side has no way of knowing that we wanted
atomic behavior.
Fix this by passing the option from the transport code through to remote
helpers, and from the HTTP remote helper down to send-pack. With this
change, we can detect if the server side rejects the push and report
back appropriately. Note the difference in the messages: the remote
side reports "atomic transaction failed", while our own checking rejects
pushes with the message "atomic push failed".
Document the atomic option in the remote helper documentation, so other
implementers can implement it if they like.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 04005834ed ("log: fix coloring of certain octopus merge shapes",
2018-09-01) there is a fix for the coloring of dashes following an
octopus merge. It makes a distinction between the case where all parents
introduce a new column, versus the case where the first parent collapses
into an existing column:
| *-. | *-.
| |\ \ | |\ \
| | | | |/ / /
The latter case means that the columns for the merge parents begin one
place to the left in the `new_columns` array compared to the former
case.
However, the implementation only works if the commit's parents are kept
in order as they map onto the visual columns, as we get the colors by
iterating over `new_columns` as we print the dashes. In general, the
commit's parents can arbitrarily merge with existing columns, and change
their ordering in the process.
For example, in the following diagram, the number of each column
indicates which commit parent appears in each column.
| | *---.
| | |\ \ \
| | |/ / /
| |/| | /
| |_|_|/
|/| | |
3 1 0 2
If the columns are colored (red, green, yellow, blue), then the dashes
will currently be colored yellow and blue, whereas they should be blue
and red.
To fix this, we need to look up each column in the `mapping` array,
which before the `GRAPH_COLLAPSING` state indicates which logical column
is displayed in each visual column. This implementation is simpler as it
doesn't have any edge cases, and it also handles how left-skewed first
parents are now displayed:
| *-.
|/|\ \
| | | |
0 1 2 3
The color of the first dashes is always the color found in `mapping` two
columns to the right of the commit symbol. Because commits are displayed
after all edges have been collapsed together and the visual columns
match the logical ones, we can find the visual offset of the commit
symbol using `commit_index`.
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a merge commit is printed and its final parent is the same commit
that occupies the column to the right of the merge, this results in a
kink in the displayed edges:
* |
|\ \
| |/
| *
Graphs containing these shapes can be hard to read, as the expansion to
the right followed immediately by collapsing back to the left creates a
lot of zig-zagging edges, especially when many columns are present.
We can improve this by eliminating the zig-zag and having the merge's
final parent edge fuse immediately with its neighbor:
* |
|\|
| *
This reduces the horizontal width for the current commit by 2, and
requires one less row, making the graph display more compact. Taken in
combination with other graph-smoothing enhancements, it greatly
compresses the space needed to display certain histories:
*
|\
| * *
| |\ |\
| | * | *
| | | | |\
| | \ | | *
| *-. \ | * |
| |\ \ \ => |/|\|
|/ / / / | | *
| | | / | * |
| | |/ | |/
| | * * /
| * | |/
| |/ *
* |
|/
*
One of the test cases here cannot be correctly rendered in Git v2.23.0;
it produces this output following commit E:
| | *-. \ 5_E
| | |\ \ \
| |/ / / /
| | | / _
| |_|/
|/| |
The new implementation makes sure that the rightmost edge in this
history is not left dangling as above.
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a graph contains edges that are in the process of collapsing to the
left, but those edges cross a commit line, the effect is that the edges
have a jagged appearance:
*
|\
| *
| \
*-. \
|\ \ \
| | * |
| * | |
| |/ /
* | |
|/ /
* |
|/
*
We already takes steps to smooth edges like this when they're expanding;
when an edge appears to the right of a merge commit marker on a
GRAPH_COMMIT line immediately following a GRAPH_POST_MERGE line, we
render it as a `\`:
* \
|\ \
| * \
| |\ \
We can make a similar improvement to collapsing edges, making them
easier to follow and giving the overall graph a feeling of increased
symmetry:
*
|\
| *
| \
*-. \
|\ \ \
| | * |
| * | |
| |/ /
* / /
|/ /
* /
|/
*
To do this, we introduce a new special case for edges on GRAPH_COMMIT
lines that immediately follow a GRAPH_COLLAPSING line. By retaining a
copy of the `mapping` array used to render the GRAPH_COLLAPSING line in
the `old_mapping` array, we can determine that an edge is collapsing
through the GRAPH_COMMIT line and should be smoothed.
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The change I'm about to make requires being able to inspect the mapping
array that was used to render the last GRAPH_COLLAPSING line while
rendering a GRAPH_COMMIT line. The `new_mapping` array currently exists
as a pre-allocated space for computing the next `mapping` array during
`graph_output_collapsing_line()`, but we can repurpose it to let us see
the previous `mapping` state.
To support this use it will make more sense if this array is named
`old_mapping`, as it will contain the mapping data for the previous line
we rendered, at the point we're rendering a commit line.
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Following the introduction of "left-skewed" merges, which are merges
whose first parent fuses with another edge to its left, we have some
more edge cases to deal with in the display of commit and post-merge
lines.
The current graph code handles the following cases for edges appearing
to the right of the commit (*) on commit lines. A 2-way merge is usually
followed by vertical lines:
| | |
| * |
| |\ \
An octopus merge (more than two parents) is always followed by edges
sloping to the right:
| | \ | | \
| *-. \ | *---. \
| |\ \ \ | |\ \ \ \
A 2-way merge is followed by a right-sloping edge if the commit line
immediately follows a post-merge line for a commit that appears in the
same column as the current commit, or any column to the left of that:
| * | * |
| |\ | |\ \
| * \ | | * \
| |\ \ | | |\ \
This commit introduces the following new cases for commit lines. If a
2-way merge skews to the left, then the edges to its right are always
vertical lines, even if the commit follows a post-merge line:
| | | | |\
| * | | * |
|/| | |/| |
A commit with 3 parents that skews left is followed by vertical edges:
| | |
| * |
|/|\ \
If a 3-way left-skewed merge commit appears immediately after a
post-merge line, then it may be followed the right-sloping edges, just
like a 2-way merge that is not skewed.
| |\
| * \
|/|\ \
Octopus merges with 4 or more parents that skew to the left will always
be followed by right-sloping edges, because the existing columns need to
expand around the merge.
| | \
| *-. \
|/|\ \ \
On post-merge lines, usually all edges following the current commit
slope to the right:
| * | |
| |\ \ \
However, if the commit is a left-skewed 2-way merge, the edges to its
right remain vertical. We also need to display a space after the
vertical line descending from the commit marker, whereas this line would
normally be followed by a backslash.
| * | |
|/| | |
If a left-skewed merge has more than 2 parents, then the edges to its
right are still sloped as they bend around the edges introduced by the
merge.
| * | |
|/|\ \ \
To handle these new cases, we need to know not just how many parents
each commit has, but how many new columns it adds to the display; this
quantity is recorded in the `edges_added` field for the current commit,
and `prev_edges_added` field for the previous commit.
Here, "column" refers to visual columns, not the logical columns of the
`columns` array. This is because even if all the commit's parents end up
fusing with existing edges, they initially introduce distinct edges in
the commit and post-merge lines before those edges collapse. For
example, a 3-way merge whose 2nd and 3rd parents fuse with existing
edges still introduces 2 visual columns that affect the display of edges
to their right.
| | | \
| | *-. \
| | |\ \ \
| |_|/ / /
|/| | / /
| | |/ /
| |/| |
| | | |
This merge does not introduce any *logical* columns; there are 4 edges
before and after this commit once all edges have collapsed. But it does
initially introduce 2 new edges that need to be accommodated by the
edges to their right.
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, when we display a merge whose first parent is already present
in a column to the left of the merge commit, we display the first parent
as a vertical pipe `|` in the GRAPH_POST_MERGE line and then immediately
enter the GRAPH_COLLAPSING state. The first-parent line tracks to the
left and all the other parent lines follow it; this creates a "kink" in
those lines:
| *---.
| |\ \ \
|/ / / /
| | | *
This change tidies the display of such commits such that if the first
parent appears to the left of the merge, we render it as a `/` and the
second parent as a `|`. This reduces the horizontal and vertical space
needed to render the merge, and makes the resulting lines easier to
read.
| *-.
|/|\ \
| | | *
If the first parent is separated from the merge by several columns, a
horizontal line is drawn in a similar manner to how the GRAPH_COLLAPSING
state displays the line.
| | | *-.
| |_|/|\ \
|/| | | | *
This effect is applied to both "normal" two-parent merges, and to
octopus merges. It also reduces the vertical space needed for pre-commit
lines, as the merge occupies one less column than usual.
Before: After:
| * | *
| |\ | |\
| | \ | * \
| | \ |/|\ \
| *-. \
| |\ \ \
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commits following this one introduce a series of improvements to the
layout of graphs, tidying up a few edge cases, namely:
- merge whose first parent fuses with an existing column to the left
- merge whose last parent fuses with its immediate neighbor on the right
- edges that collapse to the left above and below a commit line
This test case exemplifies these cases and provides a motivating example
of the kind of history I'm aiming to clear up.
The first parent of merge E is the same as the parent of H, so those
edges fuse together.
* H
|
| *-. E
| |\ \
|/ / /
|
* B
We can "skew" the display of this merge so that it doesn't introduce
additional columns that immediately collapse:
* H
|
| * E
|/|\
|
* B
The last parent of E is D, the same as the parent of F which is the edge
to the right of the merge.
* F
|
\
*-. \ E
|\ \ \
/ / / /
| /
|/
* D
The two edges leading to D could be fused sooner: rather than expanding
the F edge around the merge and then letting the edges collapse, the F
edge could fuse with the E edge in the post-merge line:
* F
|
\
*-. | E
|\ \|
/ / /
|
* D
If this is combined with the "skew" effect above, we get a much cleaner
graph display for these edges:
* F
|
* | E
/|\|
|
* D
Finally, the edge leading from C to A appears jagged as it passes
through the commit line for B:
| * | C
| |/
* | B
|/
* A
This can be smoothed out so that such edges are easier to read:
| * | C
| |/
* / B
|/
* A
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This computation is repeated in a couple of places and I need to add
another condition to it to implement a further improvement to the graph
rendering, so I'm extracting this into a function.
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's a duplication of logic between `graph_insert_into_new_columns()`
and `graph_update_width()`. `graph_insert_into_new_columns()` is called
repeatedly by `graph_update_columns()` with an `int *` that tracks the
offset into the `mapping` array where we should write the next value.
Each call to `graph_insert_into_new_columns()` effectively pushes one
column index and one "null" value (-1) onto the `mapping` array and
therefore increments `mapping_idx` by 2.
`graph_update_width()` duplicates this process: the `width` of the graph
is essentially the initial width of the `mapping` array before edges
begin collapsing. The `graph_update_width()` function's logic
effectively works out how many times `graph_insert_into_new_columns()`
was called based on the relationship of the current commit to the rest
of the graph.
I'm about to make some changes that make the assignment of values into
the `mapping` array more complicated. Rather than make
`graph_update_width()` more complicated at the same time, we can simply
remove this function and use `graph->width` to track the offset into the
`mapping` array as we're building it. This removes the duplication and
makes sure that `graph->width` is the same as the visual width of the
`mapping` array once `graph_update_columns()` is complete.
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I will shortly be making some changes to this function and so am trying
to simplify it. It currently contains some duplicated logic; both
branches the function can take assign the commit's column index into
the `mapping` array and increment `mapping_index`.
Here I change the function so that the only conditional behaviour is
that it appends the commit to `new_columns` if it's not present. All
manipulation of `mapping` now happens on a single code path.
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I will shortly be making some changes to
`graph_insert_into_new_columns()` and so am trying to simplify it. One
possible simplification is that we can extract the loop for finding the
element in `new_columns` containing the given commit.
`find_new_column_by_commit()` contains a very similar loop but it
returns a `struct column *` rather than an `int` offset into the array.
Here I'm introducing a version that returns `int` and using that in
`graph_insert_into_new_columns()` and `graph_output_post_merge_line()`.
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the display width of graph lines is implicitly tracked via the
`graph_line` interface, the calls to `graph_pad_horizontally()` no
longer need to be located inside the individual output functions, where
the character counting was previously being done.
All the functions called by `graph_next_line()` generate a line of
output, then call `graph_pad_horizontally()`, and finally change the
graph state if necessary. As padding is the final change to the output
done by all these functions, it can be removed from all of them and done
in `graph_next_line()` instead.
I've also moved the guard in `graph_output_padding_line()` that checks
the graph has a commit; this function is only called by
`graph_next_line()` and we must not pad the `graph_line` if no commit is
set.
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All the output functions called by `graph_next_line()` currently keep
track of how many printable chars they've written to the buffer, before
calling `graph_pad_horizontally()` to pad the line with spaces. Some
functions do this by incrementing a counter whenever they write to the
buffer, and others do it by encoding an assumption about how many chars
are written, as in:
graph_pad_horizontally(graph, sb, graph->num_columns * 2);
This adds a fair amount of noise to the functions' logic and is easily
broken if one forgets to increment the right counter or update the
calculations used for padding.
To make this easier to use, I'm introducing a new struct called
`graph_line` that wraps a `strbuf` and keeps count of its display width
implicitly. `graph_next_line()` wraps this around the `struct strbuf *`
it's given and passes a `struct graph_line *` to the output functions,
which use its interface.
The `graph_line` interface wraps the `strbuf_addch()`,
`strbuf_addchars()` and `strbuf_addstr()` functions, and adds the
`graph_line_write_column()` function for adding a single character with
color formatting. The `graph_pad_horizontally()` function can then use
the `width` field from the struct rather than taking a character count
as a parameter.
Signed-off-by: James Coglan <jcoglan@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The specification of promisor packfiles (in partial-clone.txt) states
that the .promisor files that accompany packfiles do not matter (just
like .keep files), so whenever a packfile is fetched from the promisor
remote, Git has been writing empty .promisor files. But these files
could contain more useful information.
So instead of writing empty files, write the refs fetched to these
files. This makes it easier to debug issues with partial clones, as we
can identify what refs (and their associated hashes) were fetched at the
time the packfile was downloaded, and if necessary, compare those hashes
against what the promisor remote reports now.
This is implemented by teaching fetch-pack to write its own non-empty
.promisor file whenever it knows the name of the pack's lockfile. This
covers the case wherein the user runs "git fetch" with an internal
protocol or HTTP protocol v2 (fetch_refs_via_pack() in transport.c sets
lock_pack) and with HTTP protocol v0/v1 (fetch_git() in remote-curl.c
passes "--lock-pack" to "fetch-pack").
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Acked-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to commit 356ee4659b ("sequencer: try to commit without forking
'git commit'", 2017-11-24) the sequencer would always run the
post-commit hook after each pick or revert as it forked `git commit` to
create the commit. The conversion to committing without forking `git
commit` omitted to call the post-commit hook after creating the commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function was declared in commit.h but was implemented in
builtin/commit.c so was not part of libgit. Move it to libgit so we can
use it in the sequencer. This simplifies the implementation of
run_prepare_commit_msg_hook() and will be used in the next commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 65850686cf ("rebase -i: rewrite write_basic_state() in C",
2018-08-28) accidentially added new function declarations after
the #endif at the end of the include guard.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests were calling set_fake_editor to ensure they had a sane no-op
editor set. Now that all the editor setting is done in subshells these
tests can rely on EDITOR=: and so do not need to call set_fake_editor.
Also add a test at the end to detect any future additions messing with
the exported value of $EDITOR.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As $EDITOR is exported setting it in one test affects all subsequent
tests. Avoid this by always setting it in a subshell. This commit leaves
20 calls to set_fake_editor that are not in subshells as they can
safely be removed in the next commit once all the other editor setting
is done inside subshells.
I have moved the call to set_fake_editor in some tests so it comes
immediately before the call to 'git rebase' to avoid moving unrelated
commands into the subshell. In one case ('rebase -ix with
--autosquash') the call to set_fake_editor is moved past an invocation
of 'git rebase'. This is safe as that invocation of 'git rebase'
requires EDITOR=: or EDITOR=fake-editor.sh without FAKE_LINES being
set which will be the case as the preceding tests either set their
editor in a subshell or call set_fake_editor without setting FAKE_LINES.
In a one test ('auto-amend only edited commits after "edit"') a call
to test_tick are now in a subshell. I think this is OK as it is there
to set the date for the next commit which is executed in the same
subshell rather than updating GIT_COMMITTER_DATE for later tests (the
next test calls test_tick before doing anything else).
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Neither of the commands executed in the subshell change any shell
variables or the current directory so there is no need for them to be
executed in a subshell.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, when format-patch generated a cover letter, only the body would
be populated with a branch's description while the subject would be
populated with placeholder text. However, users may want to have the
subject of their cover letter automatically populated in the same way.
Teach format-patch to accept the `--cover-from-description` option and
corresponding `format.coverFromDescription` config, allowing users to
populate different parts of the cover letter (including the subject
now).
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, `thread` and `config_cover_letter` were defined as ints even
though they behaved as enums. Define actual enums and change these
variables to use these new definitions.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 30984ed2e9 (format-patch: support deep threading, 2009-02-19),
introduced the following lines:
#define THREAD_SHALLOW 1
[...]
thread = git_config_bool(var, value) && THREAD_SHALLOW;
Since git_config_bool() returns a bool, the trailing `&& THREAD_SHALLOW`
is a no-op. Replace this errorneous and condition with a ternary
statement so that it is clear what the configured value is when a
boolean is given.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This changes the indent from
"<tab><sp><sp><sp><sp><sp><sp><sp><sp>"
to
"<tab><tab>"
so that the statement lines up with the rest of the block.
Signed-off-by: Norman Rasmussen <norman@rasmussen.co.za>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git range-diff" failed to handle mode-only change, which has been
corrected.
* tg/range-diff-output-update:
range-diff: don't segfault with mode-only changes
Pretty-printed command line formatter (used in e.g. reporting the
command being run by the tracing API) had a bug that lost an
argument that is an empty string, which has been corrected.
* gs/sq-quote-buf-pretty:
sq_quote_buf_pretty: don't drop empty arguments
The trace2 output, when sending them to files in a designated
directory, can populate the directory with too many files; a
mechanism is introduced to set the maximum number of files and
discard further logs when the maximum is reached.
* js/trace2-cap-max-output-files:
trace2: write discard message to sentinel files
trace2: discard new traces if target directory has too many files
docs: clarify trace2 version invariants
docs: mention trace2 target-dir mode in git-config
"git log --graph" for an octopus merge is sometimes colored
incorrectly, which is demonstrated and documented but not yet
fixed.
* dl/octopus-graph-bug:
t4214: demonstrate octopus graph coloring failure
t4214: explicitly list tags in log
t4214: generate expect in their own test cases
t4214: use test_merge
test-lib: let test_merge() perform octopus merges
Updates to fast-import/export.
* en/fast-imexport-nested-tags:
fast-export: handle nested tags
t9350: add tests for tags of things other than a commit
fast-export: allow user to request tags be marked with --mark-tags
fast-export: add support for --import-marks-if-exists
fast-import: add support for new 'alias' command
fast-import: allow tags to be identified by mark labels
fast-import: fix handling of deleted tags
fast-export: fix exporting a tag and nothing else
CI updates.
* js/azure-pipelines-msvc:
ci: also build and test with MS Visual Studio on Azure Pipelines
ci: really use shallow clones on Azure Pipelines
tests: let --immediate and --write-junit-xml play well together
test-tool run-command: learn to run (parts of) the testsuite
vcxproj: include more generated files
vcxproj: only copy `git-remote-http.exe` once it was built
msvc: work around a bug in GetEnvironmentVariable()
msvc: handle DEVELOPER=1
msvc: ignore some libraries when linking
compat/win32/path-utils.h: add #include guards
winansi: use FLEX_ARRAY to avoid compiler warning
msvc: avoid using minus operator on unsigned types
push: do not pretend to return `int` from `die_push_simple()`
"git fetch --jobs=<n>" allowed <n> parallel jobs when fetching
submodules, but this did not apply to "git fetch --multiple" that
fetches from multiple remote repositories. It now does.
* js/fetch-jobs:
fetch: let --jobs=<n> parallelize --multiple, too
The merge-recursive machiery is one of the most complex parts of
the system that accumulated cruft over time. This large series
cleans up the implementation quite a bit.
* en/merge-recursive-cleanup: (26 commits)
merge-recursive: fix the fix to the diff3 common ancestor label
merge-recursive: fix the diff3 common ancestor label for virtual commits
merge-recursive: alphabetize include list
merge-recursive: add sanity checks for relevant merge_options
merge-recursive: rename MERGE_RECURSIVE_* to MERGE_VARIANT_*
merge-recursive: split internal fields into a separate struct
merge-recursive: avoid losing output and leaking memory holding that output
merge-recursive: comment and reorder the merge_options fields
merge-recursive: consolidate unnecessary fields in merge_options
merge-recursive: move some definitions around to clean up the header
merge-recursive: rename merge_options argument to opt in header
merge-recursive: rename 'mrtree' to 'result_tree', for clarity
merge-recursive: use common name for ancestors/common/base_list
merge-recursive: fix some overly long lines
cache-tree: share code between functions writing an index as a tree
merge-recursive: don't force external callers to do our logging
merge-recursive: remove useless parameter in merge_trees()
merge-recursive: exit early if index != head
Ensure index matches head before invoking merge machinery, round N
merge-recursive: remove another implicit dependency on the_repository
...
Use argv_array to build an array of strings instead of open-coding it.
This simplifies the code a bit.
We also need to make the specs parameter of push(), push_dav() and
push_git() const to match the argv member of the argv_array. That's
fine, as all three only actually read from the specs array anyway.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make use of utf8_strnwidth()'s feature to skip ANSI escape sequences
instead of open-coding it. This shortens the code and makes it more
consistent.
This changes the behavior, though: The old code skips all kinds of
Control Sequence Introducer sequences, while utf8_strnwidth() only skips
the Select Graphic Rendition kind, i.e. those ending with "m". They are
used for specifying color and font attributes like boldness. The only
other kind of escape sequence we print in Git is Erase in Line, ending
with "K". That's not used for columnar output, so this difference
actually doesn't matter here.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first step for deleting an item from a linked list is to locate the
item preceding it. Be more careful in release_request() and handle an
empty list. This only has consequences for invalid delete requests
(removing the same item twice, or deleting an item that was never added
to the list), but simplifies the loop condition as well as the check
after the loop.
Once we found the item's predecessor in the list, update its next
pointer to skip over the item, which removes it from the list. In other
words: Make the item's successor the successor of its predecessor.
(At this point entry->next == request and prev->next == lock,
respectively.) This is a bit simpler and saves a pointer dereference.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git stash push does not recursively stash submodules, but if
submodule.recurse is set, it may recursively reset --hard them. Having
only the destructive action recurse is likely to be surprising
behaviour, and unlikely to be desirable, so the easiest fix should be to
ensure that the call to git reset --hard never recurses into submodules.
This matches the behavior of check_changes_tracked_files, which ignores
submodules.
Signed-off-by: Jakob Jarmar <jakob@jarmar.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is a good idea to have a readme so people finding the project can
know more about it, and know how they can get involved.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Acked-by: Bert Wesarg <bert.wesarg@googlemail.com>
'git format-patch -o <outdir>' did an equivalent of 'mkdir <outdir>'
not 'mkdir -p <outdir>', which is being corrected.
Avoid the usage of 'adjust_shared_perm' on the leading directories which
may have security implications. Achieved by temporarily disabling of
'config.sharedRepository' like 'git init' does.
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The parameter marker for x was garbled in its introduction in 89c855ed3c
("git-compat-util.h: implement a different ARRAY_SIZE macro for for
safely deriving the size of array", 2015-04-30).
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This macro has been available globally since b4f2a6ac92 ("Use #define
ARRAY_SIZE(x) (sizeof(x)/sizeof(x[0]))", 2006-03-09), so let's use it.
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Existing documentation on object walks seems to be primarily intended
as a reference for those already familiar with the procedure. This
tutorial attempts to give an entry-level guide to a couple of bare-bones
object walks so that new Git contributors can learn the concepts
without having to wade through options parsing or special casing.
The target audience is a Git contributor who is just getting started
with the concept of object walking. The goal is to prepare this
contributor to be able to understand and modify existing commands which
perform revision walks more easily, although it will also prepare
contributors to create new commands which perform walks.
The tutorial covers a basic overview of the structs involved during
object walk, setting up a basic commit walk, setting up a basic
all-object walk, and adding some configuration changes to both walk
types. It intentionally does not cover how to create new commands or
search for options from the command line or gitconfigs.
There is an associated patchset at
https://github.com/nasamuffin/git/tree/revwalk that contains a reference
implementation of the code generated by this tutorial.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While doing some testing with fsmonitor enabled I found
that git commands would segfault after staging and
unstaging an untracked file. Looking at the crash it
appeared that fsmonitor_ewah_callback was attempting to
adjust bits beyond the bounds of the index cache.
Digging into how this could happen it became clear that
the fsmonitor extension must have been written with
more bits than there were entries in the index. The
root cause ended up being that fill_fsmonitor_bitmap was
populating fsmonitor_dirty with bits for all entries in
the index, even those that had been marked for removal.
To solve this problem fill_fsmonitor_bitmap has been
updated to skip entries with the the CE_REMOVE flag set.
With this change the bits written for the fsmonitor
extension will be consistent with the index entries
written by do_write_index. Additionally, BUG checks
have been added to detect if the number of bits in
fsmonitor_dirty should ever exceed the number of
entries in the index again.
Another option that was considered was moving the call
to fill_fsmonitor_bitmap closer to where the index is
written (and where the fsmonitor extension itself is
written). However, that did not work as the
fsmonitor_dirty bitmap must be filled before the index
is split during writing.
Signed-off-by: William Baker <William.Baker@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the %.cocci.patch target was defined in 63f0a758a0 (add coccicheck
make target, 2016-09-15), it included a mechanism to suppress the noisy
output, similar to the $(QUIET_<x>) family of variables.
In the case where one wants to inspect the output hidden by
$(QUIET_<x>), one could define $(V) for verbose output. In the
%.cocci.patch target, this was not implemented.
Move the output suppression into the $(QUIET_SPATCH) variable which is
used like the other $(QUIET_<x>) variables. While we're at it, change
the number of spaces printed from 5 to 4, like the other variables
there.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In previous patches, extern was mechanically removed from function
declarations without care to formatting, causing parameter lists to be
misaligned. Manually format changed sections such that the parameter
lists are realigned.
Viewing this patch with 'git diff -w' should produce no output.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change test 'find value_list for a key from a configset' to redirect the
result to 'expect' instead of 'except' which was a typo.
With this change, the test case actually fails because it uses
`configset_get_value`. Clearly, this was intended to be
`configset_get_value_multi` since the test expects a list of values
instead of a single value, so let's fix that, too.
Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git add -i" has been taught to show the total number of hunks and
the hunks that has been processed so far when showing prompts.
* kt/add-i-progress:
add -i: show progress counter in the prompt
"git stash apply" in a subdirectory of a secondary worktree failed
to access the worktree correctly, which has been corrected.
* js/stash-apply-in-secondary-worktree:
stash apply: report status correctly even in a worktree's subdirectory
"git range-diff" segfaulted when diff.noprefix configuration was
used, as it blindly expected the patch it internally generates to
have the standard a/ and b/ prefixes. The command now forces the
internal patch to be built without any prefix, not to be affected
by any end-user configuration.
* js/range-diff-noprefix:
range-diff: internally force `diff.noprefix=true`
A few simplification and bugfixes to PCRE interface.
* ab/pcre-jit-fixes:
grep: under --debug, show whether PCRE JIT is enabled
grep: do not enter PCRE2_UTF mode on fixed matching
grep: stess test PCRE v2 on invalid UTF-8 data
grep: create a "is_fixed" member in "grep_pat"
grep: consistently use "p->fixed" in compile_regexp()
grep: stop using a custom JIT stack with PCRE v1
grep: stop "using" a custom JIT stack with PCRE v2
grep: remove overly paranoid BUG(...) code
grep: use PCRE v2 for optimized fixed-string search
grep: remove the kwset optimization
grep: drop support for \0 in --fixed-strings <pattern>
grep: make the behavior for NUL-byte in patterns sane
grep tests: move binary pattern tests into their own file
grep tests: move "grep binary" alongside the rest
grep: inline the return value of a function call used only once
t4210: skip more command-line encoding tests on MinGW
grep: don't use PCRE2?_UTF8 with "log --encoding=<non-utf8>"
log tests: test regex backends in "--encode=<enc>" tests
"git rebase -i" showed a wrong HEAD while "reword" open the editor.
* pw/rebase-i-show-HEAD-to-reword:
sequencer: simplify root commit creation
rebase -i: check for updated todo after squash and reword
rebase -i: always update HEAD before rewording
The author names taken from SVN repositories may have extra leading
or trailing whitespaces, which are now munged away.
* tk/git-svn-trim-author-name:
git-svn: trim leading and trailing whitespaces in author name
Preparation for SHA-256 upgrade continues.
* bc/object-id-part17: (26 commits)
midx: switch to using the_hash_algo
builtin/show-index: replace sha1_to_hex
rerere: replace sha1_to_hex
builtin/receive-pack: replace sha1_to_hex
builtin/index-pack: replace sha1_to_hex
packfile: replace sha1_to_hex
wt-status: convert struct wt_status to object_id
cache: remove null_sha1
builtin/worktree: switch null_sha1 to null_oid
builtin/repack: write object IDs of the proper length
pack-write: use hash_to_hex when writing checksums
sequencer: convert to use the_hash_algo
bisect: switch to using the_hash_algo
sha1-lookup: switch hard-coded constants to the_hash_algo
config: use the_hash_algo in abbrev comparison
combine-diff: replace GIT_SHA1_HEXSZ with the_hash_algo
bundle: switch to use the_hash_algo
connected: switch GIT_SHA1_HEXSZ to the_hash_algo
show-index: switch hard-coded constants to the_hash_algo
blame: remove needless comparison with GIT_SHA1_HEXSZ
...
"git clean" fixes.
* en/clean-nested-with-ignored:
dir: special case check for the possibility that pathspec is NULL
clean: fix theoretical path corruption
clean: rewrap overly long line
clean: avoid removing untracked files in a nested git repository
clean: disambiguate the definition of -d
git-clean.txt: do not claim we will delete files with -n/--dry-run
dir: add commentary explaining match_pathspec_item's return value
dir: if our pathspec might match files under a dir, recurse into it
dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case
dir: also check directories for matching pathspecs
dir: fix off-by-one error in match_pathspec_item
dir: fix typo in comment
t7300: add testcases showing failure to clean specified pathspecs
It's possible that somebody on the project committee is the subject of a
complaint. In that case, it may be useful to be able to contact the
other members individually, so let's make it clear that's an option.
This also serves to enumerate the set of people on the committee. That
lets you easily _know_ if you're in the situation mentioned above. And
it's just convenient to list who's involved in the process, since the
project committee list is not anywhere else in the repository.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We've never had a formally written Code of Conduct document. Though it
has been discussed off and on over the years, for the most part the
behavior on the mailing list has been good enough that nobody felt the
need to push one forward.
However, even if there aren't specific problems now, it's a good idea to
have a document:
- it puts everybody on the same page with respect to expectations.
This might avoid poor behavior, but also makes it easier to handle
it if it does happen.
- it publicly advertises that good conduct is important to us and will
be enforced, which may make some people more comfortable with
joining our community
- it may be a good time to cement our expectations when things are
quiet, since it gives everybody some distance rather than focusing
on a current contentious issue
This patch adapts the Contributor Covenant Code of Conduct. As opposed
to writing our own from scratch, this uses common and well-accepted
language, and strikes a good balance between illustrating expectations
and avoiding a laundry list of behaviors. It's also the same document
used by the Git for Windows project.
The text is taken mostly verbatim from:
https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
I also stole a very nice introductory paragraph from the Git for Windows
version of the file.
There are a few subtle points, though:
- the document refers to "the project maintainers". For the code, we
generally only consider there to be one maintainer: Junio C Hamano.
But for dealing with community issues, it makes sense to involve
more people to spread the responsibility. I've listed the project
committee address of git@sfconservancy.org as the contact point.
- the document mentions banning from the community, both in the intro
paragraph and in "Our Responsibilities". The exact mechanism here is
left vague. I can imagine it might start with social enforcement
(not accepting patches, ignoring emails) and could escalate to
technical measures if necessary (asking vger admins to block an
address). It probably make sense _not_ to get too specific at this
point, and deal with specifics as they come up.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: CB Bailey <cb@hashpling.org>
Acked-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Emily Shaffer <emilyshaffer@google.com>
Acked-by: Garima Singh <garimasigit@gmail.com>
Acked-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Jonathan Tan <jonathantanmy@google.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Elijah Newren <newren@gmail.com>
Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Thomas Gummerer <t.gummerer@gmail.com>
Acked-by: William Baker <williamtbakeremail@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to skip "UTF" and "UTF-" prefix, when computing an advice
message, did not work correctly when the prefix was "UTF", which
has been fixed.
* rs/convert-fix-utf-without-dash:
convert: fix handling of dashless UTF prefix in validate_encoding()
Miscellaneous code clean-ups.
* ah/cleanups:
git_mkstemps_mode(): replace magic numbers with computed value
wrapper: use a loop instead of repetitive statements
diffcore-break: use a goto instead of a redundant if statement
commit-graph: remove a duplicate assignment
The rename detection logic sorts a list of rename source candidates
by similarity to pick the best candidate, which means that a tie
between sources with the same similarity is broken by the original
location in the original candidate list (which is sorted by path).
Force the sorting by similarity done with a stable sort, which is
not promised by system supplied qsort(3), to ensure consistent
results across platforms.
* js/diff-rename-force-stable-sort:
diffcore_rename(): use a stable sort
Move git_sort(), a stable sort, into into libgit.a
Correct code that tried to reference all entries in a sparse array
of pointers by mistake.
* as/shallow-slab-use-fix:
shallow.c: don't free unallocated slabs
Currently, the tests for GIT_SKIP_TESTS do not cover the situation where
we skip an entire test suite. The tests also do not cover the situation
where we have GIT_SKIP_TESTS defined but the test suite does not match.
Add two test cases so we cover this blindspot.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When building the packfile to be sent, send_pack() is given a list of
remote refs to be used as exclusions. For each ref, it first checks if
the ref exists locally, and if it does, passes it with a "^" prefix to
pack-objects. However, in a partial clone, the check may trigger a lazy
fetch.
The additional commit ancestry information obtained during such fetches
may show that certain objects that would have been sent are already
known to the server, resulting in a smaller pack being sent. But this is
at the cost of fetching from many possibly unrelated refs, and the lazy
fetches do not help at all in the typical case where the client is
up-to-date with the upstream of the branch being pushed.
Ensure that these lazy fetches do not occur.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 6bd26f58ea (t4014: use test_line_count() where possible, 2019-08-27),
we converted many test cases to take advantage of the test_line_count()
function. In one conversion, we inverted the expected and actual value
as tested by test_line_count(). Although functionally correct, if
format-patch ever produced incorrect output, the debugging output would
be a bunch of hashes which would be difficult to debug.
Invert the expected and actual values provided to test_line_count() so
that if format-patch produces incorrect output, the debugging output
will be a list of human-readable files instead.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In ef283b3699 ("apply: make parse_git_diff_header public", 2019-07-11)
the 'parse_git_diff_header' function was made public and useable by
callers outside of apply.c.
However it was missed that its (then) only caller, 'find_header' did
some error handling, and completing 'struct patch' appropriately.
range-diff then started using this function, and tried to handle this
appropriately itself, but fell short in some cases. This in turn
would lead to range-diff segfaulting when there are mode-only changes
in a range.
Move the error handling and completing of the struct into the
'parse_git_diff_header' function, so other callers can take advantage
of it. This fixes the segfault in 'git range-diff'.
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous step added annotations with "the_repository" to various
functions in the push codepath in the transport layer, but they all
can take arbitrary repository pointer, and may be working on a
repository that is not the_repository. Fix them.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Empty arguments passed on the command line can be represented by
a '', however sq_quote_buf_pretty was incorrectly dropping these
arguments altogether. Fix this problem by ensuring that such
arguments are emitted as '' instead.
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 8e4ec337 ("merge-recursive: fix the diff3 common ancestor
label for virtual commits", 2019-10-01), which was a fix to commit
743474cbfa ("merge-recursive: provide a better label for diff3
common ancestor", 2019-08-17), the label for the common ancestor was
changed from always being
"merged common ancestors"
to instead be based on the number of merge bases and whether the merge
base was a real commit or a virtual one:
>=2: "merged common ancestors"
1, via merge_recursive_generic: "constructed merge base"
1, otherwise: <abbreviated commit hash>
0: "<empty tree>"
The handling for "constructed merge base" worked by allowing
opt->ancestor to be set in merge_recursive_generic(), so we paid
attention to the setting of that variable in merge_recursive_internal().
Now, for the outer merge, the code flow was simply the following:
ancestor_name = "merged merge bases"
loop over merge_bases: merge_recursive_internal()
The first merge base not needing recursion would determine its own
ancestor_name however necessary and thus run
ancestor_name = $SOMETHING
empty loop over merge_bases...
opt->ancestor = ancestor_name
merge_trees_internal()
Now, the next set of merge_bases that would need to be merged after this
particular merge had completed would note that opt->ancestor has been
set to something (to a local ancestor_name variable that has since been
popped off the stack), and thus it would run:
... else if (opt->ancestor) {
ancestor_name = opt->ancestor; /* OOPS! */
loop over merge_bases: merge_recursive_internal()
opt->ancestor = ancestor_name
merge_trees_internal()
This resulted in garbage strings being printed for the virtual merge
bases, which was visible in git.git by just merging commit b744c3af07
into commit 6d8cb22a4f. There are two ways to fix this: set
opt->ancestor to NULL after using it to avoid re-use, or add a
!opt->priv->call_depth check to the if block for using a pre-defined
opt->ancestor. Apply both fixes.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dev support.
* dl/honor-cflags-in-hdr-check:
ci: run `hdr-check` as part of the `Static Analysis` job
Makefile: emulate compile in $(HCO) target better
pack-bitmap.h: remove magic number
promisor-remote.h: include missing header
apply.h: include missing header
A bug in merge-recursive code that triggers when a branch with a
symbolic link is merged with a branch that replaces it with a
directory has been fixed.
* jt/merge-recursive-symlink-is-not-a-dir-in-way:
merge-recursive: symlink's descendants not in way
The hg-to-git script (in contrib/) has been updated to work with
Python 3.
* hb/hg-to-git-py3:
hg-to-git: make it compatible with both python3 and python2
The code to parse and use the commit-graph file has been made more
robust against corrupted input.
* tb/commit-graph-harden:
commit-graph.c: handle corrupt/missing trees
commit-graph.c: handle commit parsing errors
t/t5318: introduce failing 'git commit-graph write' tests
The cache-tree code has been taught to be less aggressive in
attempting to see if a tree object it computed already exists in
the repository.
* jt/cache-tree-avoid-lazy-fetch-during-merge:
cache-tree: do not lazy-fetch tentative tree
Coccinelle checks are done on more source files than before now.
* dl/cocci-everywhere:
Makefile: run coccicheck on more source files
Makefile: strip leading ./ in $(FIND_SOURCE_FILES)
Makefile: define THIRD_PARTY_SOURCES
Makefile: strip leading ./ in $(LIB_H)
The code used in following tags in "git fetch" has been optimized.
* ms/fetch-follow-tag-optim:
fetch: use oidset to keep the want OIDs for faster lookup
The object name parser for "Nth parent" syntax has been made more
robust against integer overflows.
* rs/nth-parent-parse:
sha1-name: check for overflow of N in "foo^N" and "foo~N"
rev-parse: demonstrate overflow of N for "foo^N" and "foo~N"
The object traversal machinery has been optimized not to load tree
objects when we are only interested in commit history.
* jk/list-objects-optim-wo-trees:
list-objects: don't queue root trees unless revs->tree_objects is set
The "upload-pack" (the counterpart of "git fetch") needs to disable
commit-graph when responding to a shallow clone/fetch request, but
the way this was done made Git panic, which has been corrected.
* jk/disable-commit-graph-during-upload-pack:
upload-pack: disable commit graph more gently for shallow traversal
commit-graph: bump DIE_ON_LOAD check to actual load-time
The command line completion for "git archive" and "git rebase" are
now made less prone to go out of sync with the binary.
* dl/complete-rebase-and-archive:
completion: teach archive to use __gitcomp_builtin
completion: teach rebase to use __gitcomp_builtin
A pair of small fixups to "git commit-graph" have been applied.
* jk/commit-graph-cleanup:
commit-graph: turn off save_commit_buffer
commit-graph: don't show progress percentages while expanding reachable commits
"git log --decorate-refs-exclude=<pattern>" was incorrectly
overruled when the "--simplify-by-decoration" option is used, which
has been corrected.
* rs/simplify-by-deco-with-deco-refs-exclude:
log-tree: call load_ref_decorations() in get_name_decoration()
log: test --decorate-refs-exclude with --simplify-by-decoration
The name of the blob object that stores the filter specification
for sparse cloning/fetching was interpreted in a wrong place in the
code, causing Git to abort.
* jk/partial-clone-sparse-blob:
list-objects-filter: use empty string instead of NULL for sparse "base"
list-objects-filter: give a more specific error sparse parsing error
list-objects-filter: delay parsing of sparse oid
t5616: test cloning/fetching with sparse:oid=<oid> filter
"git stash" learned to write refreshed index back to disk.
* tg/stash-refresh-index:
stash: make sure to write refreshed cache
merge: use refresh_and_write_cache
factor out refresh_and_write_cache function
Comments stating that "struct hashmap_entry" must be the first
member in a struct are no longer valid.
Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since these macros already take a `keyvar' pointer of a known type,
we can rely on OFFSETOF_VAR to get the correct offset without
relying on non-portable `__typeof__' and `offsetof'.
Argument order is also rearranged, so `keyvar' and `member' are
sequential as they are used as: `keyvar->member'
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we cannot rely on a `__typeof__' operator being portable
to use with `offsetof'; we can calculate the pointer offset
using an existing pointer and the address of a member using
pointer arithmetic for compilers without `__typeof__'.
This allows us to simplify usage of hashmap iterator macros
by not having to specify a type when a pointer of that type
is already given.
In the future, list iterator macros (e.g. list_for_each_entry)
may also be implemented using OFFSETOF_VAR to save hackers the
trouble of using container_of/list_entry macros and without
relying on non-portable `__typeof__'.
v3: use `__typeof__' to avoid clang warnings
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`hashmap_free_entries' behaves like `container_of' and passes
the offset of the hashmap_entry struct to the internal
`hashmap_free_' function, allowing the function to free any
struct pointer regardless of where the hashmap_entry field
is located.
`hashmap_free' no longer takes any arguments aside from
the hashmap itself.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
And add *_entry variants to perform container_of as necessary
to simplify most callers.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Inspired by list_for_each_entry in the Linux kernel.
Once again, these are somewhat compromised usability-wise
by compilers lacking __typeof__ support.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Another step in eliminating the requirement of hashmap_entry
being the first member of a struct.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update callers to use hashmap_get_entry, hashmap_get_entry_from_hash
or container_of as appropriate.
This is another step towards eliminating the requirement of
hashmap_entry being the first field in a struct.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using `container_of' can be verbose and choosing names for
intermediate "struct hashmap_entry" pointers is a hard problem.
So introduce "*_entry" APIs inspired by similar linked-list
APIs in the Linux kernel.
Unfortunately, `__typeof__' is not portable C, so we need an
extra parameter to specify the type.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a step towards removing the requirement for
hashmap_entry being the first field of a struct.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This macro is popular within the Linux kernel for supporting
intrusive data structures such as linked lists, red-black trees,
and chained hash tables while allowing the compiler to do
type checking.
Later patches will use container_of() to remove the limitation
of "hashmap_entry" being location-dependent. This will complete
the transition to compile-time type checking for the hashmap API.
This macro already exists in our source as "list_entry" in
list.h and making "list_entry" an alias to "container_of"
as the Linux kernel has done is a possibility.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is less error-prone than "void *" as the compiler now
detects invalid types being passed.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is less error-prone than "const void *" as the compiler
now detects invalid types being passed.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is less error-prone than "const void *" as the compiler
now detects invalid types being passed.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is less error-prone than "void *" as the compiler now
detects invalid types being passed.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is less error-prone than "const void *" as the compiler
now detects invalid types being passed.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
C compilers do type checking to make life easier for us. So
rely on that and update all hashmap_entry_init callers to take
"struct hashmap_entry *" to avoid future bugs while improving
safety and readability.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This hashmap_entry_init function is intended to take a
hashmap_entry struct pointer, not a hashmap struct pointer.
This was not noticed because hashmap_entry_init takes a "void *"
arg instead of "struct hashmap_entry *", and the hashmap struct
is larger and can be cast into a hashmap_entry struct without
data corruption.
This has the beneficial side effect of reducing the size of
a delta_base_cache_entry from 104 bytes to 72 bytes on 64-bit
systems.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Assigning hashmap_entry.hash manually leaves hashmap_entry.next
uninitialized, which can be dangerous once the hashmap_entry is
inserted into a hashmap. Detect those assignments and use
hashmap_entry_init, instead.
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Otherwise, the hashmap_entry.next field appears to remain
uninitialized, which can lead to problems when
add_lines_to_move_detection calls hashmap_add.
I found this through manual inspection when converting
hashmap_add callers to take "struct hashmap_entry *".
Signed-off-by: Eric Wong <e@80x24.org>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests print a file before searching for a pattern using
test_i18ngrep. This is useful when debugging tests with --verbose when
the pattern is not found as expected.
Since 63b1a175ee (t: make 'test_i18ngrep' more informative on failure,
2018-02-08) test_i18ngrep already shows the contents of a file that
doesn't match the expected pattern, though.
So don't bother doing the same unconditionally up-front. The contents
are not interesting if the expected pattern is found, and showing it
twice if it doesn't match is of no use.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid double-open exceptions on Windows platform when
calculating for lfs compressed size threshold
(git-p4.largeFileCompressedThreshold) comparisons.
Take new approach using the NamedTemporaryFile()
file-like object as input to the ZipFile() which
auto-deletes after implicit close leaving with scope.
Original code had double-open exception on Windows
platform because file still open from NamedTemporaryFile()
using generated filename instead of object.
Thanks to Andrey for patiently suggesting several
iterations on this change for avoiding exceptions!
Also print error details after resulting IOError to make
debugging cause of exception less mysterious when it has
nothing to do with "git version recent enough."
Signed-off-by: Philip.McGraw <Philip.McGraw@bentley.com>
Reviewed-by: Andrey Mazo <ahippo@yandex.com>
Acked-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make sure the grep machinery does not abort when seeing a payload
that is not UTF-8 even when JIT is not in use with PCRE1.
* cb/skip-utf8-check-with-pcre1:
grep: skip UTF8 checks explicitly
The markup used in user-manual has been updated to work better with
asciidoctor.
* ma/user-manual-markup-update:
user-manual.txt: render ASCII art correctly under Asciidoctor
asciidoctor-extensions.rb: handle "book" doctype in linkgit
user-manual.txt: change header notation
user-manual.txt: add missing section label
Start using DocBook 5 (instead of DocBook 4.5) as Asciidoctor 2.0
no longer works with the older one.
* bc/doc-use-docbook-5:
Documentation: fix build with Asciidoctor 2
Update support for Asciidoctor documentation toolchain.
* ma/asciidoctor-refmiscinfo:
doc-diff: replace --cut-header-footer with --cut-footer
asciidoctor-extensions: provide `<refmiscinfo/>`
Doc/Makefile: give mansource/-version/-manual attributes
The testsuite will eventually learn how to run using an algorithm other
than SHA-1. In preparation for this, teach the test_oid family of
functions how to look up the empty blob and empty tree values so they
can be used.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_oid function provides a mechanism for looking up hash algorithm
information, but it doesn't specify a way to discover the hash algorithm
name. Knowing this information is useful if one wants to invoke the
test-tool helper for the algorithm in use, such as in our pack
generation library.
While it's currently possible to inspect the global variable holding
this value, in the future we'll allow specifying an algorithm for
storage and an algorithm for display, so it's better to abstract this
value away. To assist with this, provide a named entry in the
algorithm-specific lookup table that prints the algorithm in use.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The synopsis section in git-rev-list.txt has grown to be a huge list
that probably needs its own synopsis. Since the list is huge, users may
be given the false impression that the list is complete, however it is
not. It is missing many of the available options.
Since the list of options in the synopsis is not only annoying but
actively harmful, replace it with `[<options>]` so users know to
explicitly look through the documentation for further information.
While we're at it, update the optional path notation so that it is more
modern.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Strip "UTF" and an optional dash from the start of 'upper' without
passing a NULL pointer to skip_prefix() in the second call, as it cannot
handle that.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
... because we can, now. Technically, we actually build using `MSBuild`,
which is however pretty close to building interactively in Visual
Studio.
As there is no convenient way to run Git's test suite in Visual Studio,
we unpack a Portable Git to run it, using the just-added test helper to
allow running test scripts in parallel.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was a left-over from the previous YAML schema, and it no longer
works. The problem was noticed while editing `azure-pipelines.yml` in VS
Code with the very helpful "Azure Pipelines" extension (syntax
highlighting and intellisense for `azure-pipelines.yml`...).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the `--immediate` option is in effect, any test failure will
immediately exit the test script. Together with `--write-junit-xml`, we
will want the JUnit-style `.xml` file to be finalized (and not leave the
XML incomplete). Let's make it so.
This comes in particularly handy when trying to debug via Azure
Pipelines, where the JUnit-style XML is consumed to present the test
results in an informative and helpful way.
While at it, also handle the `error()` code path.
The only remaining code path that sets `GIT_EXIT_OK` happens whenever
the trash directory could not be set up, i.e. long before the JUnit XML
was written, therefore we should _not_ try to finalize that XML in that
case.
It is tempting to change the `immediate` code path to just hand off to
`error`, simplifying the code in the process. That would, however,
result in a change of behavior (an additional error message) in the test
suite, which is outside of the purview of the current patch series: its
goal is to allow building Git with Visual Studio and testing it with a
portable version of Git for Windows.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows jumps through hoops to provide a development environment
that allows to build Git and to run its test suite. To that end, an
entire MSYS2 system, including GNU make and GCC is offered as "the Git
for Windows SDK". It does come at a price: an initial download of said
SDK weighs in with several hundreds of megabytes, and the unpacked SDK
occupies ~2GB of disk space.
A much more native development environment on Windows is Visual Studio.
To help contributors use that environment, we already have a Makefile
target `vcxproj` that generates a commit with project files (and other
generated files), and Git for Windows' `vs/master` branch is
continuously re-generated using that target.
The idea is to allow building Git in Visual Studio, and to run
individual tests using a Portable Git.
The one missing thing is a way to run the entire test suite: neither
`make` nor `prove` are required to run Git, therefore Git for Windows
does not support those commands in the Portable Git.
To help with that, add a simple test helper that exercises the
`run_processes_parallel()` function to allow for running test scripts in
parallel (which is really necessary, especially on Windows, as Git's
test suite takes such a long time to run).
This will also come in handy for the upcoming change to our Azure
Pipeline: we will use this helper in a Portable Git to test the Visual
Studio build of Git.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the CI builds, we bundle all generated files into a so-called
artifacts `.tar` file, so that the test phase can fan out into multiple
parallel builds.
This patch makes sure that all files are included in the `vcxproj`
target which are needed for that artifacts `.tar` file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In b18ae14a8f (vcxproj: also link-or-copy builtins, 2019-07-29), we
started to copy or hard-link the built-ins as a post-build step of the
`git` project.
At the same time, we tried to copy or hard-link `git-remote-http.exe`,
but it is quite possible that it was not built at that time.
Let's move that latter task into a post-install step of the
`git-remote-http` project instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return value of that function is 0 both for variables that are
unset, as well as for variables whose values are empty. To discern those
two cases, one has to call `GetLastError()`, whose return value is
`ERROR_ENVVAR_NOT_FOUND` and `ERROR_SUCCESS`, respectively.
Except that it is not actually set to `ERROR_SUCCESS` in the latter
case, apparently, but the last error value seems to be simply unchanged.
To work around this, let's just re-set the last error value just before
inspecting the environment variable.
This fixes a problem that triggers failures in t3301-notes.sh (where we
try to override config settings by passing empty values for certain
environment variables).
This problem is hidden in the MINGW build by the fact that older
MSVC runtimes (such as the one used by MINGW builds) have a `calloc()`
that re-sets the last error value in case of success, while newer
runtimes set the error value only if `NULL` is returned by that
function.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We frequently build Git using the `DEVELOPER=1` make setting as a
shortcut to enable all kinds of more stringent compiler warnings.
Those compiler warnings are relatively specific to GCC, though, so let's
try our best to translate them to the equivalent options to pass to MS
Visual C++'s `cl.exe`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To build with MSVC, we "translate" GCC options to MSVC options, and part
of those options refer to the libraries to link into the final
executable. Currently, this part looks somewhat like this on Windows:
-lcurl -lnghttp2 -lidn2 -lssl -lcrypto -lssl -lcrypto -lgdi32
-lcrypt32 -lwldap32 -lz -lws2_32 -lexpat
Some of those are direct dependencies (such as curl and ssl) and others
are indirect (nghttp2 and idn2, for example, are dependencies of curl,
but need to be linked in for reasons).
We already handle the direct dependencies, e.g. `-liconv` is already
handled as adding `libiconv.lib` to the list of libraries to link
against.
Let's just ignore the remaining `-l*` options so that MSVC does not have
to warn us that it ignored e.g. the `/lnghttp2` option. We do that by
extending the clause that already "eats" the `-R*` options to also eat
the `-l*` options.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds the common guards that allow headers to be #include'd multiple
times.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
MSVC would complain thusly:
C4200: nonstandard extension used: zero-sized array in struct/union
Let's just use the `FLEX_ARRAY` constant that we introduced for exactly
this type of scenario.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
MSVC complains about this with `-Wall`, which can be taken as a sign
that this is indeed a real bug. The symptom is:
C4146: unary minus operator applied to unsigned type, result
still unsigned
Let's avoid this warning in the minimal way, e.g. writing `-1 -
<unsigned value>` instead of `-<unsigned value> - 1`.
Note that the change in the `estimate_cache_size()` function is
needed because MSVC considers the "return type" of the `sizeof()`
operator to be `size_t`, i.e. unsigned, and therefore it cannot be
negated using the unary minus operator.
Even worse, that arithmetic is doing extra work, in vain. We want to
calculate the entry extra cache size as the difference between the
size of the `cache_entry` structure minus the size of the
`ondisk_cache_entry` structure, padded to the appropriate alignment
boundary.
To that end, we start by assigning that difference to the `per_entry`
variable, and then abuse the `len` parameter of the
`align_padding_size()` macro to take the negative size of the ondisk
entry size. Essentially, we try to avoid passing the already calculated
difference to that macro by passing the operands of that difference
instead, when the macro expects operands of an addition:
#define align_padding_size(size, len) \
((size + (len) + 8) & ~7) - (size + len)
Currently, we pass A and -B to that macro instead of passing A - B and
0, where A - B is already stored in the `per_entry` variable, ready to
be used.
This is neither necessary, nor intuitive. Let's fix this, and have code
that is both easier to read and that also does not trigger MSVC's
warning.
While at it, we take care of reporting overflows (which are unlikely,
but hey, defensive programming is good!).
We _also_ take pains of casting the unsigned value to signed: otherwise,
the signed operand (i.e. the `-1`) would be cast to unsigned before
doing the arithmetic.
Helped-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git wants to spawn a child Git process inside a worktree's
subdirectory while `GIT_DIR` is set, we need to take care of specifying
the work tree's top-level directory explicitly because it cannot be
discovered: the current directory is _not_ the top-level directory of
the work tree, and neither is it inside the parent directory of
`GIT_DIR`.
This fixes the problem where `git stash apply` would report pretty much
everything deleted or untracked when run inside a worktree's
subdirectory.
To make sure that we do not introduce the "reverse problem", i.e. when
`GIT_WORK_TREE` is defined but `GIT_DIR` is not, we simply make sure
that both are set.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
So far, `--jobs=<n>` only parallelizes submodule fetches/clones, not
`--multiple` fetches, which is unintuitive, given that the option's name
does not say anything about submodules in particular.
Let's change that. With this patch, also fetches from multiple remotes
are parallelized.
For backwards-compatibility (and to prepare for a use case where
submodule and multiple-remote fetches may need different parallelization
limits), the config setting `submodule.fetchJobs` still only controls
the submodule part of `git fetch`, while the newly-introduced setting
`fetch.parallel` controls both (but can be overridden for submodules
with `submodule.fetchJobs`).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new "discard" event type for trace2 event destinations. When the
trace2 file count check creates a sentinel file, it will include the
normal trace2 output in the sentinel, along with this new discard
event.
Writing this message into the sentinel file is useful for tracking how
often the file count check triggers in practice.
Bump up the event format version since we've added a new event type.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
trace2 can write files into a target directory. With heavy usage, this
directory can fill up with files, causing difficulty for
trace-processing systems.
This patch adds a config option (trace2.maxFiles) to set a maximum
number of files that trace2 will write to a target directory. The
following behavior is enabled when the maxFiles is set to a positive
integer:
When trace2 would write a file to a target directory, first check
whether or not the traces should be discarded. Traces should be
discarded if:
* there is a sentinel file declaring that there are too many files
* OR, the number of files exceeds trace2.maxFiles.
In the latter case, we create a sentinel file named git-trace2-discard
to speed up future checks.
The assumption is that a separate trace-processing system is dealing
with the generated traces; once it processes and removes the sentinel
file, it should be safe to generate new trace files again.
The default value for trace2.maxFiles is zero, which disables the file
count check.
The config can also be overridden with a new environment variable:
GIT_TRACE2_MAX_FILES.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The graph coloring logic for octopus merges currently has a bug. This
can be seen git.git with 74c7cfa875 (Merge of
http://members.cox.net/junkio/git-jc.git, 2005-05-05), whose second
child is 211232bae6 (Octopus merge of the following five patches.,
2005-05-05).
If one runs
git log --graph 74c7cfa875
one can see that the octopus merge is colored incorrectly. In
particular, the horizontal dashes are off by one color. Each horizontal
dash should be the color of the line to their bottom-right. Instead, they
are currently the color of the line to their bottom.
Demonstrate this breakage with a few sets of test cases. These test
cases should show not only simple cases of the bug occuring but trickier
situations that may not be handled properly in any attempt to fix the
bug.
While we're at it, include a passing test case as a canary in case an
attempt to fix the bug breaks existing operation.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future test case, we will be extending the commit graph. As a
result, explicitly list the tags that will generate the graph so that
when future additions are made, the current graph illustrations won't be
affected.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, the expect files of the test case were being generated in the
setup method. However, it would make more sense to generate these files
within the test cases that actually use them so that it's obvious to
future readers where the expected values are coming from.
Move the generation of the expect files in their own respective test
cases.
While we're at it, we want to establish a pattern in this test suite
that, firstly, a non-colored test case is given then, immediately after,
the colored version is given.
Switch test cases "log --graph with tricky octopus merge, no color" and
"log --graph with tricky octopus merge with colors" so that the "no
color" version appears first.
This patch is best viewed with `--color-moved`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, we extended test_merge() so that it could
perform octopus merges. Now that the restriction is lifted, use
test_merge() to perform the octopus merge instead of manually
duplicating test_merge() functionality.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently test_merge() only allows developers to merge in one branch.
However, this restriction is artificial and there is no reason why it
needs to be this way.
Extend test_merge() to allow the specification of multiple branches so
that octopus merges can be performed.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it explicit that we always want trace2 "version" events to be the
first event of any trace session. Also list the changes that would or
would not cause the EVENT format version field to be incremented.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the description of trace2's target-directory behavior into the
shared trace2-target-values file so that it is included in both the
git-config and api-trace2 docs. Leave the SID discussion only in
api-trace2 since it's a technical detail.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Found with "git grep '^#include ' '*.c' | sort | uniq -d".
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is marked as `NORETURN`, and it indeed does not want to
return anything. So let's not declare it with the return type `int`.
This fixes the following warning when building with MSVC:
C4646: function declared with 'noreturn' has non-void return type
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Multiple changes here:
* add a test for a tag of a blob
* add a test for a tag of a tag of a commit
* add a comment to the tests for (possibly nested) tags of trees,
making it clear that these tests are doing much less than you might
expect
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new option, --mark-tags, which will output mark identifiers with
each tag object. This improves the incremental export story with
--export-marks since it will allow us to record that annotated tags have
been exported, and it is also needed as a step towards supporting nested
tags.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-import has support for both an --import-marks flag and an
--import-marks-if-exists flag; the latter of which will not die() if the
file does not exist. fast-export only had support for an --import-marks
flag; add an --import-marks-if-exists flag for consistency.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-export and fast-import have nice --import-marks flags which allow
for incremental migrations. However, if there is a mark in
fast-export's file of marks without a corresponding mark in the one for
fast-import, then we run the risk that fast-export tries to send new
objects relative to the mark it knows which fast-import does not,
causing fast-import to fail.
This arises in practice when there is a filter of some sort running
between the fast-export and fast-import processes which prunes some
commits programmatically. Provide such a filter with the ability to
alias pruned commits to their most recent non-pruned ancestor.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark identifiers are used in fast-export and fast-import to provide a
label to refer to earlier content. Blobs are given labels because they
need to be referenced in the commits where they first appear with a
given filename, and commits are given labels because they can be the
parents of other commits. Tags were never given labels, probably
because they were viewed as unnecessary, but that presents two problems:
1. It leaves us without a way of referring to previous tags if we
want to create a tag of a tag (or higher nestings).
2. It leaves us with no way of recording that a tag has already been
imported when using --export-marks and --import-marks.
Fix these problems by allowing an optional mark label for tags.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If our input stream includes a tag which is later deleted, we were not
properly deleting it. We did have a step which would delete it, but we
left a tag in the tag list noting that it needed to be updated, and the
updating of annotated tags occurred AFTER ref deletion. So, when we
record that a tag needs to be deleted, also remove it from the list of
annotated tags to update.
While this has likely been something that has not happened in practice,
it will come up more in order to support nested tags. For nested tags,
we either need to give temporary names to the intermediate tags and then
delete them, or else we need to use the final name for the intermediate
tags. If we use the final name for the intermediate tags, then in order
to keep the sanity check that someone doesn't try to update the same tag
twice, we need to delete the ref after creating the intermediate tag.
So, either way nested tags imply the need to delete temporary inner tag
references.
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Report the current hunk count and total number of hunks for the
current file in the prompt. Also adjust the expected output in
some tests to match.
Signed-off-by: Kunal Tyagi <tyagi.kunal@live.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-gui now highlights diff3 style conflicts properly. As an auxiliary
change, querying a path's attribute is done via the existing interface
instead of hand-rolling the code.
* bw/diff3-conflict-style:
git-gui: support for diff3 conflict style
git-gui: use existing interface to query a path's attribute
This adds highlight support for the diff3 conflict style.
The common pre-image will be reversed to --, because it has been removed
and replaced with ours or theirs side respectively.
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
When parsing the diffs, `range-diff` expects to see the prefixes `a/`
and `b/` in the diff headers.
These prefixes can be forced off via the config setting
`diff.noprefix=true`. As `range-diff` is not prepared for that
situation, this will cause a segmentation fault.
Let's avoid that by passing the `--no-prefix` option to the `git log`
process that generates the diffs that `range-diff` wants to parse.
And of course expect the output to have no prefixes, then.
Reported-by: Michal Suchánek <msuchanek@suse.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add trace2 regions in transport.c and builtin/push.c to better track
time spent in various phases of pushing:
* Listing refs
* Checking submodules
* Pushing submodules
* Pushing refs
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add trace2 regions to fetch-pack.c and builtins/fetch.c to better track
time spent in the various phases of a fetch:
* listing refs
* negotiation for protocol versions v0-v2
* fetching refs
* consuming refs
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The magic number "6" appears several times in the function, and is
related to the size of the "XXXXXX" string we expect to find in the
template. Let's pull that "XXXXXX" into a constant array, whose size we
can get at compile time with ARRAY_SIZE().
Note that we probably can't just change this value, since callers will
be feeding us a certain number of X's, but it hopefully makes the
function itself easier to follow.
While we're here, let's do the same with the "letters" array (which we
_could_ modify if we wanted to include more characters).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the hand-coded call to git check-attr with the already provided one.
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
git-gui will now use git-bash instead of bash on Windows, if it is
available.
* js/git-bash-if-available:
git-gui (Windows): use git-bash.exe if it is available
Emit trace2_cmd_mode() messages for each commit-graph
sub-command.
The commit graph commands were in flux when trace2 was
making it's way to git. Now that we have enough sub-commands
in commit-graph, we can label the various modes within them.
Distinguishing between read, write and verify is a great
start.
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test was originally designed for the case where user reported
that setting GIT_SSH to a .bat file with spaces in path fails on
Windows: https://github.com/git-for-windows/git/issues/692
The test has two different problems:
1. It succeeds with AND without fix eb7c7863 that addressed user's
problem. This happens because the core problem was misunderstood,
leading to conclusion that git is unable to start any programs with
spaces in path on Win7. But in fact
a) Bug only affected cmd.exe scripts, such as .bat scripts
b) Bug only happened when cmd.exe received at least two quoted args
c) Bug happened on any Windows (verified on Win10).
Therefore, correct test must involve .bat script and two quoted args.
2. In Visual Studio build, it fails to run, because 'test-fake-ssh.exe'
is copied away from its dependencies 'libiconv.dll' and 'zlib1.dll'.
Fix both problems by using .bat script instead of 'test-fake-ssh.exe'.
NOTE: With this change, the test now correctly fails without eb7c7863.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A check into the history of this code revealed no particular reason for
the code to be written in this way. All popular compilers are capable of
unrolling loops if it benefits performance, and once this code is
replaced with a loop, the magic number 6 used in multiple places in this
function can be replaced with a named constant.
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Reviewed-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The condition "if (q->nr <= j)" checks whether the loop exited normally
or via a break statement. Avoid this check by replacing the jump out of
the inner loop with a jump to the end of the outer loop, which makes it
obvious that diff_q is not executed when the peer survives.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leave the variable 'g' uninitialized before it is set just before its
first use in front of a loop, which is a much more appropriate place to
indicate what it is used for.
Also initialize the variable 'num_commits' just before the loop instead
of at the beginning of the function for the same reason.
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix possible segfault when cloning a submodule shallow.
Signed-off-by: Ali Utku Selen <auselen@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 743474cbfa ("merge-recursive: provide a better label for
diff3 common ancestor", 2019-08-17), the label for the common ancestor
was changed from always being
"merged common ancestors"
to instead be based on the number of merge bases:
>=2: "merged common ancestors"
1: <abbreviated commit hash>
0: "<empty tree>"
Unfortunately, this did not take into account that when we have a single
merge base, that merge base could be fake or constructed. In such
cases, this resulted in a label of "00000000". Of course, the previous
label of "merged common ancestors" was also misleading for this case.
Since we have an API that is explicitly about creating fake merge base
commits in merge_recursive_generic(), we should provide a better label
when using that API with one merge base. So, when
merge_recursive_generic() is called with one merge base, set the label
to:
"constructed merge base"
Note that callers of merge_recursive_generic() include the builtin
commands git-am (in combination with git apply --build-fake-ancestor),
git-merge-recursive, and git-stash.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, when promisor_remote_move_to_tail() is called for a
promisor_remote which is currently the final element in promisors, a
cycle is created in the promisors linked list. This cycle leads to a
double free later on in promisor_remote_clear() when the final element
of the promisors list is removed: promisors is set to promisors->next (a
no-op, as promisors->next == promisors); the previous value of promisors
is free()'d; then the new value of promisors (which is equal to the
previous value of promisors) is also free()'d. This double-free error
was unrecoverable for the user without removing the filter or re-cloning
the repo and hoping to miss this edge case.
Now, when promisor_remote_move_to_tail() would be a no-op, just do a
no-op. In cases of promisor_remote_move_to_tail() where r is not already
at the tail of the list, it works as before.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Acked-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During Git's rename detection, the file names are sorted. At the moment,
this job is performed by `qsort()`. As that function is not guaranteed
to implement a stable sort algorithm, this can lead to inconsistent
and/or surprising behavior: a rename might be detected differently
depending on the platform where Git was run.
The `qsort()` in MS Visual C's runtime does _not_ implement a stable
sort algorithm, and it even leads to an inconsistency leading to a test
failure in t3030.35 "merge-recursive remembers the names of all base
trees": a different code path than on Linux is taken in the rename
detection of an ambiguous rename between either `e` to `a` or
`a~Temporary merge branch 2_0` to `a` during a recursive merge,
unexpectedly resulting in a clean merge.
Let's use the stable sort provided by `git_stable_qsort()` to avoid this
inconsistency.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `qsort()` function is not guaranteed to be stable, i.e. it does not
promise to maintain the order of items it is told to consider equal. In
contrast, the `git_sort()` function we carry in `compat/qsort.c` _is_
stable, by virtue of implementing a merge sort algorithm.
In preparation for using a stable sort in Git's rename detection, move
the stable sort into `libgit.a` so that it is compiled in
unconditionally, and rename it to `git_stable_qsort()`.
Note: this also makes the hack obsolete that was introduced in
fe21c6b285 (mingw: reencode environment variables on the fly (UTF-16
<-> UTF-8), 2018-10-30), where we included `compat/qsort.c` directly in
`compat/mingw.c` to use the stable sort.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commits 404ebceda0 ("dir: also check directories for matching
pathspecs", 2019-09-17) and 89a1f4aaf7 ("dir: if our pathspec might
match files under a dir, recurse into it", 2019-09-17) added calls to
match_pathspec() and do_match_pathspec() passing along their pathspec
parameter. Both match_pathspec() and do_match_pathspec() assume the
pathspec argument they are given is non-NULL. It turns out that
unpack-tree.c's verify_clean_subdirectory() calls read_directory() with
pathspec == NULL, and it is possible on case insensitive filesystems for
that NULL to make it to these new calls to match_pathspec() and
do_match_pathspec(). Add appropriate checks on the NULLness of pathspec
to avoid a segfault.
In case the negation throws anyone off (one of the calls was to
do_match_pathspec() while the other was to !match_pathspec(), yet no
negation of the NULLness of pathspec is used), there are two ways to
understand the differences:
* The code already handled the pathspec == NULL cases before this
series, and this series only tried to change behavior when there was
a pathspec, thus we only want to go into the if-block if pathspec is
non-NULL.
* One of the calls is for whether to recurse into a subdirectory, the
other is for after we've recursed into it for whether we want to
remove the subdirectory itself (i.e. the subdirectory didn't match
but something under it could have). That difference in situation
leads to the slight differences in logic used (well, that and the
slightly unusual fact that we don't want empty pathspecs to remove
untracked directories by default).
Denton found and analyzed one issue and provided the patch for the
match_pathspec() call, SZEDER figured out why the issue only reproduced
for some folks and not others and provided the testcase, and I looked
through the remainder of the series and noted the do_match_pathspec()
call that should have the same check.
Co-authored-by: Denton Liu <liu.denton@gmail.com>
Co-authored-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows 2.x ships with an executable that starts the Git Bash
with all the environment variables and what not properly set up. It is
also adjusted according to the Terminal emulator option chosen when
installing Git for Windows (while `bash.exe --login -i` would always
launch with Windows' default console).
So let's use that executable (usually C:\Program Files\Git\git-bash.exe)
instead of `bash.exe --login -i` if its presence was detected.
This fixes https://github.com/git-for-windows/git/issues/490
Signed-off-by: Thomas Kläger <thomas.klaeger@10a.ch>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
A configuration variable tells "git fetch" to write the commit
graph after finishing.
* ds/commit-graph-on-fetch:
fetch: add fetch.writeCommitGraph config setting
"git rebase --autostash <upstream> <branch>", when <branch> is
different from the current branch, incorrectly moved the tip of the
current branch, which has been corrected.
* bw/rebase-autostash-keep-current-branch:
builtin/rebase.c: Remove pointless message
builtin/rebase.c: make sure the active branch isn't moved when autostashing
The internal code originally invented for ".gitignore" processing
got reshuffled and renamed to make it less tied to "excluding" and
stress more that it is about "matching", as it has been reused for
things like sparse checkout specification that want to check if a
path is "included".
* ds/include-exclude:
unpack-trees: rename 'is_excluded_from_list()'
treewide: rename 'exclude' methods to 'pattern'
treewide: rename 'EXCL_FLAG_' to 'PATTERN_FLAG_'
treewide: rename 'struct exclude_list' to 'struct pattern_list'
treewide: rename 'struct exclude' to 'struct path_pattern'
Output from trace2 subsystem is formatted more prettily now.
* jh/trace2-pretty-output:
trace2: cleanup whitespace in perf format
trace2: cleanup whitespace in normal format
quote: add sq_append_quote_argv_pretty()
trace2: trim trailing whitespace in normal format error message
trace2: remove dead code in maybe_add_string_va()
trace2: trim whitespace in region messages in perf target format
trace2: cleanup column alignment in perf target format
"git rebase --keep-base <upstream>" tries to find the original base
of the topic being rebased and rebase on top of that same base,
which is useful when running the "git rebase -i" (and its limited
variant "git rebase -x").
The command also has learned to fast-forward in more cases where it
can instead of replaying to recreate identical commits.
* dl/rebase-i-keep-base:
rebase: teach rebase --keep-base
rebase tests: test linear branch topology
rebase: fast-forward --fork-point in more cases
rebase: fast-forward --onto in more cases
rebase: refactor can_fast_forward into goto tower
t3432: test for --no-ff's interaction with fast-forward
t3432: distinguish "noop-same" v.s. "work-same" in "same head" tests
t3432: test rebase fast-forward behavior
t3431: add rebase --fork-point tests
The command line completion support (in contrib/) learned about the
"--skip" option of "git revert" and "git cherry-pick".
* dl/complete-cherry-pick-revert-skip:
status: mention --skip for revert and cherry-pick
completion: add --skip for cherry-pick and revert
completion: merge options for cherry-pick and revert
Various fixes to codepaths gcc 9 had trouble following dataflow.
* jk/misc-uninitialized-fixes:
pack-objects: drop packlist index_pos optimization
test-read-cache: drop namelen variable
diff-delta: set size out-parameter to 0 for NULL delta
bulk-checkin: zero-initialize hashfile_checkpoint
pack-objects: use object_id in packlist_alloc()
git-am: handle missing "author" when parsing commit
Fix an earlier regression in the test suite, which mistakenly
stopped running HTTPD tests.
* sg/git-test-boolean:
ci: restore running httpd tests
t/lib-git-svn.sh: check GIT_TEST_SVN_HTTPD when running SVN HTTP tests
Start discouraging the use of "git filter-branch".
* en/filter-branch-deprecation:
t9902: use a non-deprecated command for testing
Recommend git-filter-repo instead of git-filter-branch
t6006: simplify, fix, and optimize empty message test
Fix an earlier regression to "git push --all" which should have
been forbidden when the target remote repository is set to be a
mirror.
* tg/push-all-in-mirror-forbidden:
push: disallow --all and refspecs when remote.<name>.mirror is set
On Windows, the root level of UNC share is now allowed to be used
just like any other directory.
* js/gitdir-at-unc-root:
setup_git_directory(): handle UNC root paths correctly
Fix .git/ discovery at the root of UNC shares
setup_git_directory(): handle UNC paths correctly
The memory ownership model of the "git fast-import" got
straightened out.
* jk/fast-import-history-bugfix:
fast-import: duplicate into history rather than passing ownership
fast-import: duplicate parsed encoding string
A few implementation fixes in the notes API.
* mh/notes-duplicate-entries:
notes: avoid potential use-after-free during insertion
notes: avoid leaking duplicate entries
Straighten out the use of strbuf_detach() API function.
* rs/strbuf-detach:
grep: use return value of strbuf_detach()
log-tree: always use return value of strbuf_detach()
The documentation and tests for "git format-patch" have been
cleaned up.
* dl/format-patch-doc-test-cleanup:
config/format.txt: specify default value of format.coverLetter
Doc: add more detail for git-format-patch
t4014: stop losing return codes of git commands
t4014: remove confusing pipe in check_threading()
t4014: use test_line_count() where possible
t4014: let sed open its own files
t4014: drop redirections to /dev/null
t4014: use indentable here-docs
t4014: remove spaces after redirect operators
t4014: use sq for test case names
t4014: move closing sq onto its own line
t4014: s/expected/expect/
t4014: drop unnecessary blank lines from test cases
fast-export allows specifying revision ranges, which can be used to
export a tag without exporting the commit it tags. fast-export handled
this rather poorly: it would emit a "from :0" directive. Since marks
start at 1 and increase, this means it refers to an unknown commit and
fast-import will choke on the input.
When we are unable to look up a mark for the object being tagged, use a
"from $HASH" directive instead to fix this problem.
Note that this is quite similar to the behavior fast-export exhibits
with commits and parents when --reference-excluded-parents is passed
along with an excluded commit range. For tags of excluded commits we do
not require the --reference-excluded-parents flag because we always have
to tag something. By contrast, when dealing with commits, pruning a
parent is always a viable option, so we need the flag to specify that
parent pruning is not wanted. (It is slightly weird that
--reference-excluded-parents isn't the default with a separate
--prune-excluded-parents flag, but backward compatibility concerns
resulted in the current defaults.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is similar to 379805051d ("Documentation: render revisions
correctly under Asciidoctor", 2018-05-06) and is a no-op with AsciiDoc.
When creating a literal block from an indented block without any sort of
delimiters, Asciidoctor strips off all leading whitespace, resulting in
a misrendered ASCII drawing. Use an explicit literal block to indicate
to Asciidoctor that we want to keep the leading whitespace. Drop the
common indentation for all lines to make this a no-op with AsciiDoc.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
user-manual.txt is the only file we process using the "book" doctype.
When we use AsciiDoc, user-manual.conf ensures that the linkgit macro
expands into something like
<ulink url="git-foo.html">git-foo(1)</ulink>
in user-manual.xml, which we then process into a clickable link, both in
user-manual.html and user-manual.pdf. With Asciidoctor,
user-manual.conf is ignored (this is expected) and our
Asciidoctor-specific implementation of linkgit kicks in:
<citerefentry>
<refentrytitle>git-foo</refentrytitle><manvolnum>1</manvolnum>
</citerefentry>
This eventually renders as "git-foo(1)", which is not bad, but it
doesn't turn into a link.
Teach our Asciidoctor-specific implementation of the linkgit macro that
if the doctype is "book", we should emit <ulink .../> just like we do
with AsciiDoc. While we're doing this, future-proof by supporting a
"prefix". The implementation in user-manual.conf doesn't support this,
and we don't need this for now because user-manual.txt is the only file
we process this way (and it's immediately in Documentation/).
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When AsciiDoc processes user-manual.txt, it generates a book containing
chapters containing sections. So for example, we have chapter 6,
"Advanced branch management", which contains four relatively short
sections, 6.1-6.4. Asciidoctor generates a book containing *parts*
containing *chapters* instead. So part 6, "Advanced branch management"
contains four short chapters, 1-4. This looks a bit odd.
To make AsciiDoc (8.6.10) and Asciidoctor (1.5.5) handle these the same,
change from indicating chapters like so:
[[foobar]]
Foobar
======
to doing it like so:
[[foobar]]
== Foobar
Same thing for sections (line of dashes to ===), subsections (line of
tildes to ====) and subsubsections (line of carets to =====). Mark the
appendices with "[appendix]", which both AsciiDoc and Asciidoctor
understand. This means we need to drop the "Appendix X: " from their
titles, or those "Appendix X: " would be included literally in the name
of the appendix.
This commit is a no-op for AsciiDoc: The generated user-manual.xml is
identical before and after this patch. Asciidoctor now creates the same
chapter-section-subsection structure as AsciiDoc.
Changing the book title at the start of the document to similarly use
"=" instead of a line of equal signs makes no difference with any of the
engines, but let's do that change anyway for consistency.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We provide a label for each chapter and section except for the "Pitfalls
with submodules" section. Since we're doing it everywhere else, let's do
it here, too.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, when testing headers using `make hdr-check`, headers are
directly compiled. Although this seems to test the headers, this is too
strict since we treat the headers as C sources. As a result, this will
cause warnings to appear that would otherwise not, such as a static
variable definition intended for later use throwing a unused variable
warning.
In addition, on platforms that can run `make hdr-check` but require
custom flags, this target was failing because none of them were being
passed to the compiler. For example, on MacOS, the NO_OPENSSL flag was
being set but it was not being passed into compiler so the check was
failing.
Fix these problems by emulating the compile process better, including
test compiling dummy *.hcc C sources generated from the *.h files and
passing $(ALL_CFLAGS) into the compiler for the $(HCO) target so that
these custom flags can be used.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we ran `make hdr-check` with the following patch
diff --git a/Makefile b/Makefile
index f879697ea3..d8df4e316b 100644
--- a/Makefile
+++ b/Makefile
@@ -2773,7 +2773,7 @@ CHK_HDRS = $(filter-out $(EXCEPT_HDRS),$(patsubst ./%,%,$(LIB_H)))
HCO = $(patsubst %.h,%.hco,$(CHK_HDRS))
$(HCO): %.hco: %.h FORCE
- $(QUIET_HDR)$(CC) -include git-compat-util.h -I. -o /dev/null -c -xc $<
+ $(QUIET_HDR)$(CC) -include git-compat-util.h -I. -o /dev/null -c -xc $(ALL_CFLAGS) $<
.PHONY: hdr-check $(HCO)
hdr-check: $(HCO)
and with `DEVELOPER=1`, we got the following warning on Arch Linux:
pack-bitmap.h:20:19: error: ‘BITMAP_IDX_SIGNATURE’ defined but not used [-Werror=unused-const-variable=]
20 | static const char BITMAP_IDX_SIGNATURE[] = {'B', 'I', 'T', 'M'};
| ^~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
"Use" the BITMAP_IDX_SIGNATURE variable by making the size of
bitmap_disk_header.magic equal to the size of BITMAP_IDX_SIGNATURE,
thereby eliminating the magic number (4).
An alternative was to simply add MAYBE_UNUSED, however that does not
eliminate the magic number.
Another alternative was to change the definition to
extern const char BITMAP_IDX_SIGNATURE[4];
However, this design was also not chosen as the static definition allows
us to keep the declaration together for readability along with removing
the magic number.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we ran `make hdr-check`, we got the following warning message:
promisor-remote.h:21:46: warning: declaration of 'struct repository' will not be visible outside of this function [-Wvisibility]
extern int promisor_remote_get_direct(struct repository *repo,
^
1 warning generated.
This was caused by a missing reference to `struct repository`, which is
defined in "repository.h".
Include this missing header to fix this warning.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running `make hdr-check`, we got the following error messages:
apply.h:146:22: error: use of undeclared identifier 'GIT_MAX_HEXSZ'
char old_oid_prefix[GIT_MAX_HEXSZ + 1];
^
apply.h:147:22: error: use of undeclared identifier 'GIT_MAX_HEXSZ'
char new_oid_prefix[GIT_MAX_HEXSZ + 1];
^
apply.h:151:33: error: array has incomplete element type 'struct object_id'
struct object_id threeway_stage[3];
^
./strbuf.h:79:8: note: forward declaration of 'struct object_id'
struct object_id;
^
3 errors generated.
make: *** [apply.hco] Error 1
Include the missing "hash.h" header to fix these errors.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases, the svn author names might contain leading or trailing
whitespaces, leading to messages such as:
Author: user1
not defined in authors.txt
(the trailing newline leads to the line break). The user "user1" is
defined in authors.txt though, e.g.
user1 = User <user1@example.com>
Fix this by trimming the author name retreived from svn before using it
in check_author.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After I discovered that UTF-16-LE-BOM test was buggy, I decided that
better tests are required. Possibly the best option here is to compare
git results against hardcoded ground truth.
The new tests also cover more interesting chars where (ANSI != UTF-8).
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to its name, the test is designed for UTF-16-LE-BOM.
However, possibly due to copy&paste oversight, it was using UTF-32.
While the test succeeds (extra \000\000 are interpreted as NUL),
I myself had an unrelated problem which caused the test to fail.
When analyzing the failure I was quite puzzled by the fact that the
test is obviously buggy. And it seems that I'm not alone:
https://public-inbox.org/git/CAH8yC8kSakS807d4jc_BtcUJOrcVT4No37AXSz=jePxhw-o9Dg@mail.gmail.com/T/#u
Fix the test to follow its original intention.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though Debug configuration builds, the resulting build is incorrect
in a subtle way: it mixes up Debug and Release binaries, which in turn
causes hard-to-predict bugs.
In my case, when git calls iconv library, iconv sets 'errno' and git
then tests it, but in Debug and Release CRT those 'errno' are different
memory locations.
This patch addresses 3 connected bugs:
1) Typo in '\(Configuration)'. As a result, Debug configuration
condition is always false and Release path is taken instead.
2) Regexp that replaced 'zlib.lib' with 'zlibd.lib' was only affecting
the first occurrence. However, some projects have it listed twice.
Previously this bug was hidden, because Debug path was never taken.
I decided that avoiding double -lz in makefile is fragile and I'd
better replace all occurrences instead.
3) In Debug, 'libcurl-d.lib' should be used instead of 'libcurl.lib'.
Previously this bug was hidden, because Debug path was never taken.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git name-rev' is invoked with commit-ish parameters, it tries to
save some work, and doesn't visit commits older than the committer
date of the oldest given commit minus a one day worth of slop. Since
our 'timestamp_t' is an unsigned type, this leads to a timestamp
underflow when the committer date of the oldest given commit is within
a day of the UNIX epoch. As a result the cutoff timestamp ends up
far-far in the future, and 'git name-rev' doesn't visit any commits,
and names each given commit as 'undefined'.
Check whether subtracting the slop from the oldest committer date
would lead to an underflow, and use no cutoff in that case. We don't
have a TIME_MIN constant, dddbad728c (timestamp_t: a new data type for
timestamps, 2017-04-26) didn't add one, so do it now.
Note that the type of the cutoff timestamp variable used to be signed
before 5589e87fd8 (name-rev: change a "long" variable to timestamp_t,
2017-05-20). The behavior was still the same even back then, but the
underflow didn't happen when substracting the slop from the oldest
committer date, but when comparing the signed cutoff timestamp with
unsigned committer dates in name_rev(). IOW, this underflow bug is as
old as 'git name-rev' itself.
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During the creation of this file, each time a new function declaration
was introduced, it included an `extern`. However, starting from
554544276a (*.[ch]: remove extern from function declarations using
spatch, 2019-04-29), we've been actively trying to prevent externs from
being used in function declarations because they're unnecessary.
Remove these spurious `extern`s.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the pull command instead of push is more accurate when giving
instructions on placing the psuh command in alphabetical order.
Signed-off-by: Pedro Sousa <pedroteosousa@gmail.com>
Acked-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Travis CI offers shell access to its virtual machine environment
running the build jobs, called "debug mode" [1]. After restarting a
build job in debug mode and logging in, the first thing I usually do
is to install dependencies, i.e. run './ci/install-dependencies.sh'.
This works just fine when I restarted a failed build job in debug
mode. However, after restarting a successful build job in debug mode
our CI scripts get all clever, and exit without doing anything useful,
claiming that "This commit's tree has already been built and tested
successfully" [2]. Our CI scripts are right, and we do want to skip
building and testing already known good trees in "regular" CI builds.
In debug mode, however, this is a nuisiance, because one has to delete
the cache (or at least the 'good-trees' file in the cache) to proceed.
Let's update our CI scripts, in particular the common 'ci/lib.sh', to
not skip previously successfully built and tested trees in debug mode,
so all those scripts will do what there were supposed to do even when
a successful build job was restarted in debug mode.
[1] https://docs.travis-ci.com/user/running-build-in-debug-mode/
[2] 9cc2c76f5e (travis-ci: record and skip successfully built trees,
2017-12-31)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some changes were added directly to git.git's git-gui subtree, instead
of being added to the separate git-gui repo and then being pulled back
into git.git. This means those changes are now not in the git-gui tree.
So pull them back into git-gui so we match with what git.git has (after
they pull in the recently added commits).
Most of the changes could be added back in via an octopus merge of all
the missing branches. But there were two commits (faf420e05a and
b71c6c3b64) which touched other parts of git.git along with git-gui, so
they had to be added back in via a cherry-pick because directly pulling
them in would also pull in all the ancestors of those commits,
needlessly bloating git-gui with git.git's history.
Thanks to Denton Liu <liu.denton@gmail.com> for providing me with a
script to generate and merge all the branches that were missing, which
made this task much easier.
* py/git-git-extra-stuff:
treewide: correct several "up-to-date" to "up to date"
Fix build with core.autocrlf=true
git-gui: call do_quit before destroying the main window
git-gui: workaround ttk:style theme use
git-gui: bind CTRL/CMD+numpad ENTER to do_commit
git-gui: search for all current SSH key types
git-gui: allow Ctrl+T to toggle multiple paths
git-gui: fix exception when trying to stage with empty file list
git-gui: avoid exception upon Ctrl+T in an empty list
git gui: fix staging a second line to a 1-line file
git-gui: prevent double UTF-8 conversion
git-gui: sort entries in optimized tclIndex
Replace Free Software Foundation address in license notices
git-gui (MinGW): make use of MSys2's msgfmt
Follow the Oxford style, which says to use "up-to-date" before the noun,
but "up to date" after it. Don't change plumbing (specifically
send-pack.c, but transport.c (git push) also has the same string).
This was produced by grepping for "up-to-date" and "up to date". It
turned out we only had to edit in one direction, removing the hyphens.
Fix a typo in Documentation/git-diff-index.txt while we're there.
Reported-by: Jeffrey Manian <jeffrey.manian@gmail.com>
Reported-by: STEVEN WHITE <stevencharleswhitevoices@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
On Windows, the default line endings are denoted by a Carriage Return
byte followed by a Line Feed byte, while Linux and MacOSX use a single
Line Feed byte to denote a line ending.
To help with this situation, Git introduced several mechanisms over the
last decade, most prominently the `core.autocrlf` setting.
Sometimes, however, a single setting is incorrect, e.g. when certain
files in the source code are to be consumed by software that can handle
only LF line endings, while other files can use whatever is appropriate
for the current platform.
To allow for that, Git added the `eol` option to its .gitattributes
handling, expecting every user of Git to mark their source code
appropriately.
Bash assumes that line-endings of scripts are denoted by a single Line
Feed byte. Therefore, shell scripts in Git's source code are one example
where that `eol=lf` option is *required*.
When generating common-cmds.h, the Unix tools we use generally operate on
the assumption that input and output deliminate their lines using LF-only
line endings. Consequently, they would happily copy the CR byte verbatim
into the strings in common-cmds.h, which in turn makes the C preprocessor
barf (that interprets them as MacOS-style line endings). Therefore, we
have to mark the input files as LF-only: command-list.txt and
Documentation/git-*.txt.
Quite a bit belatedly, this patch brings Git's own source code in line
with those expectations by setting those attributes to allow for a
correct build even when core.autocrlf=true.
This patch can be validated even on Linux, by using this cadence:
git config core.autocrlf true
rm .git/index && git stash
make -j15 DEVELOPER=1
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
I don't have access to my old work email since 20 Apr 2019.
Replace with my personal email address.
Signed-off-by: Andrey Mazo <ahippo@yandex.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch conceptually reverts 44103f4197 (t/helper: ignore
everything but sources, 2017-12-12). Back in those days we did have a
lot of separate test helper executables under 't/helper', and its
'.gitignore' did get out of sync every once in a while.
Since then, however, most of those separate executables were
integrated into a single 'test-tool' command [1], and new test helpers
are added as new subcommands, so the chances of that '.gitignore'
getting out of sync again are much lower. And even if a contributor
were not careful enough and submits a patch that adds a new executable
under 't/helper' but forgets to update '.gitignore' accordingly, our
CI builds would catch it in a timely manner [2].
Ignoring everything but sources has the drawback that building an
older version of Git (e.g. during bisecting) creates all those
executables, and after going back to e.g. current 'master' the usual
cleanup commands like 'make clean' or 'git clean -fd' don't remove
them (the former doesn't know about them, and the latter doesn't
remove ignored files).
So let's ignore only the executable files under 't/helper/, i.e.
'test-tool' and the three other remaining executables that could not
be integrated into 'test-tool' (no need to ignore object files, as
they are already ignored by our toplevel '.gitignore').
[1] The topic starting with efd71f8913 (t/helper: add an empty
test-tool program, 2018-03-24), and leading up to the merge commit
27f25845cf (Merge branch 'nd/combined-test-helper', 2018-04-11).
[2] b92cb86ea1 (travis-ci: check that all build artifacts are
.gitignore-d, 2017-12-31)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the macro COPY_ARRAY to copy array elements and MOVE_ARRAY to do the
same for moving them backwards in an array with potential overlap. The
result is shorter and safer, as it infers the element type automatically
and does a (very) basic type compatibility check for its first two
arguments.
These cases were missed by Coccinelle and contrib/coccinelle/array.cocci
because the type of the elements is "const char *", not "char *", and
the rules in the semantic patch cautiously insist on the sizeof operator
being used on exactly the same type to avoid generating transformations
that introduce subtle bugs into tricky code.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the working tree has:
- bar (directory)
- bar/file (file)
- foo (symlink to .)
(note that lstat() for "foo/bar" would tell us that it is a directory)
and the user merges a commit that deletes the foo symlink and instead
contains:
- bar (directory, as above)
- bar/file (file, as above)
- foo (directory)
- foo/bar (file)
the merge should happen without requiring user intervention. However,
this does not happen.
This is because dir_in_way(), when checking the working tree, thinks
that "foo/bar" is a directory. But a symlink should be treated much the
same as a file: since dir_in_way() is only checking to see if there is a
directory in the way, we don't want symlinks in leading paths to
sometimes cause dir_in_way() to return true.
Teach dir_in_way() to also check for symlinks in leading paths before
reporting whether a directory is in the way.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When converting stash into C, calls to 'git update-index --refresh'
were replaced with the 'refresh_cache()' function. That is fine as
long as the index is only needed in-core, and not re-read from disk.
However in many cases we do actually need the refreshed index to be
written to disk, for example 'merge_recursive_generic()' discards the
in-core index before re-reading it from disk, and in the case of 'apply
--quiet', the 'refresh_cache()' we currently have is pointless without
writing the index to disk.
Always write the index after refreshing it to ensure there are no
regressions in this compared to the scripted stash. In the future we
can consider avoiding the write where possible after making sure none
of the subsequent calls actually need the refreshed cache, and it is
not expected to be refreshed after stash exits or it is written
somewhere else already.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the 'refresh_and_write_cache()' convenience function introduced in
the last commit, instead of refreshing and writing the index manually
in merge.c
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Getting the lock for the index, refreshing it and then writing it is a
pattern that happens more than once throughout the codebase, and isn't
trivial to get right. Factor out the refresh_and_write_cache function
from builtin/am.c to read-cache.c, so it can be re-used in other
places in a subsequent commit.
Note that we return different error codes for failing to refresh the
cache, and failing to write the index. The current caller only cares
about failing to write the index. However for other callers we're
going to convert in subsequent patches we will need this distinction.
Helped-by: Martin Ågren <martin.agren@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add --[no-]progress to git commit-graph write and verify.
The progress feature was introduced in 7b0f229
("commit-graph write: add progress output", 2018-09-17) but
the ability to opt-out was overlooked.
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the reference to the GIT_TEST_DATE_NOW which is done in date.c.
We can't get rid of the "x" variable, since it serves as a generic
scratch variable for parsing later in the function.
Signed-off-by: Stephen P. Smith <ischis2@cox.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python 2 is EOL at the end of 2019, many distros and systems now
come with python 3 as their default version.
Rewrite features used in hg-to-git that are no longer supported in
Python 3, in such a way that an updated code can still be usable
with Python 2:
- print is not a statement; use print() function instead.
- dict.has_key(key) is no more; use "key in dict" instead.
Signed-off-by: Hervé Beraud <herveberaud.pro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The http transport lacked some optimization the native transports
learned to avoid unnecessary ref advertisement, which has been
corrected.
* jt/avoid-ls-refs-with-http:
transport: teach all vtables to allow fetch first
transport-helper: skip ls-refs if unnecessary
The list-objects-filter API (used to create a sparse/lazy clone)
learned to take a combined filter specification.
* md/list-objects-filter-combo:
list-objects-filter-options: make parser void
list-objects-filter-options: clean up use of ALLOC_GROW
list-objects-filter-options: allow mult. --filter
strbuf: give URL-encoding API a char predicate fn
list-objects-filter-options: make filter_spec a string_list
list-objects-filter-options: move error check up
list-objects-filter: implement composite filters
list-objects-filter-options: always supply *errbuf
list-objects-filter: put omits set in filter struct
list-objects-filter: encapsulate filter components
Teach the lazy clone machinery that there can be more than one
promisor remote and consult them in order when downloading missing
objects on demand.
* cc/multi-promisor:
Move core_partial_clone_filter_default to promisor-remote.c
Move repository_format_partial_clone to promisor-remote.c
Remove fetch-object.{c,h} in favor of promisor-remote.{c,h}
remote: add promisor and partial clone config to the doc
partial-clone: add multiple remotes in the doc
t0410: test fetching from many promisor remotes
builtin/fetch: remove unique promisor remote limitation
promisor-remote: parse remote.*.partialclonefilter
Use promisor_remote_get_direct() and has_promisor_remote()
promisor-remote: use repository_format_partial_clone
promisor-remote: add promisor_remote_reinit()
promisor-remote: implement promisor_remote_get_direct()
Add initial support for many promisor remotes
fetch-object: make functions return an error code
t0410: remove pipes after git commands
Optimize unnecessary full-tree diff away from "git log -L" machinery.
* sg/line-log-tree-diff-optim:
line-log: avoid unnecessary full tree diffs
line-log: extract pathspec parsing from line ranges into a helper function
Command line completion updates for "git -c var.name=val"
* sg/complete-configuration-variables:
completion: complete config variables and values for 'git clone --config='
completion: complete config variables names and values for 'git clone -c'
completion: complete values of configuration variables after 'git -c var='
completion: complete configuration sections and variable names for 'git -c'
completion: split _git_config()
completion: simplify inner 'case' pattern in __gitcomp()
completion: use 'sort -u' to deduplicate config variable names
completion: deduplicate configuration sections
completion: add tests for 'git config' completion
completion: complete more values of more 'color.*' configuration variables
completion: fix a typo in a comment
A new "pre-merge-commit" hook has been introduced.
* js/pre-merge-commit-hook:
merge: --no-verify to bypass pre-merge-commit hook
git-merge: honor pre-merge-commit hook
merge: do no-verify like commit
t7503: verify proper hook execution
Tell cURL library to use the same malloc() implementation, with the
xmalloc() wrapper, as the rest of the system, for consistency.
* cb/curl-use-xmalloc:
http: use xmalloc with cURL
xmalloc() used to have a mechanism to ditch memory and address
space resources as the last resort upon seeing an allocation
failure from the underlying malloc(), which made the code complex
and thread-unsafe with dubious benefit, as major memory resource
users already do limit their uses with various other mechanisms.
It has been simplified away.
* jk/drop-release-pack-memory:
packfile: drop release_pack_memory()
"git rebase --rebase-merges" learned to drive different merge
strategies and pass strategy specific options to them.
* js/rebase-r-strategy:
t3427: accelerate this test by using fast-export and fast-import
rebase -r: do not (re-)generate root commits with `--root` *and* `--onto`
t3418: test `rebase -r` with merge strategies
t/lib-rebase: prepare for testing `git rebase --rebase-merges`
rebase -r: support merge strategies other than `recursive`
t3427: fix another incorrect assumption
t3427: accommodate for the `rebase --merge` backend having been replaced
t3427: fix erroneous assumption
t3427: condense the unnecessarily repetitive test cases into three
t3427: move the `filter-branch` invocation into the `setup` case
t3427: simplify the `setup` test case significantly
t3427: add a clarifying comment
rebase: fold git-rebase--common into the -p backend
sequencer: the `am` and `rebase--interactive` scripts are gone
.gitignore: there is no longer a built-in `git-rebase--interactive`
t3400: stop referring to the scripted rebase
Drop unused git-rebase--am.sh
Pass the target strbuf to the callback function grab_nth_branch_switch()
by reference so that it can add the result string directly instead of
having it put the string into a temporary strbuf first. This gets rid
of an extra allocation and a string copy.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The shebang for a python script should be "/usr/bin/env python" and not
"/usr/bin/python". On some OSes like AIX, python default path is not under
"/usr/bin" ("/opt/freeware/bin" for AIX).
Note the main reason behind this change is that AIX rpm will add a
dependency on "/usr/bin/python" instead of "/usr/bin/env".
Signed-off-by: Clément Chigot <clement.chigot@atos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running make from a clean environment, all of the *.po files should
be converted into *.msg files. After that, when make is run without any
changes, make should not do anything.
After beffae768a (gitk: Add Chinese (zh_CN) translation, 2017-03-11),
zh_CN.po was introduced. When make was run, a zh_cn.msg file was
generated (notice the lowercase). However, since make is case-sensitive,
it expects zh_CN.po to generate a zh_CN.msg file so make will keep
reattempting to generate a zh_CN.msg so successive make invocations
result in
Generating catalog po/zh_cn.msg
msgfmt --statistics --tcl po/zh_cn.po -l zh_cn -d po/
317 translated messages.
happening continuously.
Rename zh_CN.po to zh_cn.po so that when make generates the zh_cn.msg
file, it will realize that it was successfully generated and only run
once.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, when running the "coccicheck" target, only the source files
which were being compiled would have been checked by Coccinelle.
However, just because we aren't compiling a source file doesn't mean we
have to exclude it from analysis. This will allow us to catch more
mistakes, in particular ones that affect Windows-only sources since
Coccinelle currently runs only on Linux.
Make the "coccicheck" target run on all C sources except for those that
are taken from some third-party source. We don't want to patch these
files since we want them to be as close to upstream as possible so that
it'll be easier to pull in upstream updates.
When running a build on Arch Linux with no additional flags provided,
after applying this patch, the following sources are now checked:
* block-sha1/sha1.c
* compat/access.c
* compat/basename.c
* compat/fileno.c
* compat/gmtime.c
* compat/hstrerror.c
* compat/memmem.c
* compat/mingw.c
* compat/mkdir.c
* compat/mkdtemp.c
* compat/mmap.c
* compat/msvc.c
* compat/pread.c
* compat/precompose_utf8.c
* compat/qsort.c
* compat/setenv.c
* compat/sha1-chunked.c
* compat/snprintf.c
* compat/stat.c
* compat/strcasestr.c
* compat/strdup.c
* compat/strtoimax.c
* compat/strtoumax.c
* compat/unsetenv.c
* compat/win32/dirent.c
* compat/win32/path-utils.c
* compat/win32/pthread.c
* compat/win32/syslog.c
* compat/win32/trace2_win32_process_info.c
* compat/win32mmap.c
* compat/winansi.c
* ppc/sha1.c
This also results in the following source now being excluded:
* compat/obstack.c
Instead of generating $(FOUND_C_SOURCES) from a
`$(shell $(FIND_SOURCE_FILES))` invocation, an alternative design was
considered which involved converting $(FIND_SOURCE_FILES) into
$(SOURCE_FILES) which would hold a list of filenames from the
$(FIND_SOURCE_FILES) invocation. We would simply filter `%.c` files into
$(ALL_C_SOURCES). $(SOURCE_FILES) would then be passed directly to the
etags, ctags and cscope commands. We can see from the following
invocation
$ git ls-files '*.[hcS]' '*.sh' ':!*[tp][0-9][0-9][0-9][0-9]*' ':!contrib' | wc -c
12779
that the number of characters in this list would pose a problem on
platforms with short command-line length limits (such as CMD which has a
max of 8191 characters). As a result, we don't perform this change.
However, we can see that the same issue may apply when running
Coccinelle since $(COCCI_SOURCES) is also a list of filenames:
if ! echo $(COCCI_SOURCES) | xargs $$limit \
$(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
>$@+ 2>$@.log; \
This is justified since platforms that support Coccinelle generally have
reasonably long command-line length limits and so we are safe for the
foreseeable future.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, $(FIND_SOURCE_FILES) has two modes: if `git ls-files` is
present, it will use that to enumerate the files in the repository; else
it will use `$(FIND) .` to enumerate the files in the directory.
There is a subtle difference between these two methods, however. With
ls-files, filenames don't have a leading `./` while with $(FIND), they
do. This does not currently pose a problem but in a future patch, we
will be using `filter-out` to process the list of files with the
assumption that there is no prefix.
Unify the two possible invocations in $(FIND_SOURCE_FILES) by using sed
to remove the `./` prefix in the $(FIND) case.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some files in our codebase are borrowed from other projects, and
minimally updated to suit our own needs. We'd sometimes need to tell
our own sources and these third-party sources apart for management
purposes (e.g. we may want to be less strict about coding style and
other issues on third-party files).
Define the $(MAKE) variable THIRD_PARTY_SOURCES that can be used to
match names of third-party sources.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_clean() had the following code structure:
struct strbuf abs_path = STRBUF_INIT;
for_each_string_list_item(item, &del_list) {
strbuf_addstr(&abs_path, prefix);
strbuf_addstr(&abs_path, item->string);
PROCESS(&abs_path);
strbuf_reset(&abs_path);
}
where I've elided a bunch of unnecessary details and PROCESS(&abs_path)
represents a big chunk of code rather than an actual function call. One
piece of PROCESS was:
if (lstat(abs_path.buf, &st))
continue;
which would cause the strbuf_reset() to be missed -- meaning that the
next path to be handled would have two paths concatenated. This path
used to use die_errno() instead of continue prior to commit 396049e5fb
("git-clean: refactor git-clean into two phases", 2013-06-25), but my
understanding of how correct_untracked_entries() works is that it will
prevent both dir/ and dir/file from being in the list to clean so this
should be dead code and the die_errno() should be safe. But I hesitate
to remove it since I am not certain.
However, we can fix both this bug and possible similar future bugs by
simply moving the strbuf_reset(&abs_path) to the beginning of the loop.
It'll result in N calls to strbuf_reset() instead of N-1, but that's a
small price to pay to avoid sneaky bugs like this.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users expect files in a nested git repository to be left alone unless
sufficiently forced (with two -f's). Unfortunately, in certain
circumstances, git would delete both tracked (and possibly dirty) files
and untracked files within a nested repository. To explain how this
happens, let's contrast a couple cases. First, take the following
example setup (which assumes we are already within a git repo):
git init nested
cd nested
>tracked
git add tracked
git commit -m init
>untracked
cd ..
In this setup, everything works as expected; running 'git clean -fd'
will result in fill_directory() returning the following paths:
nested/
nested/tracked
nested/untracked
and then correct_untracked_entries() would notice this can be compressed
to
nested/
and then since "nested/" is a directory, we would call
remove_dirs("nested/", ...), which would
check is_nonbare_repository_dir() and then decide to skip it.
However, if someone also creates an ignored file:
>nested/ignored
then running 'git clean -fd' would result in fill_directory() returning
the same paths:
nested/
nested/tracked
nested/untracked
but correct_untracked_entries() will notice that we had ignored entries
under nested/ and thus simplify this list to
nested/tracked
nested/untracked
Since these are not directories, we do not call remove_dirs() which was
the only place that had the is_nonbare_repository_dir() safety check --
resulting in us deleting both the untracked file and the tracked (and
possibly dirty) file.
One possible fix for this issue would be walking the parent directories
of each path and checking if they represent nonbare repositories, but
that would be wasteful. Even if we added caching of some sort, it's
still a waste because we should have been able to check that "nested/"
represented a nonbare repository before even descending into it in the
first place. Add a DIR_SKIP_NESTED_GIT flag to dir_struct.flags and use
it to prevent fill_directory() and friends from descending into nested
git repos.
With this change, we also modify two regression tests added in commit
91479b9c72 ("t7300: add tests to document behavior of clean and nested
git", 2015-06-15). That commit, nor its series, nor the six previous
iterations of that series on the mailing list discussed why those tests
coded the expectation they did. In fact, it appears their purpose was
simply to test _existing_ behavior to make sure that the performance
changes didn't change the behavior. However, these two tests directly
contradicted the manpage's claims that two -f's were required to delete
files/directories under a nested git repository. While one could argue
that the user gave an explicit path which matched files/directories that
were within a nested repository, there's a slippery slope that becomes
very difficult for users to understand once you go down that route (e.g.
what if they specified "git clean -f -d '*.c'"?) It would also be hard
to explain what the exact behavior was; avoid such problems by making it
really simple.
Also, clean up some grammar errors describing this functionality in the
git-clean manpage.
Finally, there are still a couple bugs with -ffd not cleaning out enough
(e.g. missing the nested .git) and with -ffdX possibly cleaning out the
wrong files (paying attention to outer .gitignore instead of inner).
This patch does not address these cases at all (and does not change the
behavior relative to those flags), it only fixes the handling when given
a single -f. See
https://public-inbox.org/git/20190905212043.GC32087@szeder.dev/ for more
discussion of the -ffd[X?] bugs.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The -d flag pre-dated git-clean's ability to have paths specified. As
such, the default for git-clean was to only remove untracked files in
the current directory, and -d existed to allow it to recurse into
subdirectories.
The interaction of paths and the -d option appears to not have been
carefully considered, as evidenced by numerous bugs and a dearth of
tests covering such pairings in the testsuite. The definition turns out
to be important, so let's look at some of the various ways one could
interpret the -d option:
A) Without -d, only look in subdirectories which contain tracked
files under them; with -d, also look in subdirectories which
are untracked for files to clean.
B) Without specified paths from the user for us to delete, we need to
have some kind of default, so...without -d, only look in
subdirectories which contain tracked files under them; with -d,
also look in subdirectories which are untracked for files to clean.
The important distinction here is that choice B says that the presence
or absence of '-d' is irrelevant if paths are specified. The logic
behind option B is that if a user explicitly asked us to clean a
specified pathspec, then we should clean anything that matches that
pathspec. Some examples may clarify. Should
git clean -f untracked_dir/file
remove untracked_dir/file or not? It seems crazy not to, but a strict
reading of option A says it shouldn't be removed. How about
git clean -f untracked_dir/file1 tracked_dir/file2
or
git clean -f untracked_dir_1/file1 untracked_dir_2/file2
? Should it remove either or both of these files? Should it require
multiple runs to remove both the files listed? (If this sounds like a
crazy question to even ask, see the commit message of "t7300: Add some
testcases showing failure to clean specified pathspecs" added earlier in
this patch series.) What if -ffd were used instead of -f -- should that
allow these to be removed? Should it take multiple invocations with
-ffd? What if a glob (such as '*tracked*') were used instead of
spelling out the directory names? What if the filenames involved globs,
such as
git clean -f '*.o'
or
git clean -f '*/*.o'
?
The current documentation actually suggests a definition that is
slightly different than choice A, and the implementation prior to this
series provided something radically different than either choices A or
B. (The implementation, though, was clearly just buggy). There may be
other choices as well. However, for almost any given choice of
definition for -d that I can think of, some of the examples above will
appear buggy to the user. The only case that doesn't have negative
surprises is choice B: treat a user-specified path as a request to clean
all untracked files which match that path specification, including
recursing into any untracked directories.
Change the documentation and basic implementation to use this
definition.
There were two regression tests that indirectly depended on the current
implementation, but neither was about subdirectory handling. These two
tests were introduced in commit 5b7570cfb4 ("git-clean: add tests for
relative path", 2008-03-07) which was solely created to add coverage for
the changes in commit fb328947c8e ("git-clean: correct printing relative
path", 2008-03-07). Both tests specified a directory that happened to
have an untracked subdirectory, but both were only checking that the
resulting printout of a file that was removed was shown with a relative
path. Update these tests appropriately.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It appears that the wrong option got included in the list of what will
cause git-clean to actually take action. Correct the list.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The way match_pathspec_item() handles names and pathspecs with trailing
slash characters, in conjunction with special options like
DO_MATCH_DIRECTORY and DO_MATCH_LEADING_PATHSPEC were non-obvious, and
broken until this patch series. Add a table in a comment explaining the
intent of how these work.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For git clean, if a directory is entirely untracked and the user did not
specify -d (corresponding to DIR_SHOW_IGNORED_TOO), then we usually do
not want to remove that directory and thus do not recurse into it.
However, if the user manually specified specific (or even globbed) paths
somewhere under that directory to remove, then we need to recurse into
the directory to make sure we remove the relevant paths under that
directory as the user requested.
Note that this does not mean that the recursed-into directory will be
added to dir->entries for later removal; as of a few commits earlier in
this series, there is another more strict match check that is run after
returning from a recursed-into directory before deciding to add it to the
list of entries. Therefore, this will only result in files underneath
the given directory which match one of the pathspecs being added to the
entries list.
Two notes of potential interest to future readers:
* If we wanted to only recurse into a directory when it is specifically
matched rather than matched-via-glob (e.g. '*.c'), then we could do
so via making the final non-zero return in match_pathspec_item be
MATCHED_RECURSIVELY instead of MATCHED_RECURSIVELY_LEADING_PATHSPEC.
(Note that the relative order of MATCHED_RECURSIVELY_LEADING_PATHSPEC
and MATCHED_RECURSIVELY are important for such a change.) I was
leaving open that possibility while writing an RFC asking for the
behavior we want, but even though we don't want it, that knowledge
might help you understand the code flow better.
* There is a growing amount of logic in read_directory_recursive() for
deciding whether to recurse into a subdirectory. However, there is a
comment immediately preceding this logic that says to recurse if
instructed by treat_path(). It may be better for the logic in
read_directory_recursive() to ultimately be moved to treat_path() (or
another function it calls, such as treat_directory()), but I have
left that for someone else to tackle in the future.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The specific checks done in match_pathspec_item for the DO_MATCH_SUBMODULE
case are useful for other cases which have nothing to do with submodules.
Rename this constant; a subsequent commit will make use of this change.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even if a directory doesn't match a pathspec, it is possible, depending
on the precise pathspecs, that some file underneath it might. So we
special case and recurse into the directory for such situations. However,
we previously always added any untracked directory that we recursed into
to the list of untracked paths, regardless of whether the directory
itself matched the pathspec.
For the case of git-clean and a set of pathspecs of "dir/file" and "more",
this caused a problem because we'd end up with dir entries for both of
"dir"
"dir/file"
Then correct_untracked_entries() would try to helpfully prune duplicates
for us by removing "dir/file" since it's under "dir", leaving us with
"dir"
Since the original pathspec only had "dir/file", the only entry left
doesn't match and leaves nothing to be removed. (Note that if only one
pathspec was specified, e.g. only "dir/file", then the common_prefix_len
optimizations in fill_directory would cause us to bypass this problem,
making it appear in simple tests that we could correctly remove manually
specified pathspecs.)
Fix this by actually checking whether the directory we are about to add
to the list of dir entries actually matches the pathspec; only do this
matching check after we have already returned from recursing into the
directory.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For a pathspec like 'foo/bar' comparing against a path named "foo/",
namelen will be 4, and match[namelen] will be 'b'. The correct location
of the directory separator is namelen-1.
However, other callers of match_pathspec_item() such as builtin/grep.c's
submodule_path_match() will compare against a path named "foo" instead of
"foo/". It might be better to change all the callers to be consistent,
as discussed at
https://public-inbox.org/git/xmqq7e6cdnkr.fsf@gitster-ct.c.googlers.com/
and
https://public-inbox.org/git/CABPp-BERWUPCPq-9fVW1LNocqkrfsoF4BPj3gJd9+En43vEkTQ@mail.gmail.com/
but there are many cases to audit, so for now just make sure we handle
both cases with and without a trailing slash.
The reason the code worked despite this sometimes-off-by-one error was
that the subsequent code immediately checked whether the first matchlen
characters matched (which they do) and then bailed and return
MATCHED_RECURSIVELY anyway since wildmatch doesn't have the ability to
check if "name" can be matched as a directory (or prefix) against the
pathspec.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Someone brought me a testcase where multiple git-clean invocations were
required to clean out unwanted files:
mkdir d{1,2}
touch d{1,2}/ut
touch d1/t && git add d1/t
With this setup, the user would need to run
git clean -ffd */ut
twice to delete both ut files.
A little testing showed some interesting variants:
* If only one of those two ut files existed (either one), then only one
clean command would be necessary.
* If both directories had tracked files, then only one git clean would
be necessary to clean both files.
* If both directories had no tracked files then the clean command above
would never clean either of the untracked files despite the pathspec
explicitly calling both of them out.
A bisect showed that the failure to clean out the files started with
commit cf424f5fd8 ("clean: respect pathspecs with "-d", 2014-03-10).
However, that pointed to a separate issue: while the "-d" flag was used
by the original user who showed me this problem, that flag should have
been irrelevant to this problem. Testing again without the "-d" flag
showed that the same buggy behavior exists without using that flag, and
has in fact existed since before cf424f5fd8.
Although these problems at first are perceived to be different (e.g.
never clearing out the requested files vs. taking multiple invocations
to get everything cleared out), they are actually just different
manifestations of the same problem. The case with multiple directories
that have no tracked files is the more general case; solving it will
solve all the others. So, I concentrate on it. Add testcases showing
that multiple untracked files within entirely untracked directories
cannot be cleaned when specifying these files to git clean via
pathspecs.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Replace "SHA-1" by "object name", the modern name for hashes.
- Correct a few grammar weaknesses.
- Do not accidentally format a phrase in teletype font where quotes are
intended.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff, log doc: say "patch text" instead of "patches"
A poster on Stackoverflow was confused that the documentation of git-log
promised to generate "patches" or "patch files" with -p, but there were
none to be found. Rewrite the corresponding paragraph to talk about
"patch text" to avoid the confusion.
Shorten the language to say "X does Y" in place of "X does not Z, but Y".
Cross-reference the referred-to commands like the rest of the file does.
Enumerate git-show because it includes the description as well.
Mention porcelain commands before plumbing commands because I guess that
the paragraph is read more frequently in their context.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Acked-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'progress.c' has seen a few fixes recently [1], and, unfortunately,
some of those fixes required further fixes [2]. It seems it's time to
have a few tests focusing on the subtleties of the progress display.
Add the 'test-tool progress' subcommand to help testing the progress
display, reading instructions from standard input and turning them
into calls to the display_progress() and display_throughput()
functions with the given parameters.
The progress display is, however, critically dependent on timing,
because it's only updated once every second or, if the toal is known
in advance, every 1%, and there is the throughput rate as well. These
make the progress display far too undeterministic for testing as-is.
To address this, add a few testing-specific variables and functions to
'progress.c', allowing the the new test helper to:
- Disable the triggered-every-second SIGALRM and set the
'progress_update' flag explicitly based in the input instructions.
This way the progress line will be updated deterministically when
the test wants it to be updated.
- Specify the time elapsed since start_progress() to make the
throughput rate calculations deterministic.
Add the new test script 't0500-progress-display.sh' to check a few
simple cases with and without throughput, and that a shorter progress
line properly covers up the previously displayed line in different
situations.
[1] See commits 545dc345eb (progress: break too long progress bar
lines, 2019-04-12) and 9f1fd84e15 (progress: clear previous
progress update dynamically, 2019-04-12).
[2] 1aed1a5f25 (progress: avoid empty line when breaking the progress
line, 2019-05-19)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 5b12e3123b (progress: use term_clear_line(),
2019-06-24), because covering up the entire last line while refreshing
the progress line caused unexpected problems during 'git
clone/fetch/push':
$ git clone ssh://localhost/home/szeder/src/tmp/linux.git/
Cloning into 'linux'...
remote:
remote:
remote:
remote: Enumerating objects: 999295
The length of the progress bar line can shorten when it includes
throughput and the unit changes, or when its length exceeds the width
of the terminal and is broken into two lines. In these cases the
previously displayed longer progress line should be covered up,
because otherwise the leftover characters from the previous progress
line make the output look weird [1]. term_clear_line() makes this
quite simple, as it covers up the entire last line either by using an
ANSI control sequence or by printing a terminal width worth of space
characters, depending on whether the terminal is smart or dumb.
Unfortunately, when accessing a remote repository via any non-local
protocol the remote 'git receive-pack/upload-pack' processes can't
possibly have any idea about the local terminal (smart of dumb? how
wide?) their progress will end up on. Consequently, they assume the
worst, i.e. standard-width dumb terminal, and print 80 spaces to cover
up the previously displayed progress line. The local 'git
clone/fetch/push' processes then display the remote's progress,
including these coverup spaces, with the 'remote: ' prefix, resulting
in a total line length of 88 characters. If the local terminal is
narrower than that, then the coverup gets line-wrapped, and after that
the CR at the end doesn't return to the beginning of the progress
line, but to the first column of its last line, resulting in those
repeated 'remote: <many-spaces>' lines.
By reverting 5b12e3123b (progress: use term_clear_line(),
2019-06-24) we won't cover up the entire last line, but go back to
comparing the length of the current progress bar line with the
previous one, and cover up as many characters as needed.
[1] See commits 545dc345eb (progress: break too long progress bar
lines, 2019-04-12) and 9f1fd84e15 (progress: clear previous
progress update dynamically, 2019-04-12).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, $(LIB_H) is generated from two modes: if `git ls-files` is
present, it will use that to enumerate the files in the repository; else
it will use `$(FIND) .` to enumerate the files in the directory.
There is a subtle difference between these two methods, however. With
ls-files, filenames don't have a leading `./` while with $(FIND), they
do. This results in $(CHK_HDRS) having to substitute out the leading
`./` before it uses $(LIB_H).
Unify the two possible values in $(LIB_H) by using patsubst to remove the
`./` prefix at its definition.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During git-fetch, the client checks if the advertised tags' OIDs are
already in the fetch request's want OID set. This check is done in a
linear scan. For a repository that has a lot of refs, repeating this
scan takes 15+ minutes. In order to speed this up, create a oid_set for
other refs' OIDs.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let commit_list_count() count the number of parents instead of
duplicating it. Also store the result in an unsigned int, as that's
what the function returns, and the count is never negative.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reject values that don't fit into an int, as get_parent() and
get_nth_ancestor() cannot handle them. That's better than potentially
returning a random object.
If this restriction turns out to be too tight then we can switch to a
wider data type, but we'd still have to check for overflow.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the number gets too high for an int, weird things may happen, as
signed overflows are undefined. Add a test to show this; rev-parse
"sucessfully" interprets 100000000000000000000000000000000 to be the
same as 0, at least on x64 with GCC 9.2.1 and Clang 8.0.1, which is
obviously bogus.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use add_excludes_from_blob_to_list() to parse a sparse blob. Since
we don't have a base path, we pass NULL and 0 for the base and baselen,
respectively. But the rest of the exclude code passes a literal empty
string instead of NULL for this case. And indeed, we eventually end up
with match_pathname() calling fspathncmp(), which then calls the system
strncmp(path, base, baselen).
This works on many platforms, which notice that baselen is 0 and do not
look at the bytes of "base" at all. But it does violate the C standard,
and building with SANITIZE=undefined will complain. You can also see it
by instrumenting fspathncmp like this:
diff --git a/dir.c b/dir.c
index d021c908e5..4bb3d3ec96 100644
--- a/dir.c
+++ b/dir.c
@@ -71,6 +71,8 @@ int fspathcmp(const char *a, const char *b)
int fspathncmp(const char *a, const char *b, size_t count)
{
+ if (!a || !b)
+ BUG("null fspathncmp arguments");
return ignore_case ? strncasecmp(a, b, count) : strncmp(a, b, count);
}
We could perhaps be more defensive in match_pathname(), but even if we
did so, it makes sense for this code to match the rest of the exclude
callers.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse:oid filter has two error modes: we might fail to resolve the
name to an OID, or we might fail to parse the contents of that OID. In
the latter case, let's give a less generic error message, and mention
the OID we did find.
While we're here, let's also mark both messages as translatable.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The list-objects-filter code has two steps to its initialization:
1. parse_list_objects_filter() makes sure the spec is a filter we know
about and is syntactically correct. This step is done by "rev-list"
or "upload-pack" that is going to apply a filter, but also by "git
clone" or "git fetch" before they send the spec across the wire.
2. list_objects_filter__init() runs the type-specific initialization
(using function pointers established in step 1). This happens at
the start of traverse_commit_list_filtered(), when we're about to
actually use the filter.
It's a good idea to parse as much as we can in step 1, in order to catch
problems early (e.g., a blob size limit that isn't a number). But one
thing we _shouldn't_ do is resolve any oids at that step (e.g., for
sparse-file contents specified by oid). In the case of a fetch, the oid
has to be resolved on the remote side.
The current code does resolve the oid during the parse phase, but
ignores any error (which we must do, because we might just be sending
the spec across the wire). This leads to two bugs:
- if we're not in a repository (e.g., because it's git-clone parsing
the spec), then we trigger a BUG() trying to resolve the name
- if we did hit the error case, we still have to notice that later and
bail. The code path in rev-list handles this, but the one in
upload-pack does not, leading to a segfault.
We can fix both by moving the oid resolution into the sparse-oid init
function. At that point we know we have a repository (because we're
about to traverse), and handling the error there fixes the segfault.
As a bonus, we can drop the NULL sparse_oid_value check in rev-list,
since this is now handled in the sparse-oid-filter init function.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We test in t5317 that "sparse:oid" filters work with rev-list, but
there's no coverage at all confirming that they work with a fetch or
clone (and in fact, there are several bugs). Let's do a basic test that
a clone fetches the correct objects.
[jk: extracted from Jon's earlier fix patches. I also simplified the
setup down to a single sparse file, and I added checks that we got the
right blobs]
Signed-off-by: Jon Simons <jon@jonsimons.org>
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After the previous commit, AsciiDoc and Asciidoctor render the manpage
headers identically, so we no longer need the "cut the header" part of
our `--cut-header-footer` option. We do still need the "cut the footer"
part, though. The previous commits improved the rendering of the footer
in Asciidoctor by quite a bit, but the two programs still disagree on
how to format the date in the footer: 01/01/1970 vs 1970-01-01.
We could keep using `--cut-header-footer`, but it would be nice if we
had a slightly smaller hammer `--cut-footer` that would be less likely
to hide regressions. Rather than simply adding such an option, let's
also drop `--cut-header-footer`, i.e., rework it to lose the "header"
part of its name and functionality.
`--cut-header-footer` is just a developer tool and it probably has no
more than a handful of users, so we can afford to be aggressive.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As can be seen from the previous commit, there are three attributes that
we provide to AsciiDoc through asciidoc.conf. Asciidoctor ignores that
file. After that patch, newer versions of Asciidoctor pick up the
`manmanual` and `mansource` attributes as we invoke `asciidoctor`, but
they don't pick up `manversion`. ([1] says: "Not used by Asciidoctor.")
Older versions (<1.5.7) don't handle these attributes at all. As a
result, we are missing one or three `<refmiscinfo/>` tags in each
xml-file produced when we build with Asciidoctor.
Because of this, xmlto doesn't include the Git version number in the
rendered manpages. And in particular, with versions <1.5.7, the manpage
footers instead contain the fairly ugly "[FIXME: source]".
That Asciidoctor ignores asciidoc.conf is nothing new. This is why we
implement the `linkgit:` macro in asciidoc.conf *and* in
asciidoctor-extensions.rb. Follow suit and provide these tags in
asciidoctor-extensions.rb, using a "postprocessor" extension where we
just search and replace in the XML, treated as text.
We may consider a few alternatives:
* Inject these lines into the xml-files from the *Makefile*, e.g.,
using `sed`. That would reduce repetition, but it feels wrong to
impose another step and another risk on the AsciiDoc-processing only
to benefit the Asciidoctor-one.
* I tried providing a "docinfo processor" to inject these tags, but
could not figure out how to "merge" the two <refmeta/> sections that
resulted. To avoid xmlto barfing on the result, I needed to use
`xmlto --skip-validation ...`, which seems unfortunate.
Let's instead inject the missing tags using a postprocessor. We'll make
it fairly obvious that we aim to inject the exact same three lines of
`<refmiscinfo/>` that asciidoc.conf provides. We inject them in
*post*-processing so we need to do the variable expansion ourselves. We
do introduce the bug that asciidoc.conf already has in that we won't do
any escaping, e.g., of funky versions like "some v <2.25, >2.20".
The postprocessor we add here works on the XML as raw text and doesn't
really use the full potential of XML to do a more structured injection.
This is actually precisely what the Asciidoctor User Manual does in its
postprocessor example [2]. I looked into two other approaches:
1. The nokogiri library is apparently the "modern" way of doing XML
in ruby. I got it working fairly easily:
require 'nokogiri'
doc = Nokogiri::XML(output)
doc.search("refmeta").each { |n| n.add_child(new_tags) }
output = doc.to_xml
However, this adds another dependency (e.g., the "ruby-nokogiri"
package on Ubuntu). Using Asciidoctor is not our default, but it
will probably need to become so soon. Let's avoid adding a
dependency just so that we can say "search...add_child" rather than
"sub(regex...)".
2. The older REXML is apparently always(?) bundled with ruby, but I
couldn't even parse the original document:
require 'rexml/document'
doc = REXML::Document.new(output)
...
The error was "no implicit conversion of nil into String" and I
stopped there.
I don't think it's unlikely that doing a plain old search-and-replace
will work just as fine or better compared to parsing XML and worrying
about libraries and library versions.
[1] https://asciidoctor.org/docs/user-manual/#builtin-attributes
[2] https://asciidoctor.org/docs/user-manual/#postprocessor-example
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than hardcoding "Git Manual" and "Git" as the manual and source
in asciidoc.conf, provide them as attributes `manmanual` and
`mansource`. Rename the `git_version` attribute to `manversion`.
These new attribute names are not arbitrary, see, e.g., [1].
For AsciiDoc (8.6.10) and Asciidoctor <1.5.7, this is a no-op. Starting
with Asciidoctor 1.5.7, `manmanual` and `mansource` actually end up in
the xml-files and eventually in the rendered manpages. In particular,
the manpage headers now render just as with AsciiDoc.
No versions of Asciidoctor pick up the `manversion` [2], and older
versions don't pick up any of these attributes. -- We'll fix that with a
bit of a hack in the next commit.
[1] https://asciidoctor.org/docs/user-manual/#man-pages
[2] Note how [1] says "Not used by Asciidoctor".
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In b57e8119e6 (submodule: teach set-branch subcommand, 2019-02-08), the
`set-branch` subcommand was added for submodules. When the documentation
was written, the syntax for a "index term" in AsciiDoc was
accidentally used. This caused the documentation to be rendered as
set-branch -d|--default)|(-b|--branch <branch> [--] <path>
instead of
set-branch ((-d|--default)|(-b|--branch <branch>)) [--] <path>
In addition to this, the original documentation was possibly confusing
as it made it seem as if the `-b` option didn't accept a `<branch>`
argument.
Break `--default` and `--branch` into their own separate invocations to
make it obvious that these options are mutually exclusive. Also, this
removes the surrounding parentheses so that the "index term" syntax is
not triggered.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our documentation toolchain has traditionally been built around DocBook
4.5. This version of DocBook is the last DTD-based version of DocBook.
In 2009, DocBook 5 was introduced using namespaces and its syntax is
expressed in RELAX NG, which is more expressive and allows a wider
variety of syntax forms.
Asciidoctor, one of the alternatives for building our documentation,
moved support for DocBook 4.5 out of core in its recent 2.0 release and
now only supports DocBook 5 in the main release. The DocBoook 4.5
converter is still available as a separate component, but this is not
available in most distro packages. This would not be a problem but for
the fact that we use xmlto, which is still stuck in the DocBook 4.5 era.
xmlto performs DTD validation as part of the build process. This is not
problematic for DocBook 4.5, which has a valid DTD, but it clearly
cannot work for DocBook 5, since no DTD can adequately express its full
syntax. In addition, even if xmlto did support RELAX NG validation,
that wouldn't be sufficient because it uses the libxml2-based xmllint to
do so, which has known problems with validating interleaves in RELAX NG.
Fortunately, there's an easy way forward: ask Asciidoctor to use its
DocBook 5 backend and tell xmlto to skip validation. Asciidoctor has
supported DocBook 5 since v0.1.4 in 2013 and xmlto has supported
skipping validation for probably longer than that.
We also need to teach xmlto how to use the namespaced DocBook XSLT
stylesheets instead of the non-namespaced ones it usually uses.
Normally these stylesheets are interchangeable, but the non-namespaced
ones have a bug that causes them not to strip whitespace automatically
from certain elements when namespaces are in use. This results in
additional whitespace at the beginning of list elements, which is
jarring and unsightly.
We can do this by passing a custom stylesheet with the -x option that
simply imports the namespaced stylesheets via a URL. Any system with
support for XML catalogs will automatically look this URL up and
reference a local copy instead without us having to know where this
local copy is located. We know that anyone using xmlto will already
have catalogs set up properly since the DocBook 4.5 DTD used during
validation is also looked up via catalogs. All major Linux
distributions distribute the necessary stylesheets and have built-in
catalog support, and Homebrew does as well, albeit with a requirement to
set an environment variable to enable catalog support.
On the off chance that someone lacks support for catalogs, it is
possible for xmlto (via xmllint) to download the stylesheets from the
URLs in question, although this will likely perform poorly enough to
attract attention. People still have the option of using the prebuilt
documentation that we ship, so happily this should not be an impediment.
Finally, we need to filter out some messages from other stylesheets that
occur when invoking dblatex in the CI job. This tool strips namespaces
much like the unnamespaced DocBook stylesheets and prints similar
messages. If we permit these messages to be printed to standard error,
our documentation CI job will fail because we check standard error for
unexpected output. Due to dblatex's reliance on Python 2, we may need
to revisit its use in the future, in which case this problem may go
away, but this can be delayed until a future patch.
The final message we filter is due to libxslt on modern Debian and
Ubuntu. The patch which they use to implement reproducible ID
generation also prints messages about the ID generation. While this
doesn't affect our current CI images since they use Ubuntu 16.04 which
lacks this patch, if we upgrade to Ubuntu 18.04 or a modern Debian,
these messages will appear and, like the above messages, cause a CI
failure.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'master' of git://ozlabs.org/~paulus/gitk:
gitk: Do not mistake unchanged lines for submodule changes
gitk: Use right colour for remote refs in the "Tags and heads" dialog
gitk: Add Chinese (zh_CN) translation
gitk: Make web links clickable
Selecting whether to "Amend Last Commit" or not does not have a hotkey.
With this patch, the user may toggle between the two options with
CTRL/CMD+e.
Signed-off-by: Birger Skogeng Pedersen <birger.sp@gmail.com>
Rebased-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Unchanged lines are prefixed with a white-space, thus unchanged lines
starting with either " <" or " >" are mistaken for submodule changes.
Check if a line starts with either " <" or " >" only if we are listing
the changes of a submodule.
Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Makes it easier to see which refs are local and which refs are remote.
Adds consistency with the remote background colour in the graph display.
Signed-off-by: Paul Wise <pabs3@bonedaddy.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
In a following commit get_delta_base() will be used outside
packfile.c, so let's make it non static and declare it in
packfile.h.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To see when packfile reuse kicks in or not, it is useful to
show reused packfile objects statistics in the output of
upload-pack.
Helped-by: James Ramsay <james@jramsay.com.au>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the amend setting from two radio buttons ("New commit" and "Amend
commit") to a single checkbutton. The two radio buttons can never be
selected together because they are exactly the opposite of each other,
so it makes sense to change it to a single checkbutton.
* bw/amend-checkbutton:
git-gui: convert new/amend commit radiobutton to checkbutton
While the commit message widget has a configurable fixed width, it
nevertheless allowed to write commit messages which exceeded this limit.
Though there is no visual clue, that there is scrolling going on. Now
there is a horizontal scrollbar.
There seems to be a bug in at least Tcl/Tk up to version 8.6.8, which
does not update the horizontal scrollbar if one removes the whole
content at once.
Suggested-by: Birger Skogeng Pedersen <birger.sp@gmail.com>
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Currently, _git_archive() uses a hardcoded list of options for its
completion. However, we can use __gitcomp_builtin() to get a dynamically
generated list of completions instead.
Teach _git_archive() to use __gitcomp_builtin() so that newly
implemented options in archive will be automatically completed without
any mucking around in git-completion.bash. While we're at it, teach it
to complete the missing `--worktree-attributes` option as well.
Unfortunately, since some args are passed through from cmd_archive() to
write_archive() (which calls parse_archive_args()), there's no way that a
`--git-completion-helper` arg can end up reaching parse_archive_args()
since the first call to parse_options() will end up calling exit(0). As
a result, we have to carry the options supported by write_archive() in
the hardcoded string.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, _git_rebase() uses a hardcoded list of options for its
completion. However, we can use __gitcomp_builtin() to get a dynamically
generated list of completions instead.
Teach _git_rebase() to use __gitcomp_builtin() so that newly implemented
options in rebase will be automatically completed without any mucking
around in git-completion.bash.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bash completion script knows some options to "git log" and
"git show" only in the positive form, (e.g. "--abbrev-commit"), but not
in their negative form (e.g. "--no-abbrev-commit"). Add them.
Also, the bash completion script is missing some other options to
"git diff", and "git show" (and thus, all other commands that take
"git diff"'s options). Add them. Of note, since "--indent-heuristic" is
no longer experimental, add that too.
Signed-off-by: Max Rothman <max.r.rothman@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the client has asked for certain shallow options like
"deepen-since", we do a custom rev-list walk that pretends to be
shallow. Before doing so, we have to disable the commit-graph, since it
is not compatible with the shallow view of the repository. That's
handled by 829a321569 (commit-graph: close_commit_graph before shallow
walk, 2018-08-20). That commit literally closes and frees our
repo->objects->commit_graph struct.
That creates an interesting problem for commits that have _already_ been
parsed using the commit graph. Their commit->object.parsed flag is set,
their commit->graph_pos is set, but their commit->maybe_tree may still
be NULL. When somebody later calls repo_get_commit_tree(), we see that
we haven't loaded the tree oid yet and try to get it from the commit
graph. But since it has been freed, we segfault!
So the root of the issue is a data dependency between the commit's
lazy-load of the tree oid and the fact that the commit graph can go
away mid-process. How can we resolve it?
There are a couple of general approaches:
1. The obvious answer is to avoid loading the tree from the graph when
we see that it's NULL. But then what do we return for the tree oid?
If we return NULL, our caller in do_traverse() will rightly
complain that we have no tree. We'd have to fallback to loading the
actual commit object and re-parsing it. That requires teaching
parse_commit_buffer() to understand re-parsing (i.e., not starting
from a clean slate and not leaking any allocated bits like parent
list pointers).
2. When we close the commit graph, walk through the set of in-memory
objects and clear any graph_pos pointers. But this means we also
have to "unparse" any such commits so that we know they still need
to open the commit object to fill in their trees. So it's no less
complicated than (1), and is more expensive (since we clear objects
we might not later need).
3. Stop freeing the commit-graph struct. Continue to let it be used
for lazy-loads of tree oids, but let upload-pack specify that it
shouldn't be used for further commit parsing.
4. Push the whole shallow rev-list out to its own sub-process, with
the commit-graph disabled from the start, giving it a clean memory
space to work from.
I've chosen (3) here. Options (1) and (2) would work, but are
non-trivial to implement. Option (4) is more expensive, and I'm not sure
how complicated it is (shelling out for the actual rev-list part is
easy, but we do then parse the resulting commits internally, and I'm not
clear which parts need to be handling shallow-ness).
The new test in t5500 triggers this segfault, but see the comments there
for how horribly intimate it has to be with how both upload-pack and
commit graphs work.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 43d3561805 (commit-graph write: don't die if the existing graph
is corrupt, 2019-03-25) added an environment variable we use only in the
test suite, $GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD. But it put the check for
this variable at the very top of prepare_commit_graph(), which is called
every time we want to use the commit graph. Most importantly, it comes
_before_ we check the fast-path "did we already try to load?", meaning
we end up calling getenv() for every single use of the commit graph,
rather than just when we load.
getenv() is allowed to have unexpected side effects, but that shouldn't
be a problem here; we're lazy-loading the graph so it's clear that at
least _one_ invocation of this function is going to call it.
But it is inefficient. getenv() typically has to do a linear search
through the environment space.
We could memoize the call, but it's simpler still to just bump the check
down to the actual loading step. That's fine for our sole user in t5318,
and produces this minor real-world speedup:
[before]
Benchmark #1: git -C linux rev-list HEAD >/dev/null
Time (mean ± σ): 1.460 s ± 0.017 s [User: 1.174 s, System: 0.285 s]
Range (min … max): 1.440 s … 1.491 s 10 runs
[after]
Benchmark #1: git -C linux rev-list HEAD >/dev/null
Time (mean ± σ): 1.391 s ± 0.005 s [User: 1.118 s, System: 0.273 s]
Range (min … max): 1.385 s … 1.399 s 10 runs
Of course that actual speedup depends on how big your environment is. We
can game it like this:
for i in $(seq 10000); do
export dummy$i=$i
done
in which case I get:
[before]
Benchmark #1: git -C linux rev-list HEAD >/dev/null
Time (mean ± σ): 6.257 s ± 0.061 s [User: 6.005 s, System: 0.250 s]
Range (min … max): 6.174 s … 6.337 s 10 runs
[after]
Benchmark #1: git -C linux rev-list HEAD >/dev/null
Time (mean ± σ): 1.403 s ± 0.005 s [User: 1.146 s, System: 0.256 s]
Range (min … max): 1.396 s … 1.412 s 10 runs
So this is really more about avoiding the pathological case than
providing a big real-world speedup.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When traverse_commit_list() processes each commit, it queues the
commit's root tree in the pending array. Then, after all commits are
processed, it calls traverse_trees_and_blobs() to walk over the pending
list, calling process_tree() on each. But if revs->tree_objects is not
set, process_tree() just exists immediately!
We can save ourselves some work by not even bothering to queue these
trees in the first place. There are a few subtle points to make:
- we also detect commits with a NULL tree pointer here. But this isn't
an interesting check for broken commits, since the lookup_tree()
we'd have done during commit parsing doesn't actually check that we
have the tree on disk. So we're not losing any robustness.
- besides queueing, we also set the NOT_USER_GIVEN flag on the tree
object. This is used by the traverse_commit_list_filtered() variant.
But if we're not exploring trees, then we won't actually care about
this flag, which is used only inside process_tree() code-paths.
- queueing trees eventually leads to us queueing blobs, too. But we
don't need to check revs->blob_objects here. Even in the current
code, we still wouldn't find those blobs, because we'd never open up
the tree objects to list their contents.
- the user-visible impact to the caller is minimal. The pending trees
are all cleared by the time the function returns anyway, by
traverse_trees_and_blobs(). We do call a show_commit() callback,
which technically could be looking at revs->pending during the
callback. But it seems like a rather unlikely thing to do (if you
want the tree of the current commit, then accessing the tree struct
member is a lot simpler).
So this should be safe to do. Let's look at the benefits:
[before]
Benchmark #1: git -C linux rev-list HEAD >/dev/null
Time (mean ± σ): 7.651 s ± 0.021 s [User: 7.399 s, System: 0.252 s]
Range (min … max): 7.607 s … 7.683 s 10 runs
[after]
Benchmark #1: git -C linux rev-list HEAD >/dev/null
Time (mean ± σ): 7.593 s ± 0.023 s [User: 7.329 s, System: 0.264 s]
Range (min … max): 7.565 s … 7.634 s 10 runs
Not too impressive, but then we're really just avoiding sticking a
pointer into a growable array. But still, I'll take a free 0.75%
speedup.
Let's try it after running "git commit-graph write":
[before]
Benchmark #1: git -C linux rev-list HEAD >/dev/null
Time (mean ± σ): 1.458 s ± 0.011 s [User: 1.199 s, System: 0.259 s]
Range (min … max): 1.447 s … 1.481 s 10 runs
[after]
Benchmark #1: git -C linux rev-list HEAD >/dev/null
Time (mean ± σ): 1.126 s ± 0.023 s [User: 896.5 ms, System: 229.0 ms]
Range (min … max): 1.106 s … 1.181 s 10 runs
Now that's more like it. We saved over 22% of the total time. Part of
that is because the runtime is shorter overall, but the absolute
improvement is also much larger. What's going on?
When we fill in a commit struct using the commit graph, we don't bother
to set the tree pointer, and instead lazy-load it when somebody calls
get_commit_tree(). So we're not only skipping the pointer write to the
pending queue, but we're skipping the lazy-load of the tree entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move a closing backtick that was placed one character too soon.
Signed-off-by: Cameron Steffen <cam.steffen94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b841d4ff43 (Add `human` format to test-tool, 2019-01-28) added
a get_time() function which allows $GIT_TEST_DATE_NOW in the
environment to override the current time. So we no longer need to
interpret that variable in cmd__date().
Therefore, we can stop passing the "now" parameter down through the
date functions, since nobody uses them. Note that we do need to make
sure all of the previous callers that took a "now" parameter are
correctly using get_time().
Signed-off-by: Stephen P. Smith <ischis2@cox.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-gui learned to revert selected lines and hunks, just like it can
stage selected lines and hunks. To provide a safety net for accidental
revert, the most recent revert can be undone.
* py/revert-hunks-lines:
git-gui: allow undoing last revert
git-gui: return early when patch fails to apply
git-gui: allow reverting selected hunk
git-gui: allow reverting selected lines
git-gui learned to switch focus between widgets "unstaged commits",
"staged commits", "diff", and "commit message" using the keyboard
shortcuts Alt+1, Alt+2, Alt+3, and Alt+4 respectively.
* bp/widget-focus-hotkeys:
git-gui: add hotkeys to set widget focus
The user cannot change focus between the list of files, the diff view and
the commit message widgets without using the mouse (clicking either of
the four widgets).
With this patch, the user may set ui focus to the previously selected path
in either the "Unstaged Changes" or "Staged Changes" widgets, using
ALT+1 or ALT+2.
The user may also set the ui focus to the diff view widget with
ALT+3, or to the commit message widget with ALT+4.
This enables the user to select/unselect files, view the diff and create a
commit in git-gui using keyboard-only.
Signed-off-by: Birger Skogeng Pedersen <birger.sp@gmail.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
The cache-tree datastructure is used to speed up the comparison
between the HEAD and the index, and when the index is updated by
a cherry-pick (for example), a tree object that would represent
the paths in the index in a directory is constructed in-core, to
see if such a tree object exists already in the object store.
When the lazy-fetch mechanism was introduced, we converted this
"does the tree exist?" check into an "if it does not, and if we
lazily cloned, see if the remote has it" call by mistake. Since
the whole point of this check is to repair the cache-tree by
recording an already existing tree object opportunistically, we
shouldn't even try to fetch one from the remote.
Pass the OBJECT_INFO_SKIP_FETCH_OBJECT flag to make sure we only
check for existence in the local object store without triggering the
lazy fetch mechanism.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
[jc: rewritten the proposed log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git am" based backend of "git rebase" ignored the result of
updating ".gitattributes" done in one step when replaying
subsequent steps.
* bc/reread-attributes-during-rebase:
am: reload .gitattributes after patching it
path: add a function to check for path suffix
"for-each-ref" and friends that shows refs did not protect themselves
against ancient tags that did not record tagger names when asked to
show "%(taggername)", which have been corrected.
* mp/for-each-ref-missing-name-or-email:
ref-filter: initialize empty name or email fields
On-demand object fetching in lazy clone incorrectly tried to fetch
commits from submodule projects, while still working in the
superproject, which has been corrected.
* jt/diff-lazy-fetch-submodule-fix:
diff: skip GITLINK when lazy fetching missing objs
"git fetch" learned "--set-upstream" option to help those who first
clone from their private fork they intend to push to, add the true
upstream via "git remote add" and then "git fetch" from it.
* cb/fetch-set-upstream:
pull, fetch: add --set-upstream option
"git archive" recorded incorrect length in extended pax header in
some corner cases, which has been corrected.
* rs/pax-extended-header-length-fix:
archive-tar: turn length miscalculation warning into BUG
archive-tar: use size_t in strbuf_append_ext_header()
archive-tar: fix pax extended header length calculation
archive-tar: report wrong pax extended header length
We promoted the "indent heuristics" that decides where to split
diff hunks from experimental to the default a few years ago, but
some stale documentation still marked it as experimental, which has
been corrected.
* sg/diff-indent-heuristic-non-experimental:
diff: 'diff.indentHeuristic' is no longer experimental
A mechanism to affect the default setting for a (related) group of
configuration variables is introduced.
* ds/feature-macros:
repo-settings: create feature.experimental setting
repo-settings: create feature.manyFiles setting
repo-settings: parse core.untrackedCache
commit-graph: turn on commit-graph by default
t6501: use 'git gc' in quiet mode
repo-settings: consolidate some config settings
The command line parser learned "--end-of-options" notation; the
standard convention for scripters to have hardcoded set of options
first on the command line, and force the command to treat end-user
input as non-options, has been to use "--" as the delimiter, but
that would not work for commands that use "--" as a delimiter
between revs and pathspec.
* jk/eoo:
gitcli: document --end-of-options
parse-options: allow --end-of-options as a synonym for "--"
revision: allow --end-of-options to end option parsing
Further clean-up of the initialization code.
* jk/repo-init-cleanup:
config: stop checking whether the_repository is NULL
common-main: delay trace2 initialization
t1309: use short branch name in includeIf.onbranch test
18547aacf5 ("grep/pcre: support utf-8", 2016-06-25) that was released
with git 2.10 added the PCRE_UTF8 flag to PCRE1 matching including a
call to has_non_ascii() to try to avoid breakage if there was non-utf8
encoded content in the haystack.
Usually PCRE is compiled with JIT support (even if is not the default),
and therefore the codepath used includes calling pcre_jit_exec, which
skips UTF-8 validation by design (which might result in crashes or hangs)
but when JIT support wasn't compiled we use pcre_exec instead with the
posibility that grep might be aborted if invalid UTF-8 is found in the
haystack.
PCRE1 provides a flag since Mar 5, 2007 that could be used to skip the
checks explicitly so use that to make both codepaths equivalent (the
flag is ignored by pcre1_jit_exec)
this fix is only implemented for PCRE1 because PCRE2 is likely to have
a better solution (without the risks) instead in the future
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Load a default set of ref name decorations at the first lookup. This
frees direct and indirect callers from doing so. They can still do it
if they want to use a filter or are interested in full decorations
instead of the default short ones -- the first load_ref_decorations()
call wins.
This means that the load in builtin/log.c::cmd_log_init_finish() is
respected even if --simplify-by-decoration is given, as the previously
dominating earlier load in handle_revision_opt() is gone. So a filter
given with --decorate-refs-exclude is used for simplification in that
case, as expected.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Demonstrate that a decoration filter given with --decorate-refs-exclude
is inadvertently overruled by --simplify-by-decoration.
Reported-by: Étienne SERVAIS <etienne.servais@voucoux.fr>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This paragraph uses a lot of +pluses+ to render text as monospace. That
works fine with AsciiDoc (8.6.10), and almost fine with Asciidoctor
(1.5.5), which renders the third of these literally ("+$projname+"). The
reason seems to be that Asciidoctor trips on the lone plus a bit
earlier, even though it is escaped.
Switch +$projname+ to `$projname`, and change the next, similar instance
too (+$projname/+), because otherwise, we'd trip on /that one/ instead.
If we would stop there, we would now start falling over on the escaped
plus ('\+') mentioned earlier, rendering /it/ literally. So change that
too...
In other words, unescape the lone '+' and change all the pluses that
follow it to backticks.
AsciiDoc renders this paragraph identically before and after this
commit, and Asciidoctor now renders this the same as AsciiDoc.
I did try to switch the whole paragraph to using backticks rather than
pluses. That worked great with Asciidoctor, but confused AsciiDoc...
Let's go with this rather surgical change instead.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The example output of `git merge-index` has been enriched by a second
"column" of helpful comments. When Asciidoctor renders this, the cells
in that second column aren't aligned.
Fix this by marking the example shell session as a code listing by
wrapping it in "----". Also drop some of the horizontal space between
the two columns so that we fit into 80 columns. This changes the
rendering with both AsciiDoc and Asciidoctor. They now render this
identically, nicely aligned, and within 80 columns.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The indented lines in the example shell script listing are indented
differently by AsciiDoc and Asciidoctor.
Fix this by marking the example shell script as a code listing by
wrapping it in "----". Because this gives us some extra indentation, we
can remove the one that we have been carrying explicitly. That is, drop
the first tab of indentation on each line. For consistency, make the
same change to the short example shell session further down.
With AsciiDoc, this results in identical rendering before and after this
commit. Asciidoctor now renders this the same as AsciiDoc does.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The second "column" in the output of `git ls-remote` is typeset
differently by AsciiDoc and Asciidoctor, similar to various examples
touched by the last few commits.
Fix this by marking the example shell session as a code listing by
wrapping it in "----". Because this gives us some extra indentation, we
can remove the one that we have been carrying explicitly. That is, drop
the first tab of indentation on each line. With AsciiDoc, this results
in identical rendering before and after this commit. Asciidoctor now
renders this the same as AsciiDoc does.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The indented lines in these example config-file listings are indented
differently by AsciiDoc and Asciidoctor.
Fix this by marking the example config-files as code listings by
wrapping them in "----". Because this gives us some extra indentation,
we can remove the one that we have been carrying explicitly. That is,
drop the first tab of indentation on each line.
With AsciiDoc, this results in identical rendering before and after this
commit. Asciidoctor now renders this the same as AsciiDoc does.
git-config.txt pretty consistently uses twelve dashes rather than the
minimum four to spell "----". Let's stick to the file-local convention
there.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are several graphs in this document. For most of them, we use a
single leading tab to indent the whole graph, and then we use spaces
(possibly eight or more) to align things within the graph.
In the larger graph, we use a different strategy: We use 1-N tabs and
just a small number of spaces (<8). This is how we usually prefer to do
our indenting, but Asciidoctor ends up rendering this differently from
AsciiDoc. Same thing for the if-then-fi examples where the conditional
code is indented by two tabs, which renders differently under AsciiDoc
and Asciidoctor.
Similar to 379805051d ("Documentation: render revisions correctly under
Asciidoctor", 2018-05-06), use an explicit literal block to indicate
that we want to keep the leading whitespace in the tables. Change not
just the ones that render differently, but all of them for consistency.
Because this gives us some extra indentation, we can remove the one that
we have been carrying explicitly. That is, drop the first tab of
indentation on each line. With AsciiDoc, this results in identical
rendering before and after this commit, both for git-merge-base.1 and
git-merge-base.html.
A less intrusive change would be to replace tabs 2-N on each line with
eight spaces. But let's follow the example set by 379805051d, so that we
can use our preferred way of indenting.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for each of these options contains a list. After the
list, AsciiDoc interprets the continuation as a continuation of the
*list*, not as a continution of the larger block. As a result, we get
too much indentation. Wrap the entire blocks in "--" to fix this. With
Asciidoctor, this commit is a no-op, and the two programs now render
these identically.
These two files share the same problem and indeed, they both document
`--untracked-files` in quite similar ways. I haven't checked to what
extent that is intentional or warranted, and to what extent they have
simply drifted apart. I consider such an investigation and possible
cleanup as out of scope for this commit and this patch series.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph tool may read a lot of commits, but it only cares about
parsing their metadata (parents, trees, etc) and doesn't ever show the
messages to the user. And so it should not need save_commit_buffer,
which is meant for holding onto the object data of parsed commits so
that we can show them later. In fact, it's quite harmful to do so.
According to massif, the max heap of "git commit-graph write
--reachable" in linux.git before/after this patch (removing the commit
graph file in between) goes from ~1.1GB to ~270MB.
Which isn't surprising, since the difference is about the sum of the
uncompressed sizes of all commits in the repository, and this was
equivalent to leaking them.
This obviously helps if you're under memory pressure, but even without
it, things go faster. My before/after times for that command (without
massif) went from 12.521s to 11.874s, a speedup of ~5%.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 49bbc57a57 (commit-graph write: emit a percentage for all
progress, 2019-01-19) was a bit overeager when it added progress
percentages to the "Expanding reachable commits in commit graph" phase
as well, because most of the time the number of commits that phase has
to iterate over is not known in advance and grows significantly, and,
consequently, we end up with nonsensical numbers:
$ git commit-graph write --reachable
Expanding reachable commits in commit graph: 138606% (824706/595), done.
[...]
$ git rev-parse v5.0 | git commit-graph write --stdin-commits
Expanding reachable commits in commit graph: 81264400% (812644/1), done.
[...]
Even worse, because the percentage grows so quickly, the progress code
outputs much more often than it should (because it ticks every second,
or every 1%), slowing the whole process down. My time for "git
commit-graph write --reachable" on linux.git went from 13.463s to
12.521s with this patch, ~7% savings.
Therefore, don't show progress percentages in the "Expanding reachable
commits in commit graph" phase.
Note that the current code does sometimes do the right thing, if we
picked up all commits initially (e.g., omitting "--reachable" in a
fully-packed repository would get the correct count without any parent
traversal). So it may be possible to come up with a way to tell when we
could use a percentage here. But in the meantime, let's make sure we
robustly avoid printing nonsense.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply similar treatment as in the previous commit to handle an unchecked
call to 'get_commit_tree_oid()'. Previously, a NULL return value from
this function would be immediately dereferenced with '->hash', and then
cause a segfault.
Before dereferencing to access the 'hash' member, check the return value
of 'get_commit_tree_oid()' to make sure that it is not NULL.
To make this check correct, a related change is also needed in
'commit.c', which is to check the return value of 'get_commit_tree'
before taking its address. If 'get_commit_tree' returns NULL, we
encounter an undefined behavior when taking the address of the return
value of 'get_commit_tree' and then taking '->object.oid'. (On my system,
this is memory address 0x8, which is obviously wrong).
Fix this by making sure that 'get_commit_tree' returns something
non-NULL before digging through a structure that is not there, thus
preventing a segfault down the line in the commit graph code.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To write a commit graph chunk, 'write_graph_chunk_data()' takes a list
of commits to write and parses each one before writing the necessary
data, and continuing on to the next commit in the list.
Since the majority of these commits are not parsed ahead of time (an
exception is made for the *last* commit in the list, which is parsed
early within 'copy_oids_to_commits'), it is possible that calling
'parse_commit_no_graph()' on them may return an error. Failing to catch
these errors before de-referencing later calls can result in a undefined
memory access and a SIGSEGV.
One such example of this is 'get_commit_tree_oid()', which expects a
parsed object as its input (in this case, the commit-graph code passes
'*list'). If '*list' causes a parse error, the subsequent call will
fail.
Prevent such an issue by checking the return value of
'parse_commit_no_graph()' to avoid passing an unparsed object to a
function which expects a parsed object, thus preventing a segfault.
It is worth noting that this fix is really skirting around the issue in
object.c's 'parse_object()', which makes it difficult to tell how
corrupt an object is without digging into it. Presumably one could
change the meaning of 'parse_object' returns, but this would require
adjusting each callsite accordingly. Instead of that, add an additional
check to the object parsed.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When invoking 'git commit-graph' in a corrupt repository, one can cause
a segfault when ancestral commits are corrupt in one way or another.
This is due to two function calls in the 'commit-graph.c' code that may
return NULL, but are not checked for NULL-ness before dereferencing.
Before fixing the bug, introduce two failing tests that demonstrate the
problem. The first test corrupts an ancestral commit's parent to point
to a non-existent object. The second test instead corrupts an ancestral
tree by removing the 'tree' information entirely from the commit. Both
of these cases cause segfaults, each at different lines.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When doing 'git rebase --autostash <upstream> <master>' with a dirty worktree
a 'HEAD is now at ...' message is emitted, which is pointless as it refers to
the old active branch which isn't actually moved.
This commit removes the 'HEAD is now at...' message.
Signed-off-by: Ben Wijen <ben@wijen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Consider the following scenario:
git checkout not-the-master
work work work
git rebase --autostash upstream master
Here 'rebase --autostash <upstream> <branch>' incorrectly moves the
active branch (not-the-master) to master (before the rebase).
The expected behavior: (58794775:/git-rebase.sh:526)
AUTOSTASH=$(git stash create autostash)
git reset --hard
git checkout master
git rebase upstream
git stash apply $AUTOSTASH
The actual behavior: (6defce2b:/builtin/rebase.c:1062)
AUTOSTASH=$(git stash create autostash)
git reset --hard master
git checkout master
git rebase upstream
git stash apply $AUTOSTASH
This commit reinstates the 'legacy script' behavior as introduced with
58794775: rebase: implement --[no-]autostash and rebase.autostash
Signed-off-by: Ben Wijen <ben@wijen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In many test scripts, there are bespoke definitions of the single quote
that are some variation of this:
SQ="'"
Define a common $SQ variable in test-lib.sh and replace all usages of
these bespoke variables with the common one.
This change was done by running `git grep =\"\'\" t/` and
`git grep =\\\\\'` and manually changing the resulting definitions and
corresponding usages.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time, the code to add an object to our packing list in
pack-objects all lived in a single function. It computed the position
within the hash table once, then used it to check if the object was
already present, and if not, to add it.
Later, in 2834bc27c1 (pack-objects: refactor the packing list,
2013-10-24), this was split into two functions: packlist_find() and
packlist_alloc(). We ended up with an "index_pos" variable that gets
passed through several functions to make it from one to the other.
The resulting code is rather confusing to follow. The "index_pos"
variable is sometimes undefined, if we don't yet have a hash table. This
works out in practice because in that case packlist_alloc() won't use it
at all, since it will have to create/grow the hash table. But it's hard
to verify that, and it does cause gcc 9.2.1's -Wmaybe-uninitialized to
complain when compiled with "-flto -O3" (rightfully, since we do pass
the uninitialized value as a function parameter, even if nobody ends up
using it).
All of this is to save computing the hash index again when we're
inserting into the hash table, which I found doesn't make a measurable
difference in the program runtime (which is not surprising, since we're
doing all kinds of other heavyweight things for each object).
Let's just drop this index_pos variable entirely, simplifying the code
(and pleasing the compiler).
We might be better still refactoring this custom hash table to use one
of our existing implementations (an oidmap, or a kh_oid_map). I stopped
short of that here, but this would be the likely first step towards that
anyway.
Reported-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Early in the function we set "namelen = strlen(name)" if "name" is
non-NULL. Later, we use "namelen" only if "name" is non-NULL. However,
it's hard to immediately see this, and it seems to confuse gcc 9.2.1
(with "-flto" interestingly, though all of the involved logic is in
inline functions; it also triggers when building with ASan).
Let's simplify the code and remove the variable entirely. There's only
one use of namelen anyway, so we can just call strlen() then. It's true
this is in a loop, so we might execute strlen() more often. But:
- this is test code that only ever loops twice in our test suite (we
do loop 1000 times in a t/perf test, but without using this option).
- a decent compiler ought to be able to hoist that out of the loop
anyway (though I wouldn't count on gcc 9.2.1 doing so!)
Reported-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we cannot generate a delta, we return NULL but leave delta_size
untouched. This is generally OK, as callers rely on NULL to decide if
the output is usable or not. But it can confuse compilers; in
particular, gcc 9.2.1 with "-flto -O3" complains in fast-import's
store_object() that delta_len may be used uninitialized.
Let's change the diff-delta code to set the size explicitly to 0 for a
NULL return. That silences the compiler and makes it easier to reason
about the result.
Reported-by: Stephan Beyer <s-beyer@gmx.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We declare a "struct hashfile_checkpoint" but only sometimes actually
call hashfile_checkpoint() on it. That makes it not immediately obvious
that it's valid when we later access its members.
In fact, the code is fine: we fill it in unconditionally in the while(1)
loop as long as "idx" is non-NULL. And then if "idx" is NULL, we exit
early from the function (because we're just computing the hash, not
actually writing), before we look at the struct.
However, this does seem to confuse gcc 9.2.1's -Wmaybe-uninitialized
when compiled with "-flto -O2" (probably because with LTO it can now
realize that our call to hashfile_truncate() does not set the members
either). Let's zero-initialize the struct to tell the compiler, as well
as any readers of the code, that all is well.
Reported-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only caller of packlist_alloc() already has a "struct object_id",
and we immediately copy the hash they pass us into our own object_id.
Let's avoid the unnecessary round-trip to a raw sha1 pointer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We try to parse the "author" line out of a commit buffer. We handle the
case that split_ident_line() doesn't work, but we don't do any error
checking that we found an "author" line in the first place! This would
cause us to segfault on such a corrupt object.
Let's put in an explicit NULL check (we can just die(), which is what a
bogus split would do, too). As a bonus, this silences a warning when
compiling with gcc 9.2.1 using "-flto -O3", which claims that ident_len
may be uninitialized (it would only be if we had a NULL here).
Reported-by: Stephan Beyer <s-beyer@gmx.net>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time GIT_TEST_HTTPD was a tristate variable and we
exported 'GIT_TEST_HTTPD=YesPlease' in our CI scripts to make sure
that we run the httpd tests in the Linux Clang and GCC build jobs, or
error out if they can't be run for any reason [1].
Then 3b072c577b (tests: replace test_tristate with "git env--helper",
2019-06-21) came along, turned GIT_TEST_HTTPD into a bool, but forgot
to update our CI scripts accordingly. So, since GIT_TEST_HTTPD is set
explicitly, but its value is not one of the standardized true values,
our CI jobs have been simply skipping the httpd tests in the last
couple of weeks.
Set 'GIT_TEST_HTTPD=true' to restore running httpd tests in our CI
jobs.
[1] a1157b76eb (travis-ci: set GIT_TEST_HTTPD in 'ci/lib-travisci.sh',
2017-12-12)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time the GIT_SVN_TEST_HTTPD environment variable needed to
be set to enable SVN HTTP tests [1].
Then 3b072c577b (tests: replace test_tristate with "git env--helper",
2019-06-21) came along, and attempted to turn GIT_SVN_TEST_HTTPD into
a bool, but while doing so it mistyped the variable name, and started
to check GIT_TEST_HTTPD instead. Consequently, if someone explicitly
set GIT_TEST_HTTPD to true and has only the general 'git-svn'
dependencies installed but not the Subversion server modules for
Apache (libapache2-mod-svn), then a couple of 'git-svn' tests fail,
because they can't start httpd due to the missing module.
We could simply fix this by checking the GIT_SVN_TEST_HTTPD
variablewith 'git env--helper', but notice that the name of this
variable doesn't conform to our usual GIT_TEST_* convention.
So let's check the GIT_TEST_SVN_HTTPD instead.
[1] a8a5d25118 (git svn: migrate tests to use lib-httpd, 2016-07-23)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid derefencing ->tagged without checking for NULL by using the
convenience wrapper for getting the ID of the tagged object. It die()s
when encountering a broken tag instead of segfaulting.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function for accessing the ID of the object referenced by a tag
safely, i.e. without causing a segfault when encountering a broken tag
where ->tagged is NULL.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list' makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!
Now that this library is renamed to use 'struct pattern_list'
and 'struct pattern', we can now rename the method used by
the sparse-checkout feature to determine which paths should
appear in the working directory.
The method is_excluded_from_list() is only used by the
sparse-checkout logic in unpack-trees and list-objects-filter.
The confusing part is that it returned 1 for "excluded" (i.e.
it matches the list of exclusions) but that really manes that
the path matched the list of patterns for _inclusion_ in the
working directory.
Rename the method to be path_matches_pattern_list() and have
it return an explicit 'enum pattern_match_result'. Here, the
values MATCHED = 1, UNMATCHED = 0, and UNDECIDED = -1 agree
with the previous integer values. This shift allows future
consumers to better understand what the retur values mean,
and provides more type checking for handling those values.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list' makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!
It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.
This commit renames several methods defined in dir.h to make
more sense with the renamed 'struct exclude_list' to 'struct
pattern_list' and 'struct exclude' to 'struct path_pattern':
* last_exclude_matching() -> last_matching_pattern()
* parse_exclude() -> parse_path_pattern()
In addition, the word 'exclude' was replaced with 'pattern'
in the methods below:
* add_exclude_list()
* add_excludes_from_file_to_list()
* add_excludes_from_file()
* add_excludes_from_blob_to_list()
* add_exclude()
* clear_exclude_list()
A few methods with the word "exclude" remain. These will
be handled seperately. In particular, the method
"is_excluded()" is concretely about the .gitignore file
relative to a specific directory. This is the important
boundary between library and consumer: is_excluded() cares
about .gitignore, but is_excluded() calls
last_matching_pattern() to make that decision.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list' makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!
It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.
This commit replaces 'EXCL_FLAG_' to 'PATTERN_FLAG_' in the
names of the flags used on 'struct path_pattern'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list' makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!
It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.
This commit renames 'struct exclude_list' to 'struct pattern_list'
and renames several variables called 'el' to 'pl'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a list of 'struct exclude' items makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!
It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.
This commit renames 'struct exclude' to 'struct path_pattern'
and renames several variable names to match. 'struct pattern'
was already taken by attr.c, and this more completely describes
that the patterns are specific to file paths.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t9902 had a list of three random porcelain commands as a sanity check,
one of which was filter-branch. Since we are recommending people not
use filter-branch, let's update this test to use rebase instead of
filter-branch.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
filter-branch suffers from a deluge of disguised dangers that disfigure
history rewrites (i.e. deviate from the deliberate changes). Many of
these problems are unobtrusive and can easily go undiscovered until the
new repository is in use. This can result in problems ranging from an
even messier history than what led folks to filter-branch in the first
place, to data loss or corruption. These issues cannot be backward
compatibly fixed, so add a warning to both filter-branch and its manpage
recommending that another tool (such as filter-repo) be used instead.
Also, update other manpages that referenced filter-branch. Several of
these needed updates even if we could continue recommending
filter-branch, either due to implying that something was unique to
filter-branch when it applied more generally to all history rewriting
tools (e.g. BFG, reposurgeon, fast-import, filter-repo), or because
something about filter-branch was used as an example despite other more
commonly known examples now existing. Reword these sections to fix
these issues and to avoid recommending filter-branch.
Finally, remove the section explaining BFG Repo Cleaner as an
alternative to filter-branch. I feel somewhat bad about this,
especially since I feel like I learned so much from BFG that I put to
good use in filter-repo (which is much more than I can say for
filter-branch), but keeping that section presented a few problems:
* In order to recommend that people quit using filter-branch, we need
to provide them a recomendation for something else to use that
can handle all the same types of rewrites. To my knowledge,
filter-repo is the only such tool. So it needs to be mentioned.
* I don't want to give conflicting recommendations to users
* If we recommend two tools, we shouldn't expect users to learn both
and pick which one to use; we should explain which problems one
can solve that the other can't or when one is much faster than
the other.
* BFG and filter-repo have similar performance
* All filtering types that BFG can do, filter-repo can also do. In
fact, filter-repo comes with a reimplementation of BFG named
bfg-ish which provides the same user-interface as BFG but with
several bugfixes and new features that are hard to implement in
BFG due to its technical underpinnings.
While I could still mention both tools, it seems like I would need to
provide some kind of comparison and I would ultimately just say that
filter-repo can do everything BFG can, so ultimately it seems that it
is just better to remove that section altogether.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test t6006.71 ("oneline with empty message") was creating two commits
with simple commit messages, and then running filter-branch to rewrite
the commit messages to be "empty". This test was introduced in commit
1fb5fdd25f ("rev-list: fix --pretty=oneline with empty message",
2010-03-21) and written this way because the --allow-empty-message
option to git commit did not exist at the time.
However, the filter-branch invocation used differed slightly from
--allow-empty-message in that it would have a commit message consisting
solely of a single newline, and as such was not testing what the
original commit intended to test. Since both a truly empty commit
message and a commit message with a single linefeed could trigger the
original bug, modify the test slightly to include an example of each.
Despite only being one piece of the 71st test and there being 73 tests
overall, this small change to just this one test speeds up the overall
execution time of t6006 (as measured by the best of 3 runs of `time
./t6006-rev-list-format.sh`) by about 11% on Linux, 13% on Mac, and
about 15% on Windows.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git-format-patch.txt, we were missing some key user information.
First of all, document the special value of `--base=auto`.
Next, while we're at it, surround option arguments with <> and change
existing names such as "Message-Id" to "message id", which conforms with
how existing documentation is written.
Finally, document the `format.outputDirectory` config and change
`format.coverletter` to use camel case.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, there are two ways where the return codes of Git commands are
lost. The first way is when a command is in the upstream of a pipe. In a
pipe, only the return code of the last command is used. Thus, all other
commands will have their return codes masked. Rewrite pipes so that
there are no Git commands upstream.
The other way is when a command is in a non-assignment subshell. The
return code will be lost in favour of the surrounding command's. Rewrite
instances of this such that Git commands output to a file and
surrounding commands only call subshells with non-Git commands.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In check_threading(), there was a Git command in the upstream of a pipe.
In order to not lose its status code, it was saved into a file. However,
this may be confusing so rewrite to redirect IO to file. This allows us
to directly use the conventional &&-chain.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert all instances of `cnt=$(... | wc -l) && test $cnt = N` into uses
of `test_line_count()`.
While we're at it, convert one instance of a Git command upstream of a
pipe into two commands. This prevents a failure of a Git command from
being masked since only the return code of the last member of the pipe
is shown.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases, we were using a redirection operator to feed input into
sed. However, since sed is capable of opening its own files, make sed
open its own files instead of redirecting input into it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since output is silenced when running without `-v` and debugging output
is useful with `-v`, remove redirections to /dev/null as it is not
useful.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The convention is to use indentable here-docs within test cases so that
the here-docs line up with the rest of the code within the test case.
Change here-docs from `<<\EOF` to `<<-\EOF` so that they can be indented
along with the rest of the test case.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For shell scripts, the usual convention is for there to be no space
after redirection operators, (e.g. `>file`, not `> file`). Remove these
spaces wherever they appear.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The usual convention is for test case names to be written between
single-quotes. Change all double-quoted test case names to single-quotes
except for one test case name that uses a sq for a contraction.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The usual convention for test cases is for the closing sq to be on its
own line. Move the sq onto its own line for cases that do not conform to
this style.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For test cases, the usual convention is to name expected output files
"expect", not "expected". Replace all instances of "expected" with
"expect", except for one case where the "expected" is used as the name
of a test case.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few test scripts assign a single LF to $LF, but that is already
given by test-lib.sh to everybody.
Remove the unnecessary reassignment.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 554544276a (*.[ch]: remove extern from function declarations using
spatch, 2019-04-29), we removed externs from function declarations using
spatch but we intentionally excluded files under compat/ since some are
directly copied from an upstream and we should avoid churning them so
that manually merging future updates will be simpler.
In the last commit, we determined the files which taken from an upstream
so we can exclude them and run spatch on the remainder.
This was the Coccinelle patch used:
@@
type T;
identifier f;
@@
- extern
T f(...);
and it was run with:
$ git ls-files compat/\*\*.{c,h} |
xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place
$ git checkout -- \
compat/regex/ \
compat/inet_ntop.c \
compat/inet_pton.c \
compat/nedmalloc/ \
compat/obstack.{c,h} \
compat/poll/
Coccinelle has some trouble dealing with `__attribute__` and varargs so
we ran the following to ensure that no remaining changes were left
behind:
$ git ls-files compat/\*\*.{c,h} |
xargs sed -i'' -e 's/^\(\s*\)extern \([^(]*([^*]\)/\1\2/'
$ git checkout -- \
compat/regex/ \
compat/inet_ntop.c \
compat/inet_pton.c \
compat/nedmalloc/ \
compat/obstack.{c,h} \
compat/poll/
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After running Coccinelle on all sources inside compat/ that were created
by us[1], it was found that compat/mingw.c violated an array.cocci rule
in two places and, thus, a patch was generated. Apply this patch so that
all compat/ sources created by us follows all cocci rules.
[1]: Do not run Coccinelle on files that are taken from some upstream
because in case we need to pull updates from them, we would like to have
diverged as little as possible in order to make merging updates simpler.
The following sources were determined to have been taken from some
upstream:
* compat/regex/
* compat/inet_ntop.c
* compat/inet_pton.c
* compat/nedmalloc/
* compat/obstack.{c,h}
* compat/poll/
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fast-export and fast-import can easily handle the simple rewrite that
was being done by filter-branch, and should be faster on systems with a
slow fork. Measuring the overall time taken for all of t3427 (not just
the difference between filter-branch and fast-export/fast-import) shows
a speedup of about 5% on Linux and 11% on Mac.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, it is possible to embed additional metadata into an
executable by linking in a "manifest", i.e. an XML document that
describes capabilities and requirements (such as minimum or maximum
Windows version). These XML documents are expected to be stored in
`.manifest` files.
At least _some_ Visual Studio versions auto-generate `.manifest` files
when none is specified explicitly, therefore we used to ask Git to
ignore them.
However, we do have a beautiful `.manifest` file now:
`compat/win32/git.manifest`, so neither does Visual Studio auto-generate
a manifest for us, nor do we want Git to ignore the `.manifest` files
anymore.
Further reading on auto-generated `.manifest` files:
https://docs.microsoft.com/en-us/cpp/build/manifest-generation-in-visual-studio
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When applying multiple patches with git am, or when rebasing using the
am backend, it's possible that one of our patches has updated a
gitattributes file. Currently, we cache this information, so if a
file in a subsequent patch has attributes applied, the file will be
written out with the attributes in place as of the time we started the
rebase or am operation, not with the attributes applied by the previous
patch. This problem does not occur when using the -m or -i flags to
rebase.
To ensure we write the correct data into the working tree, expire the
cache after each patch that touches a path ending in ".gitattributes".
Since we load these attributes in multiple separate files, we must
expire them accordingly.
Verify that both the am and rebase code paths work correctly, including
the conflict marker size with am -3.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reduce code duplication by turning parse_tree_indirect() into a wrapper
of repo_peel_to_type(). This avoids a segfault when handling a broken
tag where ->tagged is NULL. The new version also checks the return
value of parse_object() that was ignored by the old one.
Initial-patch-by: Stefan Sperling <stsp@stsp.name>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph feature is now on by default, and is being
written during 'git gc' by default. Typically, Git only writes
a commit-graph when a 'git gc --auto' command passes the gc.auto
setting to actualy do work. This means that a commit-graph will
typically fall behind the commits that are being used every day.
To stay updated with the latest commits, add a step to 'git
fetch' to write a commit-graph after fetching new objects. The
fetch.writeCommitGraph config setting enables writing a split
commit-graph, so on average the cost of writing this file is
very small. Occasionally, the commit-graph chain will collapse
to a single level, and this could be slow for very large repos.
For additional use, adjust the default to be true when
feature.experimental is enabled.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pushes with --all, or refspecs are disallowed when --mirror is given
to 'git push', or when 'remote.<name>.mirror' is set in the config of
the repository, because they can have surprising
effects. 800a4ab399 ("push: check for errors earlier", 2018-05-16)
refactored this code to do that check earlier, so we can explicitly
check for the presence of flags, instead of their sideeffects.
However when 'remote.<name>.mirror' is set in the config, the
TRANSPORT_PUSH_MIRROR flag would only be set after we calling
'do_push()', so the checks would miss it entirely.
This leads to surprises for users [*1*].
Fix this by making sure we set the flag (if appropriate) before
checking for compatibility of the various options.
*1*: https://twitter.com/FiloSottile/status/1163918701462249472
Reported-by: Filippo Valsorda <filippo@ml.filippo.io>
Helped-by: Saleem Rashid
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As discovered on the mailing list, some of the descriptions of the
ff-related options were unclear. Try to be more precise with what these
options do.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Announce that calling help_unknown_ref() exits the program.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git switch' command was created to separate half of the
behavior of 'git checkout'. It specifically has the mode to
do nothing with the index and working directory if the user
only specifies to create a new branch and change HEAD to that
branch. This is also the behavior most users expect from
'git checkout -b', but for historical reasons it also performs
an index update by scanning the working directory. This can be
slow for even moderately-sized repos.
A performance fix for 'git checkout -b' was introduced by
fa655d8411 (checkout: optimize "git checkout -b <new_branch>"
2018-08-16). That change includes details about the config
setting checkout.optimizeNewBranch when the sparse-checkout
feature is required. The way this change detected if this
behavior change is safe was through the skip_merge_working_tree()
method. This method was complex and needed to be updated
as new options were introduced.
This behavior was essentially reverted by 65f099b ("switch:
no worktree status unless real branch switch happens"
2019-03-29). Instead, two members of the checkout_opts struct
were used to distinguish between 'git checkout' and 'git switch':
* switch_branch_doing_nothing_is_ok
* only_merge_on_switching_branches
These settings have opposite values depending on if we start
in cmd_checkout or cmd_switch.
The message for 64f099b includes "Users of big repos are
encouraged to move to switch." Making this change while
'git switch' is still experimental is too aggressive.
Create a happy medium between these two options by making
'git checkout -b <branch>' behave just like 'git switch',
but only if we read exactly those arguments. This must
be done in cmd_checkout to avoid the arguments being
consumed by the option parsing logic.
This differs from the previous change by fa644d8 in that
the config option checkout.optimizeNewBranch remains
deleted. This means that 'git checkout -b' will ignore
the index merge even if we have a sparse-checkout file.
While this is a behavior change for 'git checkout -b',
it matches the behavior of 'git switch -c'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Elijah Newren <newren@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes gitk look for http or https URLs in the commit description
and make the URLs clickable. Clicking on them will invoke an external
web browser with the URL.
The web browser command is by default "xdg-open" on Linux, "open" on
MacOS, and "cmd /c start" on Windows. The command can be changed in
the preferences window, and it can include parameters as well as the
command name. If it is set to the empty string then URLs will no
longer be made clickable.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Accidental clicks on the revert hunk/lines buttons can cause loss of
work, and can be frustrating. So, allow undoing the last revert.
Right now, a stack or deque are not being used for the sake of
simplicity, so only one undo is possible. Any reverts before the
previous one are lost.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
A common scenario is if a user is working on a topic branch and they
wish to make some changes to intermediate commits or autosquash, they
would run something such as
git rebase -i --onto master... master
in order to preserve the merge base. This is useful when contributing a
patch series to the Git mailing list, one often starts on top of the
current 'master'. While developing the patches, 'master' is also
developed further and it is sometimes not the best idea to keep rebasing
on top of 'master', but to keep the base commit as-is.
In addition to this, a user wishing to test individual commits in a
topic branch without changing anything may run
git rebase -x ./test.sh master... master
Since rebasing onto the merge base of the branch and the upstream is
such a common case, introduce the --keep-base option as a shortcut.
This allows us to rewrite the above as
git rebase -i --keep-base master
and
git rebase -x ./test.sh --keep-base master
respectively.
Add tests to ensure --keep-base works correctly in the normal case and
fails when there are multiple merge bases, both in regular and
interactive mode. Also, test to make sure conflicting options cause
rebase to fail. While we're adding test cases, add a missing
set_fake_editor call to 'rebase -i --onto master...side'.
While we're documenting the --keep-base option, change an instance of
"merge-base" to "merge base", which is the consistent spelling.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests rebasing a linear branch topology to linear rebase tests
added in 2aad7cace2 ("add simple tests of consistency across rebase
types", 2013-06-06).
These tests are duplicates of two surrounding tests that do the same
with tags pointing to the same objects. Right now there's no change in
behavior being introduced, but as we'll see in a subsequent change
rebase can have different behaviors when working implicitly with
remote tracking branches.
While I'm at it add a --fork-point test, strictly speaking this is
redundant to the existing '' test, as no argument to rebase implies
--fork-point. But now it's easier to grep for tests that explicitly
stress --fork-point.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, when we rebased with a --fork-point invocation where the
fork-point wasn't empty, we would be setting options.restrict_revision.
The fast-forward logic would automatically declare that the rebase was
not fast-forwardable if it was set. However, this was painting with a
very broad brush.
Refine the logic so that we can fast-forward in the case where the
restricted revision is equal to the merge base, since we stop rebasing
at the merge base anyway.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, when we had the following graph,
A---B---C (master)
\
D (side)
running 'git rebase --onto master... master side' would result in D
being always rebased, no matter what. However, the desired behavior is
that rebase should notice that this is fast-forwardable and do that
instead.
Add detection to `can_fast_forward` so that this case can be detected
and a fast-forward will be performed. First of all, rewrite the function
to use gotos which simplifies the logic. Next, since the
options.upstream &&
!oidcmp(&options.upstream->object.oid, &options.onto->object.oid)
conditions were removed in `cmd_rebase`, we reintroduce a substitute in
`can_fast_forward`. In particular, checking the merge bases of
`upstream` and `head` fixes a failing case in t3416.
The abbreviated graph for t3416 is as follows:
F---G topic
/
A---B---C---D---E master
and the failing command was
git rebase --onto master...topic F topic
Before, Git would see that there was one merge base (C), and the merge
and onto were the same so it would incorrectly return 1, indicating that
we could fast-forward. This would cause the rebased graph to be 'ABCFG'
when we were expecting 'ABCG'.
With the additional logic, we detect that upstream and head's merge base
is F. Since onto isn't F, it means we're not rebasing the full set of
commits from master..topic. Since we're excluding some commits, a
fast-forward cannot be performed and so we correctly return 0.
Add '-f' to test cases that failed as a result of this change because
they were not expecting a fast-forward so that a rebase is forced.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, can_fast_forward was written with an if-else statement. However,
in the future, we may be adding more termination cases which would lead
to deeply nested if statements.
Refactor to use a goto tower so that future cases can be easily
inserted.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add more stress tests for the can_fast_forward() case in
rebase.c. These tests are getting rather verbose, but now we can see
under --ff and --no-ff whether we skip work, or whether we're forced
to run the rebase.
These tests aren't supposed to endorse the status quo, just test for
what we're currently doing.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fast-import's read_next_command() has somewhat odd memory ownership
semantics for the command_buf strbuf. After reading a command, we copy
the strbuf's pointer (without duplicating the string) into our cmd_hist
array of recent commands. And then when we're about to read a new
command, we clear the strbuf by calling strbuf_detach(), dropping
ownership from the strbuf (leaving the cmd_hist reference as the
remaining owner).
This has a few surprising implications:
- if the strbuf hasn't been copied into cmd_hist (e.g., because we
haven't ready any commands yet), then the strbuf_detach() will leak
the resulting string
- any modification to command_buf risks invalidating the pointer held
by cmd_hist. There doesn't seem to be any way to trigger this
currently (since we tend to modify it only by detaching and reading
in a new value), but it's subtly dangerous.
- any pointers into an input string will remain valid as long as
cmd_hist points to them. So in general, you can point into
command_buf.buf and call read_next_command() up to 100 times before
your string is cycled out and freed, leaving you with a dangling
pointer. This makes it easy to miss bugs during testing, as they
might trigger only for a sufficiently large commit (e.g., the bug
fixed in the previous commit).
Instead, let's make a new string to copy the command into the history
array, rather than having dual ownership with the old. Then we can drop
the strbuf_detach() calls entirely, and just reuse the same buffer
within command_buf over and over. We'd normally have to strbuf_reset()
it before using it again, but in both cases here we're using
strbuf_getline(), which does it automatically for us.
This fixes the leak, and it means that even a single call to
read_next_command() will invalidate any held pointers, making it easier
to find bugs. In fact, we can drop the extra input lines added to the
test case by the previous commit, as the unfixed bug would now trigger
just from reading the commit message, even without any modified files in
the commit.
Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We read each line of the fast-import stream into the command_buf strbuf.
When reading a commit, we parse a line like "encoding foo" by storing a
pointer to "foo", but not making a copy. We may then read an unbounded
number of other lines (e.g., one for each modified file in the commit),
each of which writes into command_buf.
This works out in practice for small cases, because we hand off
ownership of the heap buffer from command_buf to the cmd_hist array, and
read new commands into a fresh heap buffer. And thus the pointer to
"foo" remains valid as long as there aren't so many intermediate lines
that we end up dropping the original "encoding" line from the history.
But as the test modification shows, if we go over our default of 100
lines, we end up with our encoding string pointing into freed heap
memory. This seems to fail reliably by writing garbage into the output,
but running under ASan definitely detects this as a use-after-free.
We can fix it by duplicating the encoding value, just as we do for other
parsed lines (e.g., an author line ends up in parse_ident, which copies
it to a new string).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reverting or cherry-picking, one of the options we can pass the
sequencer is `--skip`. However, unlike rebasing, `--skip` is not
mentioned as a possible option in the status message. Mention it so that
users are more aware of their options.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though `--skip` is a valid command-line option for cherry-pick and
revert while they are in progress, it is not completed. Add this missing
option to the completion script.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since revert and cherry-pick share the same sequencer code, they should
both accept the same command-line options. Derive the
`__git_cherry_pick_inprogress_options` and
`__git_revert_inprogress_options` variables from
`__git_sequencer_inprogress_options` so that the options aren't
unnecessarily duplicated twice.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change "same head" introduced in the preceding commit to check whether
the rebase.c code lands in the can_fast_forward() case in, and thus
prints out an "is up to date" and aborts early.
In some of these cases we make it past that and to "rewinding head",
then do a rebase, only to find out there's nothing to change so HEAD
stays at the same OID.
These tests presumed these two cases were the same thing. In terms of
where HEAD ends up they are, but we're not only interested in rebase
semantics, but also whether or not we're needlessly doing work when we
could avoid it entirely.
I'm adding "same" and "diff" here because I'll follow-up and add
--no-ff tests, where some of those will be "diff"-erent, so add the
"diff" code already.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rebase is run on a branch that can be fast-forwarded, this should
automatically be done. Create test to ensure this behavior happens.
There are some cases that currently don't pass. The first case is where
a feature and master have diverged, running
"git rebase master... master" causes a full rebase to happen even though
a fast-forward should happen.
The second case is when we are doing "git rebase --fork-point" and a
fork-point commit is found. Once again, a full rebase happens even
though a fast-forward should happen.
Mark these cases as failure so we can fix it later.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git clean -fd' must not delete an untracked directory if it belongs
to a different Git repository or worktree. Unfortunately, if a
'.gitignore' rule in the outer repository happens to match a file in a
nested repository or worktree, then something goes awry and 'git clean
-fd' does delete the content of the nested repository's worktree
except that ignored file, potentially leading to data loss.
Add a test to 't7300-clean.sh' to demonstrate this breakage.
This issue is a regression introduced in 6b1db43109 (clean: teach
clean -d to preserve ignored paths, 2017-05-23).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code used both a macro and a variable to keep track if JIT
support was desired and relied on the fact that a non JIT
enabled library will ignore a request for JIT compilation
(as defined by the second parameter of the call to pcre_study)
Cleanup the multiple levels of macros used and call pcre_study
with the right parameter after JIT support has been confirmed
and unless it was requested to be disabled with NO_LIBPCRE1_JIT
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
e87de7cab4 ("grep: un-break building with PCRE < 8.32", 2017-05-25)
added a restriction for JIT support that is no longer needed after
pcre_jit_exec() calls were removed.
Reorganize the definitions in grep.h so that JIT support could be
detected early and NO_LIBPCRE1_JIT could be used reliably to enforce
JIT doesn't get used.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let warning() format the message instead of using an intermediate strbuf
for that. This is shorter, easier to read and avoids an allocation.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Append the strbuf buffer only after detaching it. There is no practical
difference here, as the strbuf is not empty and no strbuf_ function is
called between storing the pointer to the still attached buffer and
calling strbuf_detach(), so that pointer is valid, but make sure to
follow the standard sequence anyway for consistency.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf_detach() has been returning a pointer to a buffer even for empty
strbufs since 08ad56f3f0 ("strbuf: always return a non-NULL value from
strbuf_detach", 2012-10-18). Use that feature in show_log() instead of
having it handle empty strbufs specially.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index field in the commit object is used to find the buffer
corresponding to that commit in the buffer_slab. Resetting it first
means free_commit_buffer is not going to free the right buffer.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a function to strip the path suffix from a commit, but we don't
have one to check for a path suffix. For a plain filename, we can use
basename, but that requires an allocation, since POSIX allows it to
modify its argument. Refactor strip_path_suffix into a helper function
and a new function, ends_with_path_components, to meet this need.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cc8fdaee1e (banned.h: mark sprintf() as banned, 2018-07-24), both
'sprintf()' and 'vsprintf()' were marked as banned functions. The
non-variadic macro to ban 'vsprintf' has a typo which says that
'sprintf', not 'vsprintf' is banned. The variadic version does not have
the same typo.
Fix this by updating the explicit form of 'vsprintf' as the banned
version of itself, not 'sprintf'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The note_tree_insert() function may free the leaf_node struct we pass in
(e.g., if it's a duplicate, or if it needs to be combined with an
existing note).
Most callers are happy with this, as they assume that ownership of the
struct is handed off. But in load_subtree(), if we see an error we'll
use the handed-off struct's key_oid to generate the die() message,
potentially accessing freed memory.
We can easily fix this by instead using the original oid that we copied
into the leaf_node struct.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When add_note is called multiple times with the same key/value pair, the
leaf_node it creates is leaked by notes_tree_insert.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If Git were installed in a path containing non-ASCII characters,
commands such as `git am` and `git submodule`, which are implemented as
externals, would fail to launch with the following error:
> fatal: 'am' appears to be a git command, but we were not
> able to execute it. Maybe git-am is broken?
This was due to lookup_prog not being Unicode-aware. It was somehow
missed in 85faec9d3a (Win32: Unicode file name support (except dirent),
2012-03-15).
Note that the only problem in this function was calling
`GetFileAttributes()` instead of `GetFileAttributesW()`. The calls to
`access()` were fine because `access()` is a macro which resolves to
`mingw_access()`, which already handles Unicode correctly. But
`lookup_prog()` was changed to use `_waccess()` directly so that we only
convert the path to UTF-16 once.
To make things work correctly, we have to maintain UTF-8 and UTF-16
versions in tandem in `lookup_prog()`.
Signed-off-by: Adam Roben <adam@roben.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When working in the root directory of a file share (this is only
possible in Git Bash and Powershell, but not in CMD), the current
directory is reported without a trailing slash.
This is different from Unix and standard Windows directories: both / and
C:\ are reported with a trailing slash as current directories.
If a Git worktree is located there, Git is not quite prepared for that:
while it does manage to find the .git directory/file, it returns as
length of the top-level directory's path *one more* than the length of
the current directory, and setup_git_directory_gently() would then
return an undefined string as prefix.
In practice, this undefined string usually points to NUL bytes, and does
not cause much harm. Under rare circumstances that are really involved
to reproduce (and not reliably so), the reported prefix could be a
suffix string of Git's exec path, though.
A careful analysis determined that this bug is unlikely to be
exploitable, therefore we mark this as a regular bug fix.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A very common assumption in Git's source code base is that
offset_1st_component() returns either 0 for relative paths, or 1 for
absolute paths that start with a slash. In other words, the return value
is either 0 or points just after the dir separator.
This assumption is not fulfilled when calling offset_1st_component()
e.g. on UNC paths on Windows, e.g. "//my-server/my-share". In this case,
offset_1st_component() returns the length of the entire string (which is
correct, because stripping the last "component" would not result in a
valid directory), yet the return value still does not point just after a
dir separator.
This assumption is most prominently seen in the
setup_git_directory_gently_1() function, where we want to append a
".git" component and simply assume that there is already a dir
separator. In the UNC example given above, this assumption is incorrect.
As a consequence, Git will fail to handle a worktree at the top of a UNC
share correctly.
Let's fix this by adding a dir separator specifically for that case: we
found that there is no first component in the path and it does not end
in a dir separator? Then add it.
This fixes https://github.com/git-for-windows/git/issues/1320
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the parser to accept file://server/share/repo in the way that
Windows users expect it to be parsed who are used to referring to file
shares by UNC paths of the form \\server\share\folder.
[jes: tightened check to avoid handling file://C:/some/path as a UNC
path.]
This closes https://github.com/git-for-windows/git/issues/1264.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding object IDs, compute them and use those in the
comparison. Note that the comparison code ignores the actual object
IDs, but does check that they're the right size, so computing them is
the easiest way to ensure that they are.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Factor out the hard-coded object IDs and use test_oid to provide values
for both SHA-1 and SHA-256.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use $ZERO_OID instead of hard-coding a fixed size all-zeros object ID.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Abstract away the SHA-1-specific constants by sanitizing diff output to
remove the index lines, since it's clear from the assertions in question
that we are not interested in the specific object IDs.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the procedure apply_or_revert_range_or_line, if the patch does not
apply successfully, a dialog is shown, but execution proceeds after
that. Instead, return early on error so the parts that come after this
don't work on top of an error state.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Just like the user can select lines to stage or unstage, add the
ability to revert selected lines.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
The only transport that does not allow fetch() to be called before
get_refs_list() is the bundle transport. Clean up the code by teaching
the bundle transport the ability to do this, and removing support for
transports that don't support this order of invocation.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit e70a3030e7 ("fetch: do not list refs if fetching only hashes",
2018-10-07) and its ancestors taught Git, as an optimization, to skip
the ls-refs step when it is not necessary during a protocol v2 fetch
(for example, when lazy fetching a missing object in a partial clone, or
when running "git fetch --no-tags <remote> <SHA-1>"). But that was only
done for natively supported protocols; in particular, HTTP was not
supported.
Teach Git to skip ls-refs when using remote helpers that support connect
or stateless-connect. To do this, fetch() is made an acceptable entry
point. Because fetch() can now be the first function in the vtable
called, "get_helper(transport);" has to be added to the beginning of
that function to set the transport up (if not yet set up) before
process_connect() is invoked.
When fetch() is called, the transport could be taken over (this happens
if "connect" or "stateless-connect" is successfully run without any
"fallback" response), or not. If the transport is taken over, execution
continues like execution for natively supported protocols
(fetch_refs_via_pack() is executed, which will fetch refs using ls-refs
if needed). If not, the remote helper interface will invoke
get_refs_list() if it hasn't been invoked yet, preserving existing
behavior.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test & perf scripts must use unique numeric prefix, but a pair
shared the same number, which is fixed here.
* jk/perf-no-dups:
t/perf: rename duplicate-numbered test script
Compilation fix.
* rs/nedalloc-fixlets:
nedmalloc: avoid compiler warning about unused value
nedmalloc: do assignments only after the declaration section
The first line of verbose output from each test piece now carries
the test name and number to help scanning with eyeballs.
* sg/show-failed-test-names:
tests: show the test name and number at the start of verbose output
t0000-basic: use realistic test script names in the verbose tests
The code to write commit-graph over given commit object names has
been made a bit more robust.
* sg/commit-graph-validate:
commit-graph: error out on invalid commit oids in 'write --stdin-commits'
commit-graph: turn a group of write-related macro flags into an enum
t5318-commit-graph: use 'test_expect_code'
"git checkout" and "git restore" to re-populate the index from a
tree-ish (typically HEAD) did not work correctly for a path that
was removed and then added again with the intent-to-add bit, when
the corresponding working tree file was empty. This has been
corrected.
* vn/restore-empty-ita-corner-case-fix:
restore: add test for deleted ita files
checkout.c: unstage empty deleted ita files
"git pack-refs" can lose refs that are created while running, which
is getting corrected.
* sc/pack-refs-deletion-racefix:
pack-refs: always refresh after taking the lock file
Test fix.
* sg/do-not-skip-non-httpd-tests:
t: warn against adding non-httpd-specific tests after sourcing 'lib-httpd'
t5703: run all non-httpd-specific tests before sourcing 'lib-httpd.sh'
t5510-fetch: run non-httpd-specific test before sourcing 'lib-httpd.sh'
Codepaths to walk tree objects have been audited for integer
overflows and hardened.
* jk/tree-walk-overflow:
tree-walk: harden make_traverse_path() length computations
tree-walk: add a strbuf wrapper for make_traverse_path()
tree-walk: accept a raw length for traverse_path_len()
tree-walk: use size_t consistently
tree-walk: drop oid from traverse_info
setup_traverse_info(): stop copying oid
"git grep --recurse-submodules" that looks at the working tree
files looked at the contents in the index in submodules, instead of
files in the working tree.
* mt/grep-submodules-working-tree:
grep: fix worktree case in submodules
In t0021.15 one of the things we are checking is that the clean filter
is run when checking out empty-branch. The clean filter needs to be
run to make sure there are no modifications on the file system for the
test.r file, and thus it isn't dangerous to overwrite it.
However in the current test setup it is not always necessary to run
the clean filter, and thus the test sometimes fails, as debug.log
isn't written.
This happens when test.r has an older mtime than the index itself.
That mtime is also recorded as stat data for test.r in the index, and
based on the heuristic we're using for index entries, git correctly
assumes this file is up-to-date.
Usually this test succeeds because the mtime of test.r is the same as
the mtime of the index. In this case test.r is racily clean, so git
actually checks the contents, for which the clean filter is run.
Fix the test by updating the mtime of test.r, so git is forced to
check the contents of the file, and the clean filter is run as the
test expects.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Formatting $(taggername) on headerless tags such as v0.99 in Git
causes a SIGABRT with error "munmap_chunk(): invalid pointer",
because of an oversight in commit f0062d3b74 (ref-filter: free
item->value and item->value->s, 2018-10-19).
Signed-off-by: Mischa POSLAWSKY <git@shiar.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Linux kernel receives many patches to the devicetree files each
release. The hunk header for those patches typically show nothing,
making it difficult to figure out what node is being modified without
applying the patch or opening the file and seeking to the context. Let's
add a builtin 'dts' pattern to git so that users can get better diff
output on dts files when they use the diff=dts driver.
The regex has been constructed based on the spec at devicetree.org[1]
and with some help from Johannes Sixt.
[1] https://github.com/devicetree-org/devicetree-specification/releases/latest
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With rename detection enabled the line-level log is able to trace the
evolution of line ranges across whole-file renames [1]. Alas, to
achieve that it uses the diff machinery very inefficiently, making the
operation very slow [2]. And since rename detection is enabled by
default, the line-level log is very slow by default.
When the line-level log processes a commit with rename detection
enabled, it currently does the following (see queue_diffs()):
1. Computes a full tree diff between the commit and (one of) its
parent(s), i.e. invokes diff_tree_oid() with an empty
'diffopt->pathspec'.
2. Checks whether any paths in the line ranges were modified.
3. Checks whether any modified paths in the line ranges are missing
in the parent commit's tree.
4. If there is such a missing path, then calls diffcore_std() to
figure out whether the path was indeed renamed based on the
previously computed full tree diff.
5. Continues doing stuff that are unrelated to the slowness.
So basically the line-level log computes a full tree diff for each
commit-parent pair in step (1) to be used for rename detection in step
(4) in the off chance that an interesting path is missing from the
parent.
Avoid these expensive and mostly unnecessary full tree diffs by
limiting the diffs to paths in the line ranges. This is much cheaper,
and makes step (2) unnecessary. If it turns out that an interesting
path is missing from the parent, then fall back and compute a full
tree diff, so the rename detection will still work.
Care must be taken when to update the pathspec used to limit the diff
in case of renames. A path might be renamed on one branch and
modified on several parallel running branches, and while processing
commits on these branches the line-level log might have to alternate
between looking at a path's new and old name. However, at any one
time there is only a single 'diffopt->pathspec'.
So add a step (0) to the above to ensure that the paths in the
pathspec match the paths in the line ranges associated with the
currently processed commit, and re-parse the pathspec from the paths
in the line ranges if they differ.
The new test cases include a specially crafted piece of history with
two merged branches and two files, where each branch modifies both
files, renames on of them, and then modifies both again. Then two
separate 'git log -L' invocations check the line-level log of each of
those two files, which ensures that at least one of those invocations
have to do that back-and-forth between the file's old and new name (no
matter which branch is traversed first). 't/t4211-line-log.sh'
already contains two tests involving renames, they don't don't trigger
this back-and-forth.
Avoiding these unnecessary full tree diffs can have huge impact on
performance, especially in big repositories with big trees and mergy
history. Tracing the evolution of a function through the whole
history:
# git.git
$ time git --no-pager log -L:read_alternate_refs:sha1-file.c v2.23.0
Before:
real 0m8.874s
user 0m8.816s
sys 0m0.057s
After:
real 0m2.516s
user 0m2.456s
sys 0m0.060s
# linux.git
$ time ~/src/git/git --no-pager log \
-L:build_restore_work_registers:arch/mips/mm/tlbex.c v5.2
Before:
real 3m50.033s
user 3m48.041s
sys 0m0.300s
After:
real 0m2.599s
user 0m2.466s
sys 0m0.157s
That's just over 88x speedup.
[1] Line-level log's rename following is quite similar to 'git log
--follow path', with the notable differences that it does handle
multiple paths at once as well, and that it doesn't show the
commit performing the rename if it's an exact rename.
[2] This slowness might not have been apparent initially, because back
when the line-level log feature was introduced rename detection
was not yet enabled by default; 12da1d1f6f (Implement line-history
search (git log -L), 2013-03-28) and 5404c116aa (diff: activate
diff.renames by default, 2016-02-25).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A helper function to parse the paths involved in the line ranges and
to turn them into a pathspec will be useful in the next patch.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08),
diff was taught to batch the fetching of missing objects when operating
on a partial clone, but was not taught to refrain from fetching
GITLINKs. Teach diff to check if an object is a GITLINK before including
it in the set to be fetched.
(As stated in the commit message of that commit, unpack-trees was also
taught a similar thing prior, but unpack-trees correctly checks for
GITLINK before including objects in the set to be fetched.)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use QSORT_S instead of QSORT, which allows passing the repository
pointer to the comparison function without using a static variable.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Define enum parse_opt_result before using it in a typedef. This avoids
the following compiler warning:
./parse-options.h:53:14: error: ISO C forbids forward references to 'enum' types [-Werror,-Wpedantic]
typedef enum parse_opt_result parse_opt_ll_cb(struct parse_opt_ctx_t *ctx,
^
While GCC and Clang both accept such a forward reference by default,
other compilers might be less forgiving.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 336226c259 (packfile.h: drop extern from function declarations,
2019-04-05), `extern` was removed from function declarations because
it's redundant. However, in 8434e85d5f (repack: refactor pack deletion
for future use, 2019-06-10), an `extern` was mistakenly included.
Remove this spurious `extern`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace references to several hard-coded object IDs with a variable
referring to the generated commit. Avoid matching on exact character
positions, which will be different depending on the hash in use. In the
test for a valid object ID, use an obviously invalid one from the lookup
table.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding a fixed length invalid object ID in the test,
compute one using the lookup tables.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test uses a hard-coded object ID to ensure that the result of
cherry-pick --ff is correct. Use test_oid to make this work for both
SHA-1 and SHA-256.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compute the object IDs used in the todo list instead of hard-coding
them.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes. Add a use of $EMPTY_TREE instead of a
hard-coded value. Remove a comment about hard-coded hashes which is no
longer applicable.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes. Convert some single-line heredocs into inline
uses of echo now that they can be expressed succinctly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding 40-character shell patterns, use grep to
determine if all of the paths have either zero or one levels of fanout,
as appropriate.
Note that the final test is implicitly dependent on the hash algorithm.
Depending on the algorithm in use, the fanout may or may not completely
compress. In its current state, this is not a problem, but it could be
if the hash algorithm changes again.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes. Move some invocations of test_commit around so
that we can compute the object IDs for these commits.
Compute several object IDs in the tests instead of using hard-coded
values so that the test works with any hash algorithm. Since the actual
values are sorted by the object ID of the object being annotated, sort
the expected values accordingly as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The various short object IDs in the range-diff output differ between
hash algorithms. Use test_oid_cache to look up values for both SHA-1
and SHA-256.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the description of 'objects/pack', 'object' should be
pluralized to match the subject and agree with the
rest of the sentence.
Signed-off-by: Ben Milman <bpmilman@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adapt try_to_commit() to create a new root commit rather than special
casing this in run_git_commit(). This significantly reduces the amount of
special case code for creating the root commit and reduces the number of
commit code paths we have to worry about.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While a rebase is stopped for the user to edit a commit message it can
be convenient for them to also edit the todo list. The scripted version
of rebase supported this but the C version does not. We already check to
see if the todo list has been updated by an exec command so extend this
to rewords and squashes. It only costs a single stat call to do this so
it should not affect the speed of the rebase (especially as it has just
stopped for the user to edit a message)
Note that for squashes the editor may be opened on a different pick to
the squash itself as we edit the message at the end of a chain fixups
and squashes.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user runs git log while rewording a commit it is confusing if
sometimes we're amending the commit that's being reworded and at other
times we're creating a new commit depending on whether we could
fast-forward or not[1]. Fix this inconsistency by always committing the
picked commit and then running 'git commit --amend' to do the reword.
The first commit is performed by the sequencer without forking git
commit and does not impact on the speed of rebase. In a test rewording
100 commits with
GIT_EDITOR=true GIT_SEQUENCE_EDITOR='sed -i s/pick/reword/' \
../bin-wrappers/git rebase -i --root
and taking the best of three runs the current master took
957ms and with this patch it took 961ms.
This change fixes rewording the new root commit when rearranging commits
with --root.
Note that the new code no longer updates CHERRY_PICK_HEAD after creating
a root commit - I'm not sure why the old code was that creating that ref
after a successful commit, everywhere else it is removed after a
successful commit.
[1] https://public-inbox.org/git/xmqqlfvu4be3.fsf@gitster-ct.c.googlers.com/T/#m133009cb91cf0917bcf667300f061178be56680a
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding the hash size, use the_hash_algo to look up the
hash size at runtime. Remove the #define constant which was used to
hold the hash length, since writing the expression with the_hash_algo
provide enough documentary value on its own.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this code path, we use sha1_to_hex to display the contents of a v1
pack index. While we plan to switch to v3 indices for SHA-256, the v1
pack indices still function, so to support both algorithms, switch
sha1_to_hex to hash_to_hex.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the uses of sha1_to_hex in this function with hash_to_hex to
allow the use of SHA-256 as well. Rename a variable since it is no
longer limited to SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since sha1_to_hex is limited to SHA-1, replace it with hash_to_hex.
Rename several variables to indicate that they can contain any hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since sha1_to_hex is limited to SHA-1, replace it with hash_to_hex so
this code works with other algorithms.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace a use of sha1_to_hex with hash_to_hex so that this code works
with a hash algorithm other than SHA-1.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change struct wt_status to use struct object_id instead of an array of
unsigned char.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the existing uses of null_sha1 can be converted into uses of
null_oid, so do so. Remove null_sha1 and is_null_sha1, and define
is_null_oid in terms of null_oid. This also has the additional benefit
of removing several uses of sha1_to_hex.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the_hash_algo when calling xwrite with a hex object ID so that the
proper amount of data is written.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pack checksums always use the current hash algorithm in use, so switch
from sha1_to_hex to hash_to_hex.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert several uses of GIT_SHA1_HEXSZ constants to be references to
the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using GIT_SHA1_HEXSZ, use the_hash_algo so that the code is
hash size independent.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch a hard-coded 18 to be a reference to the_hash_algo. Rename the
sha1 parameter to be called hash instead.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch a use of the constant 40 and a use of GIT_SHA1_HEXSZ to use
the_hash_algo instead.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch various uses of GIT_SHA1_HEXSZ to reference the_hash_algo
instead.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
show-index uses a variety of hard-coded constants to enumerate the
values in pack indices. Switch these hard-coded constants to use
the_hash_algo or GIT_MAX_RAWSZ, as appropriate, and convert one instance
of an unsigned char array to struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When faking a working tree commit, we read in lines from MERGE_HEAD into
a strbuf. Because the strbuf is NUL-terminated and get_oid_hex will
fail if it unexpectedly encounters a NUL, the check for the length of
the line is unnecessary. There is no optimization benefit from this
case, either, since on failure we call die. Remove this check, since it
is no longer needed.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch several hard-coded uses of the constant 40 to references to
the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The push cert code uses HMAC-SHA-1 to create a nonce. This is a secure
use of SHA-1 which is not affected by its collision resistance (or lack
thereof). However, it makes sense for us to use a better algorithm if
one is available, one which may even be more performant. Futhermore,
until we have specialized functions for computing the hex value of an
arbitrary function, it simplifies the code greatly to use the same hash
algorithm everywhere.
Switch this code to use GIT_MAX_BLKSZ and the_hash_algo for computing
the push cert nonce, and rename the hmac_sha1 function to simply "hmac".
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding constants, use parse_oid_hex to compute a pointer
and use it in further parsing operations.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the two separate patch-id implementations to use the_hash_algo
in their implementation.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using GIT_SHA1_HEXSZ and hard-coded constants, switch to
using the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the --set-upstream option to git pull/fetch
which lets the user set the upstream configuration
(branch.<current-branch-name>.merge and
branch.<current-branch-name>.remote) for the current branch.
A typical use-case is:
git clone http://example.com/my-public-fork
git remote add main http://example.com/project-main-repo
git pull --set-upstream main master
or, instead of the last line:
git fetch --set-upstream main master
git merge # or git rebase
This is mostly equivalent to cloning project-main-repo (which sets
upsteam) and then "git remote add" my-public-fork, but may feel more
natural for people using a hosting system which allows forking from
the web UI.
This functionality is analog to "git push --set-upstream".
Signed-off-by: Corentin BOMPARD <corentin.bompard@etu.univ-lyon1.fr>
Signed-off-by: Nathan BERBEZIER <nathan.berbezier@etu.univ-lyon1.fr>
Signed-off-by: Pablo CHABANNE <pablo.chabanne@etu.univ-lyon1.fr>
Signed-off-by: Matthieu Moy <git@matthieu-moy.fr>
Patch-edited-by: Matthieu Moy <git@matthieu-moy.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we're confident our pax extended header calculation is correct,
turn the criticality of the assertion up to the maximum, from warning
right up to BUG. Simplify the test, as the stderr comparison step would
not be reached in case the BUG message is triggered.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of its callers already passes in a size_t value. Use it
consistently in this function.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A pax extended header record starts with a decimal number. Its value
is the length of the whole record, including its own length.
The calculation of that number in strbuf_append_ext_header() is off by
one in case the length of the rest is close to a higher order of
magnitude. This affects paths and link targets a bit shorter than 1000,
10000, 100000 etc. characters -- paths with a length of up to 100 fit
into the tar header and don't need a pax extended header.
The mistake has been present since the function was added by ae64bbc18c
("tar-tree: Introduce write_entry()", 2006-03-25).
Account for digits added to len during the loop and keep incrementing
until we have enough space for len and the rest. The crucial change is
to check against the current value of len before each iteration, instead
of against its value before the loop.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extended header entries contain a length value that is a bit tricky to
calculate because it includes its own length (number of decimal digits)
as well. We get it wrong in corner cases. Add a check, report wrong
results as a warning and add a test for exercising it.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Other than cache.h which needs to appear first, and merge-recursive.h
which I want to be second so that we are more likely to notice if
merge-recursive.h has any missing includes, the rest of the list is
long and easier to look through if it's alphabetical.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are lots of options that callers can set, yet most have a limited
range of valid values, some options are meant for output (e.g.
opt->obuf, which is expected to start empty), and callers are expected
to not set opt->priv. Add several sanity checks to ensure callers
provide sane values.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I want to implement the same outward facing API as found within
merge-recursive.h in a different merge strategy. However, that makes
names like MERGE_RECURSIVE_{NORMAL,OURS,THEIRS} look a little funny;
rename to MERGE_VARIANT_{NORMAL,OURS,THEIRS}.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge_options has several internal fields that should not be set or read
by external callers. This just complicates the API. Move them into an
opaque merge_options_internal struct that is defined only in
merge-recursive.c and keep these out of merge-recursive.h.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If opt->buffer_output is less than 2, then merge_trees(),
merge_recursive(), and merge_recursive_generic() are all supposed to
flush the opt->obuf output buffer to stdout and release any memory it
holds. merge_trees() did not do this. Move the logic that handles this
for merge_recursive_internal() to merge_finalize() so that all three
methods handle this requirement.
Note that this bug didn't cause any problems currently, because there
are only two callers of merge_trees() right now (a git grep for
'merge_trees(' is misleading because builtin/merge-tree.c also defines a
'merge_tree' function that is unrelated), and only one of those is
called with buffer_output less than 2 (builtin/checkout.c), but it set
opt->verbosity to 0, for which there is only currently one non-error
message that would be shown: "Already up to date!". However, that one
message can only occur when the merge is utterly trivial (the merge base
tree exactly matches the merge tree), and builtin/checkout.c already
attempts a trivial merge via unpack_trees() before falling back to
merge_trees().
Also, if opt->buffer_output is 2, then the caller is responsible to
handle showing any output in opt->obuf and for free'ing it. This
requirement might be easy to overlook, so add a comment to
merge-recursive.h pointing it out. (There are currently two callers
that set buffer_output to 2, both in sequencer.c, and both of which
handle this correctly.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The merge_options struct had lots of fields, making it a little
imposing, but the options naturally fall into multiple different groups.
Grouping similar options and adding a comment or two makes it easier to
read, easier for new folks to figure out which options are related, and
thus easier for them to find the options they need.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We provided users with the ability to state whether they wanted rename
detection, and to put a limit on how much CPU would be spent. Both of
these fields had multiple configuration parameters for setting them,
with one being a fallback and the other being an override. However,
instead of implementing the logic for how to combine the multiple
source locations into the appropriate setting at config loading time,
we loaded and tracked both values and then made the code combine them
every time it wanted to check the overall value. This had a few
minor drawbacks:
* it seems more complicated than necessary
* it runs the risk of people using the independent settings in the
future and breaking the intent of how the options are used
together
* it makes merge_options more complicated than necessary for other
potential users of the API
Fix these problems by moving the logic for combining the pairs of
options into a single value; make it apply at time-of-config-loading
instead of each-time-of-use.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No substantive code changes (view this with diff --color-moved), but
a few small code cleanups:
* Move structs and an inline function only used by merge-recursive.c
into merge-recursive.c
* Re-order function declarations to be more logical
* Add or fix some explanatory comments
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 259ccb6cc3 ("merge-recursive: rename merge_options argument
from 'o' to 'opt'", 2019-04-05), I renamed a bunch of function
arguments in merge-recursive.c, but forgot to make that same change to
merge-recursive.h. Make the two match.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not at all clear what 'mr' was supposed to stand for, at least not
to me. Pick a clearer name for this variable.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge_trees(), merge_recursive(), and merge_recursive_generic() in
their function headers used four different names for the merge base or
list of merge bases they were passed:
* 'common'
* 'ancestors'
* 'ca'
* 'base_list'
They were able to refer to it four different ways instead of only three
by using a different name in the signature for the .c file than the .h
file. Change all of these to 'merge_base' or 'merge_bases'.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No substantive code change, just add some line breaks to fix lines that
have grown in length due to various refactorings. Most remaining lines
of excessive length in merge-recursive include error messages and it's
not clear that splitting those improves things.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
write_tree_from_memory() appeared to be a merge-recursive special that
basically duplicated write_index_as_tree(). The two have a different
signature, but the bigger difference was just that write_index_as_tree()
would always unconditionally read the index off of disk instead of
working on the current in-memory index. So:
* split out common code into write_index_as_tree_internal()
* rename write_tree_from_memory() to write_inmemory_index_as_tree(),
make it call write_index_as_tree_internal(), and move it to
cache-tree.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Alternatively, you can view this as "make the merge functions behave
more similarly." merge-recursive has three different entry points:
merge_trees(), merge_recursive(), and merge_recursive_generic(). Two of
these would call diff_warn_rename_limit(), but merge_trees() didn't.
This lead to callers of merge_trees() needing to manually call
diff_warn_rename_limit() themselves. Move this to the new
merge_finalize() function to make sure that all three entry points run
this function.
Note that there are two external callers of merge_trees(), one in
sequencer.c and one in builtin/checkout.c. The one in sequencer.c is
cleaned up by this patch and just transfers where the call to
diff_warn_rename_limit() is made; the one in builtin/checkout.c is for
switching to a different commit and in the very rare case where the
warning might be triggered, it would probably be helpful to include
(e.g. if someone is modifying a file that has been renamed in moving to
the other commit, but there are so many renames between the commits that
the limit kicks in and none are detected, it may help to have an
explanation about why they got a delete/modify conflict instead of a
proper content merge in a renamed file).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge_trees() took a results parameter that would only be written when
opt->call_depth was positive, which is never the case now that
merge_trees_internal() has been split from merge_trees(). Remove the
misleading and unused parameter from merge_trees().
While at it, add some comments explaining how the output of
merge_trees() and merge_recursive() differ.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We had a rule to enforce that the index matches head, but it was found
at the beginning of merge_trees() and would only trigger when
opt->call_depth was 0. Since merge_recursive() doesn't call
merge_trees() until after returning from recursing, this meant that the
check wasn't triggered by merge_recursive() until it had first finished
all the intermediate merges to create virtual merge bases. That is a
potentially huge amount of computation (and writing of intermediate
merge results into the .git/objects directory) before it errors out and
says, in effect, "Sorry, I can't do any merging because you have some
local changes that would be overwritten."
Further, not enforcing this requirement earlier allowed other bugs (such
as an unintentional unconditional dropping and reloading of the index in
merge_recursive() even when no recursion was necessary), to mask bugs in
other callers (which were fixed in the commit prior to this one).
Make sure we do the index == head check at the beginning of the merge,
and error out immediately if it fails. While we're at it, fix a small
leak in the show-the-error codepath.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the bug that just won't die; there always seems to be another
form of it somewhere. See the commit message of 55f39cf755 ("merge:
fix misleading pre-merge check documentation", 2018-06-30) for a more
detailed explanation), but in short:
<quick summary>
builtin/merge.c contains this important requirement for merge
strategies:
...the index must be in sync with the head commit. The strategies are
responsible to ensure this.
This condition is important to enforce because there are two likely
failure cases when the index isn't in sync with the head commit:
* we silently throw away changes the user had staged before the merge
* we accidentally (and silently) include changes in the merge that
were not part of either of the branches/trees being merged
Discarding users' work and mis-merging are both bad outcomes, especially
when done silently, so naturally this rule was stated sternly -- but,
unfortunately totally ignored in practice unless and until actual bugs
were found. But, fear not: the bugs from this were fixed in commit
ee6566e8d7 ("[PATCH] Rewrite read-tree", 2005-09-05)
through a rewrite of read-tree (again, commit 55f39cf755 has a more
detailed explanation of how this affected merge). And it was fixed
again in commit
160252f816 ("git-merge-ours: make sure our index matches HEAD", 2005-11-03)
...and it was fixed again in commit
3ec62ad9ff ("merge-octopus: abort if index does not match HEAD", 2016-04-09)
...and again in commit
65170c07d4 ("merge-recursive: avoid incorporating uncommitted changes in a merge", 2017-12-21)
...and again in commit
eddd1a411d ("merge-recursive: enforce rule that index matches head before merging", 2018-06-30)
...with multiple testcases added to the testsuite that could be
enumerated in even more commits.
Then, finally, in a patch in the same series as the last fix above, the
documentation about this requirement was fixed in commit 55f39cf755
("merge: fix misleading pre-merge check documentation", 2018-06-30), and
we all lived happily ever after...
</quick summary>
Unfortunately, "ever after" apparently denotes a limited time and it
expired today. The merge-recursive rule to enforce that index matches
head was at the beginning of merge_trees() and would only trigger when
opt->call_depth was 0. Since merge_recursive() doesn't call
merge_trees() until after returning from recursing, this meant that the
check wasn't triggered by merge_recursive() until it had first finished
all the intermediate merges to create virtual merge bases. That is a
potentially HUGE amount of computation (and writing of intermediate
merge results into the .git/objects directory) before it errors out and
says, in effect, "Sorry, I can't do any merging because you have some
local changes that would be overwritten."
Trying to enforce that all of merge_trees(), merge_recursive(), and
merge_recursive_generic() checked the index == head condition earlier
resulted in a bunch of broken tests. It turns out that
merge_recursive() has code to drop and reload the cache while recursing
to create intermediate virtual merge bases, but unfortunately that code
runs even when no recursion is necessary. This unconditional dropping
and reloading of the cache masked a few bugs:
* builtin/merge-recursive.c: didn't even bother loading the index.
* builtin/stash.c: feels like a fake 'builtin' because it repeatedly
invokes git subprocesses all over the place, mixed with other
operations. In particular, invoking "git reset" will reset the
index on disk, but the parent process that invoked it won't
automatically have its in-memory index updated.
* t3030-merge-recursive.h: this test has always been broken in that it
didn't make sure to make index match head before running. But, it
didn't care about the index or even the merge result, just the
verbose output while running. While commit eddd1a411d
("merge-recursive: enforce rule that index matches head before
merging", 2018-06-30) should have uncovered this broken test, it
used a test_must_fail wrapper around the merge-recursive call
because it was known that the merge resulted in a rename/rename
conflict. Thus, that fix only made this test fail for a different
reason, and since the index == head check didn't happen until after
coming all the way back out of the recursion, the testcase had
enough information to pass the one check that it did perform.
So, load the index in builtin/merge-recursive.c, reload the in-memory
index in builtin/stash.c, and modify the t3030 testcase to correctly
setup the index and make sure that the test fails in the expected way
(meaning it reports a rename/rename conflict). This makes sure that
all callers actually make the index match head. The next commit will
then enforce the condition that index matches head earlier so this
problem doesn't return in the future.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit d7cf3a96e9 ("merge-recursive.c: remove implicit dependency on
the_repository", 2019-01-12) and follow-ups like commit 34e7771bc6
("Use the right 'struct repository' instead of the_repository",
2019-06-27), removed most implicit uses of the_repository. Convert
calls to get_commit_tree() to instead use repo_get_commit_tree() to get
rid of another.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a 'free_buf' label to which all but one of the error paths in
update_file_flags() jump; that error case involves a NULL buf and is
thus not a memory leak. However, make that error case execute the same
deallocation code anyway so that if anyone adds any additional memory
allocations or deallocations, then all error paths correctly deallocate
resources.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve code readability by introducing an enum to replace the
not-quite-boolean values taken on by detect_directory_renames.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 7ca56aa076 ("merge-recursive: add a label for ancestor",
2010-03-20), a label was added for the '||||||' line to make it have
the more informative heading '|||||| merged common ancestors', with
the statement:
It would be nicer to use a more informative label. Perhaps someone
will provide one some day.
This chosen label was perfectly reasonable when recursiveness kicks in,
i.e. when there are multiple merge bases. (I can't think of a better
label in such cases.) But it is actually somewhat misleading when there
is a unique merge base or no merge base. Change this based on the
number of merge bases:
>=2: "merged common ancestors"
1: <abbreviated commit hash>
0: "<empty tree>"
Tests have also been added to check that we get the right ancestor name
for each of the three cases.
Also, since merge_recursive() and merge_trees() have polar opposite
pre-conditions for opt->ancestor, document merge_recursive()'s
pre-condition with an assertion. (An assertion was added to
merge_trees() already a few commits ago.) The differences in
pre-conditions stem from two factors: (1) merge_trees() does not recurse
and thus does not have multiple sub-merges to worry about -- each of
which would require a different value for opt->ancestor, (2)
merge_trees() is only passed trees rather than commits and thus cannot
internally guess as good of a label. Thus, while external callers of
merge_trees() are required to provide a non-NULL opt->ancestor,
merge_recursive() expects to set this value itself.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We always want our conflict hunks to be labelled so that users can know
where each came from. The previous commit fixed the one caller in the
codebase which was not setting opt->ancestor (and thus not providing a
label for the "merge base" conflict hunk in diff3-style conflict
markers); add an assertion to prevent future codepaths from also
overlooking this requirement.
Enforcing this requirement also allows us to simplify the code for
labelling the conflict hunks by no longer checking if the ancestor label
is NULL.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git checkout -m' and using diff3 style conflict markers,
we want all the conflict hunks (left-side, "common" or "merge base", and
right-side) to have label markers letting the user know where each came
from. The "common" hunk label (o.ancestor) came from
old_branch_info->name, but that is NULL when HEAD is detached, which
resulted in a blank label. Check for that case and provide an
abbreviated commit hash instead.
(Incidentally, this was the only case in the git codebase where
merge_trees() was called with opt->ancestor being NULL. A subsequent
commit will prevent similar problems by enforcing that merge_trees()
always be called with opt->ancestor != NULL.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 8daec1df03 ("merge-recursive: switch from (oid,mode) pairs
to a diff_filespec", 2019-04-05), an assertion on a->path && b->path
was added for code readability to document that these both needed to be
non-NULL at this point in the code. However, the subsequent lines also
read o->path, so it should be included in the assert.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both commit a7256debd4 ("checkout.txt: note about losing staged
changes with --merge", 2019-03-19) from nd/checkout-m-doc-update and
commit 6eff409e8a ("checkout: prevent losing staged changes with
--merge", 2019-03-22) from nd/checkout-m were included in git.git
despite the fact that the latter was meant to be v2 of the former.
The merge of these two topics resulted in a redundant chunk of code;
remove it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
f0ed8226c9 (Add custom memory allocator to MinGW and MacOS builds, 2009-05-31)
never told cURL about it.
Correct that by using the cURL initializer available since version 7.12 to
point to xmalloc and friends for consistency which then will pass the
allocation requests along when USE_NED_ALLOCATOR=YesPlease is used (most
likely in Windows)
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The indent heuristic started out as experimental, but it's now our
default diff heuristic since 33de716387 (diff: enable indent heuristic
by default, 2017-05-08). Alas, that commit didn't update the
documentation, and the description of the 'diff.indentHeuristic'
configuration variable still implies that it's experimental and not
the default.
Update the description of 'diff.indentHeuristic' to make it clear that
it's the default diff heuristic.
The description of the related '--indent-heuristic' option has already
been updated in bab76141da (diff: --indent-heuristic is no
longer experimental, 2017-10-29).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'feature.experimental' setting includes config options that are
not committed to become defaults, but could use additional testing.
Update the following config settings to take new defaults, and to
use the repo_settings struct if not already using them:
* 'pack.useSparse=true'
* 'fetch.negotiationAlgorithm=skipping'
In the case of fetch.negotiationAlgorithm, the existing logic
would load the config option only when about to use the setting,
so had a die() statement on an unknown string value. This is
removed as now the config is parsed under prepare_repo_settings().
In general, this die() is probably misplaced and not valuable.
A test was removed that checked this die() statement executed.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The feature.manyFiles setting is suitable for repos with many
files in the working directory. By setting index.version=4 and
core.untrackedCache=true, commands such as 'git status' should
improve.
While adding this setting, modify the index version precedence
tests to check how this setting overrides the default for
index.version is unset.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core.untrackedCache config setting is slightly complicated,
so clarify its use and centralize its parsing into the repo
settings.
The default value is "keep" (returned as -1), which persists the
untracked cache if it exists.
If the value is set as "false" (returned as 0), then remove the
untracked cache if it exists.
If the value is set as "true" (returned as 1), then write the
untracked cache and persist it.
Instead of relying on magic values of -1, 0, and 1, split these
options into an enum. This allows the use of "-1" as a
default value. After parsing the config options, if the value is
unset we can initialize it to UNTRACKED_CACHE_KEEP.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph feature has seen a lot of activity in the past
year or so since it was introduced. The feature is a critical
performance enhancement for medium- to large-sized repos, and
does not significantly hurt small repos.
Change the defaults for core.commitGraph and gc.writeCommitGraph
to true so users benefit from this feature by default.
There are several places in the test suite where the environment
variable GIT_TEST_COMMIT_GRAPH is disabled to avoid reading a
commit-graph, if it exists. The config option overrides the
environment, so swap these. Some GIT_TEST_COMMIT_GRAPH assignments
remain, and those are to avoid writing a commit-graph when a new
commit is created.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t6501-freshen-objects.sh sends the standard error from
'git gc' to a file and verifies that it is empty. This
is intended as a way to ensure no warnings are written
during the operation. However, as the commit-graph is
added as a step to 'git gc', its progress will appear
in the output.
Pass the '-q' argument to avoid a failing test case
when progress is written.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few important config settings that are not loaded
during git_default_config. These are instead loaded on-demand.
Centralize these config options to a single scan, and store
all of the values in a repo_settings struct. The values for
each setting are initialized as negative to indicate "unset".
This centralization will be particularly important in a later
change to introduce "meta" config settings that change the
defaults for these config settings.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To avoid data loss, 'git worktree remove' refuses to delete a worktree
if it's dirty or contains untracked files. However, the error message
only mentions that the worktree "is dirty", even if the worktree in
question is in fact clean, but contains untracked files:
$ git worktree add test-worktree
Preparing worktree (new branch 'test-worktree')
HEAD is now at aa53e60 Initial
$ >test-worktree/untracked-file
$ git worktree remove test-worktree/
fatal: 'test-worktree/' is dirty, use --force to delete it
$ git -C test-worktree/ diff
$ git -C test-worktree/ diff --cached
$ # Huh? Where are those dirty files?!
Clarify this error message to say that the worktree "contains modified
or untracked files".
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Completing configuration sections and variable names for the stuck
argument of 'git clone --config=<TAB>' requires a bit of extra care
compared to doing the same for the unstuck argument of 'git clone
--config <TAB>', because we have to deal with that '--config=' being
part of the current word to be completed.
Add an option to the __git_complete_config_variable_name_and_value()
and in turn to the __git_complete_config_variable_name() helper
functions to specify the current section/variable name to be
completed, so they can be used even when completing the stuck argument
of '--config='.
__git_complete_config_variable_value() already has such an option, and
thus no further changes were necessary to complete possible values
after 'git clone --config=section.name=<TAB>'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commits taught the completion script how to complete
configuration section, variable names, and their valus after 'git -c
<TAB>', and with a bit of foresight encapsulated all that in a
dedicated helper function. Use that function to complete the unstuck
argument of 'git config -c|--config <TAB>', which expect configuration
variables and values in the same 'section.name=value' form.
Note that handling the struck argument for 'git clone --config=<TAB>'
requires some extra care, so it will be done a separate patch.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git config' expects a configuration variable's name and value in
separate options, so we complete values as they stand on their own on
the command line. 'git -c', however, expects them in a single option
joined by a '=' character, so we should be able to complete values
when they are following 'section.name=' in the same word.
Add new options to the __git_complete_config_variable_value() function
to allow callers to specify the current word to be completed and the
configuration variable whose value is to be completed, and use these
to complete possible values after 'git -c 'section.name=<TAB>'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git config' expects a configuration variable's name and value in
separate arguments, so we let the __gitcomp() helper append a space
character to each variable name by default, like we do for most other
things (--options, refs, paths, etc.). 'git -c', however, expects
them in a single option joined by a '=' character, i.e.
'section.name=value', so we should append a '=' character to each
fully completed variable name, but no space, so the user can continue
typing the value right away.
Add an option to the __git_complete_config_variable_name() function to
allow callers to specify an alternate suffix to add, and use it to
append that '=' character to configuration variables. Update the
__gitcomp() helper function to not append a trailing space to any
completion words ending with a '=', not just to those option with a
stuck argument.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
_git_config() contains two enormous case statements, one to complete
configuration sections and variable names, and the other to complete
their values.
Split these out into two separate helper functions, so in the next
patches we can use them to implement completion for 'git -c <TAB>'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The second '*' in the '--*=*' pattern of the inner 'case' statement of
the __gitcomp() helper function never matches anything, so let's use
'--*=' instead.
The purpose of that inner case statement is to decide when to append a
trailing space to the listed options and when not. When an option
requires a stuck argument, i.e. '--option=', then the trailing space
should not be added, so the user can continue typing the required
argument right away. That '--*=*' pattern is supposed to match these
options, but for this purpose that second '*' is unnecessary, a '--*='
pattern works just as well. That second '*' would only make a
difference in case of a possible completion word like
'--option=value', but our completion script never passes such a word
to __gitcomp(), because the '--option=' and its 'value' must be
completed separately.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The completion script runs the classic '| sort | uniq' pipeline to
deduplicate the output of 'git help --config-for-completion'. 'sort
-u' does the same, but uses one less external process and pipeline
stage. Not a bit win, as it's only run once as the list of supported
configuration variables is initialized, but at least it sets a better
example for others to follow.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The number of configuration variables listed by the completion script
grew quite when we started to auto-generate it from the documentation
[1], so we now complete them in two steps: first we list only the
section names, then the rest [2]. To get the section names we simply
strip everything following the first dot in each variable name,
resulting in a lot of repeated section names, because most sections
contain more than one configuration variable. This is not a
correctness issue in practice, because Bash's completion facilities
remove all repetitions anyway, but these repetitions make testing a
bit harder.
Replace the small 'sed' script removing subsections and variable names
with an 'awk' script that does the same, and in addition removes any
repeated configuration sections as well (by first creating and filling
an associative array indexed by all encountered configuration
sections, and then iterating over this array and printing the indices,
i.e. the unique section names). This change makes the failing 'git
config - section' test in 't9902-completion.sh' pass.
Note that this changes the order of section names in the output, and
makes it downright undeterministic, but this is not an issue, because
Bash sorts them before presenting them to the user, and our completion
tests sort them as well before comparing with the expected output.
Yeah, it would be simpler and shorter to just append '| sort -u' to
that command, but that would incur the overhead of one more external
process and pipeline stage every time a user completes configuration
sections.
[1] e17ca92637 (completion: drop the hard coded list of config vars,
2018-05-26)
[2] f22f682695 (completion: complete general config vars in two steps,
2018-05-27)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The next patches will change/refactor the way we complete
configuration variable names and values, so add a few tests to cover
the basics, namely the completion of matching configuration sections,
full variable names, and their values.
Note that the test checking the completion of configuration sections
is currently failing, though it's not a sign of an actual bug. If a
section contains multiple variables, then that section is currently
repeated as many times as the number of variables in there. This is
not a correctness issue in practice, because Bash's completion
facilities remove all repetitions anyway. Consequently, we could list
all those repeated sections in the expected output of this test as
well, but then it would have to be updated whenever a new
configuration variable is added to those sections. Instead, list each
matching configuration section only once, mark the test as failing for
now, and the next patch will update the completion script to avoid
those repetitions.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most 'color.*' configuration variables, with the sole exception of
'color.pager', accept the same set of values, but our completion
script recognizes only about half of them. We could explicitly add
all those missing variables, but let's try to reduce future
maintenance burden, and use the catch-all 'color.*' pattern instead,
so this list won't get out of sync when a similar new configuration
variable accepting the same values is introduced [1].
Furthermore, their documentation explicitly mentions that they all
accept the standard boolean values 'false' and 'true' as well, so list
these, too, among the possible values.
[1] OTOH, there will be a maintenance burden if ever a new
'color.something' is introduced which doesn't accept the same set
of values. We'll see which one happens first...
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Long ago, in 97bfeb34df (Release pack windows before reporting out of
memory., 2006-12-24), we taught xmalloc() and friends to try unmapping
pack windows when malloc() failed. It's unlikely that his helps a lot in
practice, and it has some downsides. First, the downsides:
1. It makes xmalloc() not thread-safe. We've worked around this in
pack-objects.c, which installs its own locking version of the
try_to_free_routine(). But other threaded code doesn't.
2. It makes the system as a whole harder to reason about. Functions
which allocate heap memory under the hood may have farther-reaching
effects than expected.
That might be worth the tradeoff if there's a benefit. But in practice,
it seems unlikely. We're generally dealing with mmap'd files, so the OS
is going to do a much better job at responding to memory pressure by
dropping individual pages (the exception is systems with NO_MMAP, but
even there the OS can probably respond just as well with swapping).
So the only thing we're really freeing is address space. On 64-bit
systems, we have plenty of that to go around. On 32-bit systems, it
could possibly help. But around the same time we made two other changes:
77ccc5bbd1 (Introduce new config option for mmap limit., 2006-12-23) and
60bb8b1453 (Fully activate the sliding window pack access., 2006-12-23).
Together that means that a 32-bit system should have no more than 256MB
total of packed-git mmaps at one time, split between a few 32MB windows.
It's unlikely we have any address space problems since then, but we
don't have any data since the features were all added at the same time.
Likewise, xmmap() will try to free memory. At first glance, it seems
like we'd need this (when we try to mmap a new window, we might need to
close an old one to save address space on a 32-bit system). But we're
saved again by core.packedGitLimit: if we're going to exceed our 256MB
limit, we'll close an existing window before we even call mmap().
So it seems unlikely that this feature is actually doing anything
useful. And while we don't have reports of it harming anything (probably
because it rarely if ever kicks in), it would be nice to simplify the
system overall. This patch drops the whole try_to_free system from
xmalloc(), as well as the manual pack memory release in xmmap().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The grammar for commits used a '?' rather than a '*' on the `merge`
directive line, despite the fact that the code allows multiple `merge`
directives in order to support n-way merges. In fact, elsewhere in
git-fast-import.txt there is an explicit declaration that "an unlimited
number of `merge` commands per commit are permitted by fast-import".
Fix the grammar to match the intent and implementation.
Reported-by: Joachim Klein <joachim.klein@automata.tools>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two perf scripts numbered p5600, but with otherwise different
names ("clone-reference" versus "partial-clone"). We store timing
results in files named after the whole script, so internally we don't
get confused between the two. But "aggregate.perl" just prints the test
number for each result, giving multiple entries for "5600.3". It also
makes it impossible to skip one test but not the other with
GIT_SKIP_TESTS.
Let's renumber the one that appeared later (by date -- the source of the
problem is that the two were developed on independent branches). For the
non-perf test suite, our test-lint rule would have complained about this
when the two were merged, but t/perf never learned that trick.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make use of new sq_append_quote_argv_pretty() to normalize
how we handle leading whitespace in perf format messages.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make use of new sq_append_quote_argv_pretty() to normalize
how we handle leading whitespace in normal format messages.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sq_quote_argv_pretty() builds a "pretty" string from the given argv.
It inserts whitespace before each value, rather than just between
them, so the resulting string always has a leading space. Lets give
callers an option to not have the leading space or have to ltrim()
it later.
Create sq_append_quote_argv_pretty() to convert an argv into a
pretty, quoted if necessary, string with space delimiters and
without a leading space.
Convert the existing sq_quote_argv_pretty() to use this new routine
while preserving the leading space behavior.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid creating unnecessary trailing whitespace in normal
target format error messages when the message is omitted.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove an unnecessary "if" block in maybe_add_string_va().
Commit "ad006fe419e trace2: NULL is not allowed for va_list"
changed "if (fmt && *fmt && ap)" to just "if (fmt && *fmt)"
because it isn't safe to treat 'ap' as a pointer. This made
the "if" block following it unnecessary.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid unecessary trailing whitespace in "region_enter" and "region_leave"
messages in perf target format.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the toplevel window for the window being destroyed is the main window
(aka "."), then simply destroying it means the cleanup tasks are not
executed (like saving the commit message buffer, saving window state,
etc.)
All this is handled by do_quit. Call it instead of directly
destroying the main window. For other toplevel windows, the old
behavior remains.
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Analogous to commit, introduce a '--no-verify' option which bypasses the
pre-merge-commit hook. The shorthand '-n' is taken by '--no-stat'
already.
[js: * reworded commit message to reflect current state of --no-stat flag
and new hook name
* fixed flag documentation to reflect new hook name
* cleaned up trailing whitespace
* squashed test changes from the original series' patch 4/4
* modified tests to follow pattern from this series' patch 1/4
* added a test case for --no-verify with non-executable hook
* when testing that the merge hook did not run, make sure we
actually have a merge to perform (by resetting the "side" branch
to its original state).
]
Improved-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-merge does not honor the pre-commit hook when doing automatic merge
commits, and for compatibility reasons this is going to stay.
Introduce a pre-merge-commit hook which is called for an automatic merge
commit just like pre-commit is called for a non-automatic merge commit
(or any other commit).
[js: * renamed hook from "pre-merge" to "pre-merge-commit"
* only discard the index if the hook is actually present
* expanded githooks documentation entry
* clarified that hook should write messages to stderr
* squashed test changes from the original series' patch 4/4
* modified tests to follow new pattern from this series' patch 1/4
* added a test case for non-executable merge hooks
* added a test case for failed merges
* when testing that the merge hook did not run, make sure we
actually have a merge to perform (by resetting the "side" branch
to its original state).
* reworded commit message
]
Improved-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
f8b863598c ("builtin/merge: honor commit-msg hook for merges", 2017-09-07)
introduced the no-verify flag to merge for bypassing the commit-msg
hook, though in a different way from the implementation in commit.c.
Change the implementation in merge.c to be the same as in commit.c so
that both do the same in the same way. This also changes the output of
"git merge --help" to be more clear that the hook return code is
respected by default.
[js: * reworded commit message
* squashed documentation changes from original series' patch 3/4
]
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t7503 did not verify that the expected hooks actually ran during
testing. Fix that by making the hook scripts write their $0 into a file
so that we can compare actual execution vs. expected execution.
While we're at it, do some test style cleanups, such as using
write_script() and doing setup inside a test_expect_success block.
Improved-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cast the evaluated value of the macro INITIAL_LOCK to void to instruct
the compiler that we're not interested in said value nor the following
warning:
In file included from compat/nedmalloc/nedmalloc.c:63:
compat/nedmalloc/malloc.c.h: In function ‘init_user_mstate’:
compat/nedmalloc/malloc.c.h:1706:62: error: right-hand operand of comma expression has no effect [-Werror=unused-value]
1706 | #define INITIAL_LOCK(sl) (memset(sl, 0, sizeof(MLOCK_T)), 0)
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
compat/nedmalloc/malloc.c.h:5020:3: note: in expansion of macro ‘INITIAL_LOCK’
5020 | INITIAL_LOCK(&m->mutex);
| ^~~~~~~~~~~~
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid the following compiler warning:
In file included from compat/nedmalloc/nedmalloc.c:63:
compat/nedmalloc/malloc.c.h: In function ‘pthread_release_lock’:
compat/nedmalloc/malloc.c.h:1759:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
1759 | volatile unsigned int* lp = &sl->l;
| ^~~~~~~~
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the previous commit, our invariant that the_repository is never
NULL is restored, and we can stop being defensive in include_by_branch().
We can confirm the fix by showing that an onbranch config include will
not cause a segfault when run outside a git repository. I've put this in
t1309-early-config since it's related to the case added by 85fe0e800c
(config: work around bug with includeif:onbranch and early config,
2019-07-31), though technically the issue was with
read_very_early_config() and not read_early_config().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We initialize the trace2 system in the common main() function so that
all programs (even ones that aren't builtins) will enable tracing. But
trace2 startup is relatively heavy-weight, as we have to actually read
on-disk config to decide whether to trace. This can cause unexpected
interactions with other common-main initialization. For instance, we'll
end up in the config code before calling initialize_the_repository(),
and the usual invariant that the_repository is never NULL will not hold.
Let's push the trace2 initialization further down in common-main, to
just before we execute cmd_main(). The other parts of the initialization
are much more self-contained and less likely to call library code that
depends on those kinds of invariants.
Originally the trace2 code tried to start as early as possible to get
accurate timings. But the timer initialization was split out from the
config reading in a089724958 (trace2: refactor setting process starting
time, 2019-04-15), so there shouldn't be any impact from this patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 85fe0e800c (config: work around bug with includeif:onbranch and
early config, 2019-07-31) tests that our early config-reader does not
access the file mentioned by includeIf.onbranch:refs/heads/master.path.
But it would never do so even if the feature were implemented, since the
onbranch matching code uses the short refname "master".
The test still serves its purpose, since the bug fixed by 85fe0e800c is
actually that we hit a BUG() before even deciding whether to match the
ref. But let's use the correct name to avoid confusion (and which we'll
eventually want to trigger once we do the "real" fix described in that
commit).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that --end-of-options is available for any users of
setup_revisions() or parse_options(), which should be effectively
everywhere, we can guide people to use it for all their disambiguating
needs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The revision option parser recently learned about --end-of-options, but
that's not quite enough for all callers. Some of them, like git-log,
pick out some options using parse_options(), and then feed the remainder
to setup_revisions(). For those cases we need to stop parse_options()
from finding more options when it sees --end-of-options, and to retain
that option in argv so that setup_revisions() can see it as well.
Let's handle this the same as we do "--". We can even piggy-back on the
handling of PARSE_OPT_KEEP_DASHDASH, because any caller that wants to
retain one will want to retain the other.
I've included two tests here. The "log" test covers "--source", which is
one of the options it handles with parse_options(), and would fail
before this patch. There's also a test that uses the parse-options
helper directly. That confirms that the option is handled correctly even
in cases without KEEP_DASHDASH or setup_revisions(). I.e., it is safe to
use --end-of-options in place of "--" in other programs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's currently no robust way to tell Git that a particular option is
meant to be a revision, and not an option. So if you have a branch
"refs/heads/--foo", you cannot just say:
git rev-list --foo
You can say:
git rev-list refs/heads/--foo
But that breaks down if you don't know the refname, and in particular if
you're a script passing along a value from elsewhere. In most programs,
you can use "--" to end option parsing, like this:
some-prog -- "$revision"
But that doesn't work for the revision parser, because "--" is already
meaningful there: it separates revisions from pathspecs. So we need some
other marker to separate options from revisions.
This patch introduces "--end-of-options", which serves that purpose:
git rev-list --oneline --end-of-options "$revision"
will work regardless of what's in "$revision" (well, if you say "--" it
may fail, but it won't do something dangerous, like triggering an
unexpected option).
The name is verbose, but that's probably a good thing; this is meant to
be used for scripted invocations where readability is more important
than terseness.
One alternative would be to introduce an explicit option to mark a
revision, like:
git rev-list --oneline --revision="$revision"
That's slightly _more_ informative than this commit (because it makes
even something silly like "--" unambiguous). But the pattern of using a
separator like "--" is well established in git and in other commands,
and it makes some scripting tasks simpler like:
git rev-list --end-of-options "$@"
There's no documentation in this patch, because it will make sense to
describe the feature once it is available everywhere (and support will
be added in further patches).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The verbose output of every test looks something like this:
expecting success:
echo content >file &&
git add file &&
git commit -m "add file"
[master (root-commit) d1fbfbd] add file
Author: A U Thor <author@example.com>
1 file changed, 1 insertion(+)
create mode 100644 file
ok 1 - commit works
i.e. first an "expecting success" (or "checking known breakage") line
followed by the commands to be executed, then the output of those
comamnds, and finally an "ok"/"not ok" line containing the test name.
Note that the test's name is only shown at the very end.
With '-x' tracing enabled and/or in longer tests the verbose output
might be several screenfulls long, making it harder than necessary to
find where the output of the test with a given name starts (especially
when the outputs to different file descriptors are racing, and the
"expecting success"/command block arrives earlier than the "ok" line
of the previous test).
Print the test name at the start of the test's verbose output, i.e. at
the end of the "expecting success" and "checking known breakage"
lines, to make the start of a particular test a bit easier to
recognize. Also print the test script and test case numbers, to help
those poor souls who regularly have to scan through the combined
verbose output of several test scripts.
So the dummy test above would start like this:
expecting success of 9999.1 'commit works':
echo content >file &&
[...]
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our test scripts are named something like 't1234-command.sh', but the
script names used in 't0000-basic.sh' don't follow this naming
convention. Normally this doesn't matter, because the test scripts
themselves don't care how they are called. However, the next patch
will start to include the test number in the test's verbose output, so
the test script's name will matter in the two tests checking the
verbose output.
Update the tests 'test --verbose' and 'test --verbose-only' to follow
out test script naming convention.
Leave the other tests in 't0000' unchanged: changing the names of
their test scripts would be only pointless code churn.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While 'git commit-graph write --stdin-commits' expects commit object
ids as input, it accepts and silently skips over any invalid commit
object ids, and still exits with success:
# nonsense
$ echo not-a-commit-oid | git commit-graph write --stdin-commits
$ echo $?
0
# sometimes I forgot that refs are not good...
$ echo HEAD | git commit-graph write --stdin-commits
$ echo $?
0
# valid tree OID, but not a commit OID
$ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
$ echo $?
0
$ ls -l .git/objects/info/commit-graph
ls: cannot access '.git/objects/info/commit-graph': No such file or directory
Check that all input records are indeed valid commit object ids and
return with error otherwise, the same way '--stdin-packs' handles
invalid input; see e103f7276f (commit-graph: return with errors during
write, 2019-06-12).
Note that it should only return with error when encountering an
invalid commit object id coming from standard input. However,
'--reachable' uses the same code path to process object ids pointed to
by all refs, and that includes tag object ids as well, which should
still be skipped over. Therefore add a new flag to 'enum
commit_graph_write_flags' and a corresponding field to 'struct
write_commit_graph_context', so we can differentiate between those two
cases.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 't5318-commit-graph.sh' the test 'close with correct error on bad
input' manually verifies the exit code of a 'git commit-graph write'
command.
Use 'test_expect_code' instead.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git restore --staged` uses the same machinery as `git checkout HEAD`,
so there should be a similar test case for "restore" as the existing
test case for "checkout" with deleted ita files.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Varun Naik <vcnaik94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is possible to delete a committed file from the index and then add it
as intent-to-add. After `git checkout HEAD <pathspec>`, the file should
be identical in the index and HEAD. The command already works correctly
if the file has contents in HEAD. This patch provides the desired
behavior even when the file is empty in HEAD.
`git checkout HEAD <pathspec>` calls tree.c:read_tree_1(), with fn
pointing to checkout.c:update_some(). update_some() creates a new cache
entry but discards it when its mode and oid match those of the old
entry. A cache entry for an ita file and a cache entry for an empty file
have the same oid. Therefore, an empty deleted ita file previously
passed both of these checks, and the new entry was discarded, so the
file remained unchanged in the index. After this fix, if the file is
marked as ita in the cache, then we avoid discarding the new entry and
add the new entry to the cache instead.
This change should not affect newly added ita files. For those, inside
tree.c:read_tree_1(), tree_entry_interesting() returns
entry_not_interesting, so fn is never called.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Varun Naik <vcnaik94@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a packed ref is deleted, the whole packed-refs file is
rewritten to omit the ref that no longer exists. However if another
gc command is running and calls `pack-refs --all` simultaneously,
there is a chance that a ref that was just updated lose the newly
created commits.
Through these steps, losing commits on newly updated refs can be
demonstrated:
# step 1: compile git without `USE_NSEC` option
Some kernel releases do enable it by default while some do
not. And if we compile git without `USE_NSEC`, it will be easier
demonstrated by the following steps.
# step 2: setup a repository and add the first commit
git init repo &&
(cd repo &&
git config core.logallrefupdates true &&
git commit --allow-empty -m foo)
# step 3: in one terminal, repack the refs repeatedly
cd repo &&
while true
do
git pack-refs --all
done
# step 4: in another terminal, simultaneously update the
# master with update-ref, and create and delete an
# unrelated ref also with update-ref
cd repo &&
while true
do
us=$(git commit-tree -m foo -p HEAD HEAD^{tree}) &&
git update-ref refs/heads/newbranch $us &&
git update-ref refs/heads/master $us &&
git update-ref -d refs/heads/newbranch &&
them=$(git rev-parse master) &&
if test "$them" != "$us"
then
echo >&2 "lost commit: $us"
exit 1
fi
# eye candy
printf .
done
Though we have the packed-refs lock file and loose refs lock
files to avoid updating conflicts, a ref will lost its newly
commits if racy stat-validity of `packed-refs` file happens
(which is quite same as the racy-git described in
`Documentation/technical/racy-git.txt`), the following
specific set of operations demonstrates the problem:
1. Call `pack-refs --all` to pack all the loose refs to
packed-refs, and let say the modify time of the
packed-refs is DATE_M.
2. Call `update-ref` to update a new commit to master while
it is already packed. the old value (let us call it
OID_A) remains in the packed-refs file and write the new
value (let us call it OID_B) to $GIT_DIR/refs/heads/master.
3. Call `update-ref -d` within the same DATE_M from the 1th
step to delete a different ref newbranch which is packed
in the packed-refs file. It check newbranch's oid from
packed-refs file without locking it.
Meanwhile it keeps a snapshot of the packed-refs file in
memory and record the file's attributes with the snapshot.
The oid of master in the packed-refs's snapshot is OID_A.
4. Call a new `pack-refs --all` to pack the loose refs, the
oid of master in packe-refs file is OID_B, and the loose
refs $GIT_DIR/refs/heads/master is removed. Let's say
the `pack-refs --all` is very quickly done and the new
packed-refs file's modify time is still DATE_M, and it
has the same file size, even the same inode.
5. 3th step now goes on after checking the newbranch, it
begin to rewrite the packed-refs file. After get the
lock file of packed-ref file, it checks it's on-disk
file attributes with the snapshot, suck as the timestamp,
the file size and the inode value. If they are both the
same values, and the snapshot is not refreshed.
Because the loose ref of master is removed by 4th step,
`update-ref -d` will updates the new packed-ref to disk
which contains master with the oid OID_A. So now the
newly commit OID_B of master is lost.
The best path forward is just always refreshing after take
the lock file of `packed-refs` file. Traditionally we avoided
that because refreshing it implied parsing the whole file.
But these days we mmap it, so it really is just an extra
open()/mmap() and a quick read of the header. That doesn't seem
like an outrageous cost to pay when we're already taking the lock.
Signed-off-by: Sun Chao <sunchao9@huawei.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Sun Chao <sunchao9@huawei.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a couple of test scripts that are not completely
httpd-specific, but do run a few httpd-specific tests at the end.
These test scripts source 'lib-httpd.sh' somewhere mid-script, which
then skips all the rest of the test script if the dependencies for
running httpd tests are not fulfilled.
As the previous two patches in this series show, already on two
occasions non-httpd-specific tests were appended at the end of such
test scripts, and, consequently, they were skipped as well when httpd
tests couldn't be run.
Add a comment at the end of these test scripts to warn against adding
non-httpd-specific tests at the end, in the hope that they will help
prevent similar issues in the future.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Truncate/elide very long "filename:linenumber" field.
Truncate region and data "category" field if necessary.
Adjust overall column widths.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The make_traverse_path() function isn't very careful about checking its
output buffer boundaries. In fact, it doesn't even _know_ the size of
the buffer it's writing to, and just assumes that the caller used
traverse_path_len() correctly. And even then we assume that our
traverse_info.pathlen components are all correct, and just blindly write
into the buffer.
Let's improve this situation a bit:
- have the caller pass in their allocated buffer length, which we'll
check against our own computations
- check for integer underflow as we do our backwards-insertion of
pathnames into the buffer
- check that we do not run out items in our list to traverse before
we've filled the expected number of bytes
None of these should be triggerable in practice (especially since our
switch to size_t everywhere in a previous commit), but it doesn't hurt
to check our assumptions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All but one of the callers of make_traverse_path() allocate a new heap
buffer to store the path. Let's give them an easy way to write to a
strbuf, which saves them from computing the length themselves (which is
especially tricky when they want to add to the path). It will also make
it easier for us to change the make_traverse_path() interface in a
future patch to improve its bounds-checking.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We take a "struct name_entry", but only care about the length of the
path name. Let's just take that length directly, making it easier to use
the function from callers that sometimes do not have a name_entry at
all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We store and manipulate the cumulative traverse_info.pathlen as an
"int", which can overflow when we are fed ridiculously long pathnames
(e.g., ones at the edge of 2GB or 4GB, even if the individual tree entry
names are smaller than that). The results can be confusing, though
after some prodding I was not able to use this integer overflow to cause
an under-allocated buffer.
Let's consistently use size_t to generate and store these, and make
sure our addition doesn't overflow.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
't5703-upload-pack-ref-in-want.sh' sources 'lib-httpd.sh' near the end
to run a couple of httpd-specific tests, but 'lib-httpd.sh' skips all
the rest of the test script if the dependencies for running httpd
tests are not fulfilled. However, the last six tests in 't5703' are
not httpd-specific, but they are skipped as well when httpd tests
can't be run.
Move these six tests earlier in the test script, before 'lib-httpd.sh'
is sourced, so they will be run even when httpd tests aren't. Note
that this is not merely a pure code movement, because the setup test
case for the httpd tests needed an additional 'rm -rf
"$LOCAL_PRISTINE"' to clean up a directory left behind by the moved
non-httpd-specific tests.
Also add a comment at the end of this test script to warn against
adding non-httpd-specific tests at the end, in the hope that it will
help prevent similar issues in the future.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
't5510-fetch.sh' sources 'lib-httpd.sh' near the end to run a
httpd-specific test, but 'lib-httpd.sh' skips all the rest of the test
script if the dependencies for running httpd tests are not fulfilled.
Alas, recently cdbd70c437 (fetch: add --[no-]show-forced-updates
argument, 2019-06-18) appended a non-httpd-specific test at the end,
and this test is then skipped as well when httpd tests can't be run.
Move this new test earlier in the test script, before 'lib-httpd.sh'
is sourced, so it will be run even when httpd tests aren't.
Also add a comment at the end of this test script to warn against
adding non-httpd-specific tests at the end, in the hope that it will
help prevent similar issues in the future.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the previous commit shows, the presence of an oid in each level of
the traverse_info is confusing and ultimately not necessary. Let's drop
it to make it clear that it will not always be set (as well as convince
us that it's unused, and let the compiler catch any merges with other
branches that do add new uses).
Since the oid is part of name_entry, we'll actually stop embedding a
name_entry entirely, and instead just separately hold the pathname, its
length, and the mode.
This makes the resulting code slightly more verbose as we have to pass
those elements around individually. But it also makes it more clear what
each code path is going to use (and in most of the paths, we really only
care about the pathname itself).
A few of these conversions are noisier than they need to be, as they
also take the opportunity to rename "len" to "namelen" for clarity
(especially where we also have "pathlen" or "ce_len" alongside).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We assume that if setup_traverse_info() is passed a non-empty "base"
string, that string is pointing into a tree object and we can read the
object oid by skipping past the trailing NUL.
As it turns out, this is not true for either of the two calls, and we
may end up reading garbage bytes:
1. In git-merge-tree, our base string is either empty (in which case
we'd never run this code), or it comes from our traverse_path()
helper. The latter overallocates a buffer by the_hash_algo->rawsz
bytes, but then fills it with only make_traverse_path(), leaving
those extra bytes uninitialized (but part of a legitimate heap
buffer).
2. In unpack_trees(), we pass o->prefix, which is some arbitrary
string from the caller. In "git read-tree --prefix=foo", for
instance, it will point to the command-line parameter, and we'll
read 20 bytes past the end of the string.
Interestingly, tools like ASan do not detect (2) because the process
argv is part of a big pre-allocated buffer. So we're reading trash, but
it's trash that's probably part of the next argument, or the
environment.
You can convince it to fail by putting something like this at the
beginning of common-main.c's main() function:
{
int i;
for (i = 0; i < argc; i++)
argv[i] = xstrdup_or_null(argv[i]);
}
That puts the arguments into their own heap buffers, so running:
make SANITIZE=address test
will find problems when "read-tree --prefix" is used (e.g., in t3030).
Doubly interesting, even with the hackery above, this does not fail
prior to ea82b2a085 (tree-walk: store object_id in a separate member,
2019-01-15). That commit switched setup_traverse_info() to actually
copying the hash, rather than simply pointing to it. That pointer was
always pointing to garbage memory, but that commit started actually
dereferencing the bytes, which is what triggers ASan.
That also implies that nobody actually cares about reading these oid
bytes anyway (or at least no path covered by our tests). And manual
inspection of the code backs that up (I'll follow this patch with some
cleanups that show definitively this is the case, but they're quite
invasive, so it's worth doing this fix on its own).
So let's drop the bogus hashcpy(), along with the confusing oversizing
in merge-tree.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rebasing a complete commit history onto a given commit, it is
pretty obvious that the root commits should be rebased on top of said
given commit.
To test this, let's kill two birds with one stone and add a test case to
t3427-rebase-subtree.sh that not only demonstrates that this works, but
also that `git rebase -r` works with merge strategies now.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a test case in this script that verifies that `git rebase
--preserve-merges` works all right with non-default merge strategies or
non-default merge strategy options.
Now that `git rebase --rebase-merges` learned about merge strategies,
let's copy-edit this test case to verify that that works as intended,
too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The format of the todo list is quite a bit different in the
`--rebase-merges` mode; Let's prepare the fake editor to handle those
todo lists properly, too.
The original idea was that we keep the original command unless
overridden, and because the original todo lists only had `pick` lines
anyway, we could be sloppy and "override" the command by the same
command (i.e. use the sed replacement pattern "pick" instead of "&").
This actually would not have worked with `fixup` and `squash` commands,
but it would appear that we never tried to use the fake editor with
`--autosquash`.
However, in the next commit we want to use the fake editor in
conjunction with `--rebase-merges`, so let's use the correct sed
replacement pattern.
Technically, it is not necessary to take care of the `fakesha` thing
(where we reuse the sed replacement pattern to craft a new todo
command), at least for now, as the only user of that thing overrides the
`action` anyway. Nevertheless, for completeness' sake, we do take care
of it.
Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already support merge strategies in the sequencer, but only for
`pick` commands.
With this commit, we now also support them in `merge` commands. The
approach is simple: if any merge strategy option is specified, or if any
merge strategy other than `recursive` is specified, we simply spawn the
`git merge` command. Otherwise, we handle the merge in-process just as
before.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test case that concerns `git rebase -Xsubtree` (with the
default rebase backend, not with `--preserve-merges`) starts out with a
pre-rebase commit history that begins with a commit that introduces
three files: master1.t, master2.t and master3.t.
This commit was generated by passing a subtree merge commit through `git
filter-branch --subdirectory-filter`, so it looks as if this commit
really introduces all those files.
The commit history onto which this commit is then rebased, however,
introduced those files in individual commits. For that reason, the
rebase will fail, it _must_ fail, because the first `pick` results in no
changes to be committed.
Let's fix the test case to expect exactly this situation.
With this change, we can mark the original bug that this test case tried
to demonstrate as fixed.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 68aa495b59 (rebase: implement --merge via the interactive
machinery, 2018-12-11), the job of the old `--merge` backend is now
performed by the `--interactive` backend, too.
One consequence is that empty commits are no longer rebased by default.
Meaning that the test case that calls `git rebase -Xsubtree` (which used
to be handled by the `--merge` backend) now needs to ask explicitly for
the empty commit to be rebased.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apart from the `setup` test case, `t3427-rebase-subtree.sh` is made up
exclusively of demonstrations of breakages. The tricky thing about such
demonstrations is that they are often buggy themselves.
In this instance, somewhere over the course of the six iterations
of the patch that eventually made it into Git's `master` as 5f35900849
(contrib/subtree: Add a test for subtree rebase that loses commits,
2016-06-28), the commit message "files_subtree/master4" was changed to
just "master4", but the test cases still expected the old commit
message.
Let's fix this, at long last.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, this test script performed essentially three rebases and
verified breakages by testing the post-rebase commits' messages.
To do so, the rebases were performed multiple times, though, once per
commit message to test. This wastes electricity (and CO2) and time.
Let's condense the test cases to the essential number: the number of
different rebases to validate.
On Windows, where the scripted nature of the `--preserve-merges` backend
hurts performance rather badly, this reduces the overall runtime in this
developer's setup from ~1m to ~28s while still performing the exact same
testing as before.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The step to prepare a pre-rebase commit history is _identical_ in _all_
of the test cases (except of course the `setup` case). It should
therefore clearly a part of the `setup` test case instead.
As the `git filter-branch` command is quite costly on platforms where
Unix shell scripting is simply slow (meaning: on Windows), this shaves
off a noticeable part of the runtime: in this developer's setup, the
time was reduced from ~1m25s to ~1m.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It still does the very same thing as before, but expresses it in a much
more succinct (and still quite readable) manner.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The flow of this test script is outright confusing, and to start the
endeavor to address that, let's describe what this test is all about,
and how it tries to do it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only remaining scripted part of `git rebase` is the
`--preserve-merges` backend. Meaning: there is little reason to keep the
"library of common rebase functions" as a separate file.
While moving the functions to `git-rebase--preserve-merges.sh`, we also
drop the `move_to_original_branch` function that is no longer used.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update a code comment that referred to those files as if they were still
there.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This went away in 0609b741a4 (rebase -i: combine rebase--interactive.c
with rebase.c, 2019-04-17).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One test case's title mentioned the then-current implementation detail
that the `--am` backend was implemented in `git-rebase--am.sh`.
This is no longer the case, so let's update the title to reflect the
current reality.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 21853626ea (built-in rebase: call `git am` directly, 2019-01-18),
the built-in rebase already uses the built-in `git am` directly.
Now that d03ebd411c (rebase: remove the rebase.useBuiltin setting,
2019-03-18) even removed the scripted rebase, there is no longer any
user of `git-rebase--am.sh`, so let's just remove it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test '--no-show-forced-updates' in 't5510-fetch.sh' added in
cdbd70c437 (fetch: add --[no-]show-forced-updates argument,
2019-06-18) runs '! test_i18ngrep ...'. This is wrong, because when
running the test with GIT_TEST_GETTEXT_POISON=true, then
'test_i18ngrep' is basically a noop and always returns with success,
the leading ! turns that into a failure, which then fails the test.
Use 'test_i18ngrep ! ...' instead.
This went unnoticed by our GETTEXT_POISON CI builds, because those
builds don't run this test case: in those builds we don't install
Apache, and this test comes after 't5510' sources 'lib-httpd.sh',
which, consequently, skips all the remaining tests, including this
one.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running git-grep with --recurse-submodules results in a cached grep for
the submodules even when --cached is not used. This makes all
modifications in submodules' tracked files be always ignored when
grepping. Solve that making git-grep respect the cached option when
invoking grep_cache() inside grep_submodule(). Also, add tests to
ensure that the desired behavior is performed.
Reported-by: Daniel Zaoui <jackdanielz@eyomi.org>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As discussed in the last commit partially fix a bug introduced in
b65abcafc7 ("grep: use PCRE v2 for optimized fixed-string search",
2019-07-01). Because PCRE v2, unlike kwset, validates its UTF-8 input
we'd die on e.g.:
fatal: pcre2_match failed with error code -22: UTF-8 error:
isolated byte with 0x80 bit set
When grepping a non-ASCII fixed string. This is a more general problem
that's hard to fix, but we can at least fix the most common case of
grepping for a fixed string without "-i". I can't think of a reason
for why we'd turn on PCRE2_UTF when matching byte-for-byte like that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since my b65abcafc7 ("grep: use PCRE v2 for optimized fixed-string
search", 2019-07-01) we've been dying on invalid UTF-8 data when
grepping for fixed strings if the following are all true:
* The subject string is non-ASCII (e.g. "ævar")
* We're under a is_utf8_locale(), e.g. "en_US.UTF-8", not "C"
* We compiled with PCRE v2
* That PCRE v2 did not have JIT support
The last of those is why this wasn't caught earlier, per pcre2jit(3):
"unless PCRE2_NO_UTF_CHECK is set, a UTF subject string is tested
for validity. In the interests of speed, these checks do not
happen on the JIT fast path, and if invalid data is passed, the
result is undefined."
I.e. the subject being matched against our pattern was invalid, but we
were lucky and getting away with it on the JIT path, but the non-JIT
one is stricter.
This patch does nothing to fix that, instead we sneak in support for
fixed patterns starting with "(*NO_JIT)", this disables the PCRE v2
jit with implicit fixed-string matching for testing, see
pcre2syntax(3) the syntax.
This is technically a change in behavior, but it's so obscure that I
figured it was OK. We'd previously consider this an invalid regular
expression as regcomp() would die on it, now we feed it to the PCRE v2
fixed-string path. I thought this was better than introducing yet
another GIT_TEST_* environment variable.
We're also relying on a behavior of PCRE v2 that technically could
change, but I think the test coverage is worth dipping our toe into
some somewhat undefined behavior.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change paves the way for later using this value the regex compile
functions themselves.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the start of this function we do:
p->fixed = opt->fixed;
It's less confusing to use that variable consistently that switch back
& forth between the two.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify the PCRE v1 code for the same reasons as for the PCRE v2 code
in the last commit. Unlike with v2 we actually used the custom stack
in v1, but let's use PCRE's built-in 32 KB one instead, since
experience with v2 shows that's enough. Most distros are already using
v2 as a default, and the underlying sljit code is the same.
Unfortunately we can't just pass a NULL to pcre_jit_exec() as with
pcre2_jit_match(). Unlike the v2 function it doesn't support
that. Instead we need to use the fatter pcre_exec() if we'd like the
same behavior.
This will make things slightly slower than on the fast-path function,
but it's OK since we care less about v1 performance these days since
we have and recommend v2. Running a similar performance test as what I
ran in fbaceaac47 ("grep: add support for the PCRE v1 JIT API",
2017-05-25) via:
GIT_PERF_REPEAT_COUNT=30 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_MAKE_OPTS='-j8 USE_LIBPCRE1=Y CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre/inst' ./run HEAD~ HEAD p7820-grep-engines.sh
Gives us this, just the /perl/ results:
Test HEAD~ HEAD
---------------------------------------------------------------------------------------
7820.3: perl grep 'how.to' 0.19(0.67+0.52) 0.19(0.65+0.52) +0.0%
7820.7: perl grep '^how to' 0.19(0.78+0.44) 0.19(0.72+0.49) +0.0%
7820.11: perl grep '[how] to' 0.39(2.13+0.43) 0.40(2.10+0.46) +2.6%
7820.15: perl grep '(e.t[^ ]*|v.ry) rare' 0.44(2.55+0.37) 0.45(2.47+0.41) +2.3%
7820.19: perl grep 'm(ú|u)lt.b(æ|y)te' 0.23(1.06+0.42) 0.22(1.03+0.43) -4.3%
It will also implicitly re-enable UTF-8 validation for PCRE v1. As
noted in [1] we now have cases as a result where PCRE v1 is more eager
to error out. Subsequent patches will fix that for v2, and I think
it's fair to tell v1 users "just upgrade" and not worry about that
edge case for v1.
1. https://public-inbox.org/git/CAPUEsphZJ_Uv9o1-yDpjNLA_q-f7gWXz9g1gCY2pYAYN8ri40g@mail.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As reported in [1] the code I added in 94da9193a6 ("grep: add support
for PCRE v2", 2017-06-01) to use a custom JIT stack has never
worked. It was incorrectly copy/pasted from code I added in
fbaceaac47 ("grep: add support for the PCRE v1 JIT API", 2017-05-25),
which did work.
Thus our intention of starting with 1 byte of stack at a maximum of 1
MB didn't happen, we'd always use the 32 KB stack provided by PCRE
v2's jit_machine_stack_exec()[2]. The reason I allocated a custom
stack at all was this advice in pcrejit(3) (same in pcre2jit(3)):
"By default, it uses 32KiB on the machine stack. However, some
large or complicated patterns need more than this"
Since we've haven't had any reports of users running into
PCRE2_ERROR_JIT_STACKLIMIT in the wild I think we can safely assume
that we can just use the library defaults instead and drop this
code. This won't change with the wider use of PCRE v2 in
ed0479ce3d ("Merge branch 'ab/no-kwset' into next", 2019-07-15), a
fixed string search is not a "large or complicated pattern".
For good measure I ran the performance test noted in 94da9193a6,
although the command is simpler now due to my 0f50c8e32c ("Makefile:
remove the NO_R_TO_GCC_LINKER flag", 2019-05-17):
GIT_PERF_REPEAT_COUNT=30 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_MAKE_OPTS='-j8 USE_LIBPCRE2=Y CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre2/inst' ./run HEAD~ HEAD p7820-grep-engines.sh
Just the /perl/ results are:
Test HEAD~ HEAD
---------------------------------------------------------------------------------------
7820.3: perl grep 'how.to' 0.17(0.27+0.65) 0.17(0.24+0.68) +0.0%
7820.7: perl grep '^how to' 0.16(0.23+0.66) 0.16(0.23+0.67) +0.0%
7820.11: perl grep '[how] to' 0.18(0.35+0.62) 0.18(0.33+0.65) +0.0%
7820.15: perl grep '(e.t[^ ]*|v.ry) rare' 0.17(0.45+0.54) 0.17(0.49+0.50) +0.0%
7820.19: perl grep 'm(ú|u)lt.b(æ|y)te' 0.16(0.33+0.58) 0.16(0.29+0.62) +0.0%
So, as expected there's no change, and running with valgrind reveals
that we have fewer allocations now.
As noted in [3] there are known regexes that will fail with the lower
stack limit, the way GNU grep fixed it is interesting, although I
believe the implementation is overly verbose, they could make PCRE v2
handle that gradual re-allocation, that's what min/max memory is
for.
So we might end up bringing this back, I'm more inclined to just kick
such cases upstairs to PCRE maintainers as a bug, perhaps they'll add
some overall "just allocate more then" flag to make this easier. In
any case there's no functional change here, we didn't have a custom
stack, so let's apply this first, we can always revert it later.
1. https://public-inbox.org/git/20190721194052.15440-1-carenas@gmail.com/
2. I didn't really intend to start with 1 byte, looking at the PCRE v2
code again what happened is that I cargo-culted some of PCRE v2's
own test code which was meant to test re-allocations. It's more
sane to start with say 32 KB with a max of 1 MB, as pcre2grep.c
does.
3. https://public-inbox.org/git/CAPUEspjj+fG8QDmf=bZXktfpLgkgiu34HTjKLhm-cmEE04FE-A@mail.gmail.com/
Reported-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove code that would trigger if pcre_config() or pcre2_config() was
so broken that "do we have JIT?" wouldn't return a boolean.
I added this code back in fbaceaac47 ("grep: add support for the PCRE
v1 JIT API", 2017-05-25) and then as noted in f002532784 ("grep: print
the pcre2_jit_on value", 2019-07-22) incorrectly copy/pasted some of
it in 94da9193a6 ("grep: add support for PCRE v2", 2017-06-01).
Let's just remove this code. Being this paranoid about the
pcre2?_config() function itself being broken is crossing the line into
unreasonable paranoia.
Reported-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bring back optimized fixed-string search for "grep", this time with
PCRE v2 as an optional backend. As noted in [1] with kwset we were
slower than PCRE v1 and v2 JIT with the kwset backend, so that
optimization was counterproductive.
This brings back the optimization for "--fixed-strings", without
changing the semantics of having a NUL-byte in patterns. As seen in
previous commits in this series we could support it now, but I'd
rather just leave that edge-case aside so we don't have one behavior
or the other depending what "--fixed-strings" backend we're using. It
makes the behavior harder to understand and document, and makes tests
for the different backends more painful.
This does change the behavior under non-C locales when "log"'s
"--encoding" option is used and the heystack/needle in the
content/command-line doesn't have a matching encoding. See the recent
change in "t4210: skip more command-line encoding tests on MinGW" in
this series. I think that's OK. We did nothing sensible before
then (just compared raw bytes that had no hope of matching). At least
now the user will get some idea why their grep/log never matches in
that edge case.
I could also support the PCRE v1 backend here, but that would make the
code more complex. I'd rather aim for simplicity here and in future
changes to the diffcore. We're not going to have someone who
absolutely must have faster search, but for whom building PCRE v2
isn't acceptable.
The difference between this series of commits and the current "master"
is, using the same t/perf commands shown in the last commit:
plain grep:
Test origin/master HEAD
-------------------------------------------------------------------------
7821.1: fixed grep int 0.55(1.67+0.56) 0.41(0.98+0.60) -25.5%
7821.2: basic grep int 0.58(1.65+0.52) 0.41(0.96+0.57) -29.3%
7821.3: extended grep int 0.57(1.66+0.49) 0.42(0.93+0.60) -26.3%
7821.4: perl grep int 0.54(1.67+0.50) 0.43(0.88+0.65) -20.4%
7821.6: fixed grep uncommon 0.21(0.52+0.42) 0.16(0.24+0.51) -23.8%
7821.7: basic grep uncommon 0.20(0.49+0.45) 0.17(0.28+0.47) -15.0%
7821.8: extended grep uncommon 0.20(0.54+0.39) 0.16(0.25+0.50) -20.0%
7821.9: perl grep uncommon 0.20(0.58+0.36) 0.16(0.23+0.50) -20.0%
7821.11: fixed grep æ 0.35(1.24+0.43) 0.16(0.23+0.50) -54.3%
7821.12: basic grep æ 0.36(1.29+0.38) 0.16(0.20+0.54) -55.6%
7821.13: extended grep æ 0.35(1.23+0.44) 0.16(0.24+0.50) -54.3%
7821.14: perl grep æ 0.35(1.33+0.34) 0.16(0.28+0.46) -54.3%
grep with -i:
Test origin/master HEAD
----------------------------------------------------------------------------
7821.1: fixed grep -i int 0.62(1.81+0.70) 0.47(1.11+0.64) -24.2%
7821.2: basic grep -i int 0.67(1.90+0.53) 0.46(1.07+0.62) -31.3%
7821.3: extended grep -i int 0.62(1.92+0.53) 0.53(1.12+0.58) -14.5%
7821.4: perl grep -i int 0.66(1.85+0.58) 0.45(1.10+0.59) -31.8%
7821.6: fixed grep -i uncommon 0.21(0.54+0.43) 0.17(0.20+0.55) -19.0%
7821.7: basic grep -i uncommon 0.20(0.52+0.45) 0.17(0.29+0.48) -15.0%
7821.8: extended grep -i uncommon 0.21(0.52+0.44) 0.17(0.26+0.50) -19.0%
7821.9: perl grep -i uncommon 0.21(0.53+0.44) 0.17(0.20+0.56) -19.0%
7821.11: fixed grep -i æ 0.26(0.79+0.44) 0.16(0.29+0.46) -38.5%
7821.12: basic grep -i æ 0.26(0.79+0.42) 0.16(0.20+0.54) -38.5%
7821.13: extended grep -i æ 0.26(0.84+0.39) 0.16(0.24+0.50) -38.5%
7821.14: perl grep -i æ 0.16(0.24+0.49) 0.17(0.25+0.51) +6.3%
plain log:
Test origin/master HEAD
--------------------------------------------------------------------------------
4221.1: fixed log --grep='int' 7.24(6.95+0.28) 7.20(6.95+0.18) -0.6%
4221.2: basic log --grep='int' 7.31(6.97+0.22) 7.20(6.93+0.21) -1.5%
4221.3: extended log --grep='int' 7.37(7.04+0.24) 7.22(6.91+0.25) -2.0%
4221.4: perl log --grep='int' 7.31(7.04+0.21) 7.19(6.89+0.21) -1.6%
4221.6: fixed log --grep='uncommon' 6.93(6.59+0.32) 7.04(6.66+0.37) +1.6%
4221.7: basic log --grep='uncommon' 6.92(6.58+0.29) 7.08(6.75+0.29) +2.3%
4221.8: extended log --grep='uncommon' 6.92(6.55+0.31) 7.00(6.68+0.31) +1.2%
4221.9: perl log --grep='uncommon' 7.03(6.59+0.33) 7.12(6.73+0.34) +1.3%
4221.11: fixed log --grep='æ' 7.41(7.08+0.28) 7.05(6.76+0.29) -4.9%
4221.12: basic log --grep='æ' 7.39(6.99+0.33) 7.00(6.68+0.25) -5.3%
4221.13: extended log --grep='æ' 7.34(7.00+0.25) 7.15(6.81+0.31) -2.6%
4221.14: perl log --grep='æ' 7.43(7.13+0.26) 7.01(6.60+0.36) -5.7%
log with -i:
Test origin/master HEAD
------------------------------------------------------------------------------------
4221.1: fixed log -i --grep='int' 7.31(7.07+0.24) 7.23(7.00+0.22) -1.1%
4221.2: basic log -i --grep='int' 7.40(7.08+0.28) 7.19(6.92+0.20) -2.8%
4221.3: extended log -i --grep='int' 7.43(7.13+0.25) 7.27(6.99+0.21) -2.2%
4221.4: perl log -i --grep='int' 7.34(7.10+0.24) 7.10(6.90+0.19) -3.3%
4221.6: fixed log -i --grep='uncommon' 7.07(6.71+0.32) 7.11(6.77+0.28) +0.6%
4221.7: basic log -i --grep='uncommon' 6.99(6.64+0.28) 7.12(6.69+0.38) +1.9%
4221.8: extended log -i --grep='uncommon' 7.11(6.74+0.32) 7.10(6.77+0.27) -0.1%
4221.9: perl log -i --grep='uncommon' 6.98(6.60+0.29) 7.05(6.64+0.34) +1.0%
4221.11: fixed log -i --grep='æ' 7.85(7.45+0.34) 7.03(6.68+0.32) -10.4%
4221.12: basic log -i --grep='æ' 7.87(7.49+0.29) 7.06(6.69+0.31) -10.3%
4221.13: extended log -i --grep='æ' 7.87(7.54+0.31) 7.09(6.69+0.31) -9.9%
4221.14: perl log -i --grep='æ' 7.06(6.77+0.28) 6.91(6.57+0.31) -2.1%
So as with e05b027627 ("grep: use PCRE v2 for optimized fixed-string
search", 2019-06-26) there's a huge improvement in performance for
"grep", but in "log" most of our time is spent elsewhere, so we don't
notice it that much.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change "-f <file>" to not support patterns with a NUL-byte in them
under --fixed-strings. We'll now only support these under
"--perl-regexp" with PCRE v2.
A previous change to grep's documentation changed the description of
"-f <file>" to be vague enough as to not promise that this would work.
By dropping support for this we make it a whole lot easier to move
away from the kwset backend, which we'll do in a subsequent change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The behavior of "grep" when patterns contained a NUL-byte has always
been haphazard, and has served the vagaries of the implementation more
than anything else. A pattern containing a NUL-byte can only be
provided via "-f <file>". Since pickaxe (log search) has no such flag
the NUL-byte in patterns has only ever been supported by "grep" (and
not "log --grep").
Since 9eceddeec6 ("Use kwset in grep", 2011-08-21) patterns containing
"\0" were considered fixed. In 966be95549 ("grep: add tests to fix
blind spots with \0 patterns", 2017-05-20) I added tests for this
behavior.
Change the behavior to do the obvious thing, i.e. don't silently
discard a regex pattern and make it implicitly fixed just because they
contain a NUL-byte. Instead die if the backend in question can't
handle them, e.g. --basic-regexp is combined with such a pattern.
This is desired because from a user's point of view it's the obvious
thing to do. Whether we support BRE/ERE/Perl syntax is different from
whether our implementation is limited by C-strings. These patterns are
obscure enough that I think this behavior change is OK, especially
since we never documented the old behavior.
Doing this also makes it easier to replace the kwset backend with
something else, since we'll no longer strictly need it for anything we
can't easily use another fixed-string backend for.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the tests for "-f <file>" where "<file>" contains a NUL byte
pattern into their own file. I added most of these tests in
966be95549 ("grep: add tests to fix blind spots with \0 patterns",
2017-05-20).
Whether a regex engine supports matching binary content is very
different from whether it matches binary patterns. Since
2f8952250a ("regex: add regexec_buf() that can work on a non
NUL-terminated string", 2016-09-21) we've required REG_STARTEND of our
regex engines so we can match binary content, but only the PCRE v2
engine can sensibly match binary patterns.
Since 9eceddeec6 ("Use kwset in grep", 2011-08-21) we've been punting
patterns containing NUL-byte and considering them fixed, except in
cases where "--ignore-case" is provided and they're non-ASCII, see
5c1ebcca4d ("grep/icase: avoid kwsset on literal non-ascii strings",
2016-06-25). Subsequent commits will change this behavior.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the "grep binary" test case added in aca20dd558 ("grep: add test
script for binary file handling", 2010-05-22) so that it lives
alongside the rest of the "grep" tests in t781*. This would have left
a gap in the t/700* namespace, so move a "filter-branch" test down,
leaving the "t7010-setup.sh" test as the next one after that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since e944d9d932 ("grep: rewrite an if/else condition to avoid
duplicate expression", 2016-06-25) the "ascii_only" variable has only
been used once in compile_regexp(), let's just inline it there.
This makes the code easier to read, and might make it marginally
faster depending on compiler optimizations.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 5212f91deb ("t4210: skip command-line encoding tests on mingw",
2014-07-17) the positive tests in this file were skipped. That left
the negative tests that don't produce a match.
An upcoming change to migrate the "fixed" backend of grep to PCRE v2
will cause these "log" commands to produce an error instead on
MinGW. This is because the command-line on that platform implicitly
has its encoding changed before being passed to git. See [1].
1. https://public-inbox.org/git/nycvar.QRO.7.76.6.1907011515150.44@tvgsbejvaqbjf.bet/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug introduced in 18547aacf5 ("grep/pcre: support utf-8",
2016-06-25) that was missed due to a blindspot in our tests, as
discussed in the previous commit. I then blindly copied the same bug
in 94da9193a6 ("grep: add support for PCRE v2", 2017-06-01) when
adding the PCRE v2 code.
We should not tell PCRE that we're processing UTF-8 just because we're
dealing with non-ASCII. In the case of e.g. "log --encoding=<...>"
under is_utf8_locale() the haystack might be in ISO-8859-1, and the
needle might be in a non-UTF-8 encoding.
Maybe we should be more strict here and die earlier? Should we also be
converting the needle to the encoding in question, and failing if it's
not a string that's valid in that encoding? Maybe.
But for now matching this as non-UTF8 at least has some hope of
producing sensible results, since we know that our default heuristic
of assuming the text to be matched is in the user locale encoding
isn't true when we've explicitly encoded it to be in a different
encoding.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the tests added in 04deccda11 ("log: re-encode commit messages
before grepping", 2013-02-11) to test the regex backends. Those tests
never worked as advertised, due to the is_fixed() optimization in
grep.c (which was in place at the time), and the needle in the tests
being a fixed string.
We'd thus always use the "fixed" backend during the tests, which would
use the kwset() backend. This backend liberally accepts any garbage
input, so invalid encodings would be silently accepted.
In a follow-up commit we'll fix this bug, this test just demonstrates
the existing issue.
In practice this issue happened on Windows, see [1], but due to the
structure of the existing tests & how liberal the kwset code is about
garbage we missed this.
Cover this blind spot by testing all our regex engines. The PCRE
backend will spot these invalid encodings. It's possible that this
test breaks the "basic" and "extended" backends on some systems that
are more anal than glibc about the encoding of locale issues with
POSIX functions that I can remember, but PCRE is more careful about
the validation.
1. https://public-inbox.org/git/nycvar.QRO.7.76.6.1906271113090.44@tvgsbejvaqbjf.bet/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function always returns 0, so make it return void instead.
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new macro ALLOC_GROW_BY which automatically zeros the added
array elements and takes care of updating the nr value. Use the macro in
code introduced earlier in this patchset.
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow combining of multiple filters by simply repeating the --filter
flag. Before this patch, the user had to combine them in a single flag
somewhat awkwardly (e.g. --filter=combine:FOO+BAR), including
URL-encoding the individual filters.
To make this work, in the --filter flag parsing callback, rather than
error out when we detect that the filter_options struct is already
populated, we modify it in-place to contain the added sub-filter. The
existing sub-filter becomes the lhs of the combined filter, and the
next sub-filter becomes the rhs. We also have to URL-encode the LHS and
RHS sub-filters.
We can simplify the operation if the LHS is already a combine: filter.
In that case, we just append the URL-encoded RHS sub-filter to the LHS
spec to get the new spec.
Helped-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Jeff Hostetler <git@jeffhostetler.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow callers to specify exactly what characters need to be URL-encoded
and which do not. This new API will be taken advantage of in a patch
later in this set.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the filter_spec string a string_list rather than a raw C string.
The list of strings must be concatted together to make a complete
filter_spec. A future patch will use this capability to build "combine:"
filter specs gradually.
A strbuf would seem to be a more natural choice for this object, but it
unfortunately requires initialization besides just zero'ing out the
memory. This results in all container structs, and all containers of
those structs, etc., to also require initialization. Initializing them
all would be more cumbersome that simply using a string_list, which
behaves properly when its contents are zero'd.
For the purposes of code simplification, change behavior in how filter
specs are conveyed over the protocol: do not normalize the tree:<depth>
filter specs since there should be no server in existence that supports
tree:# but not tree:#k etc.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the check that filter_options->choice is set to higher in the call
stack. This can only be set when the gentle parse function is called
from one of the two call sites.
This is important because in an upcoming patch this may or may not be an
error, and whether it is an error is only known to the
parse_list_objects_filter function.
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow combining filters such that only objects accepted by all filters
are shown. The motivation for this is to allow getting directory
listings without also fetching blobs. This can be done by combining
blob:none with tree:<depth>. There are massive repositories that have
larger-than-expected trees - even if you include only a single commit.
A combined filter supports any number of subfilters, and is written in
the following form:
combine:<filter 1>+<filter 2>+<filter 3>
Certain non-alphanumeric characters in each filter must be
URL-encoded.
For now, combined filters must be specified in this form. In a
subsequent commit, rev-list will support multiple --filter arguments
which will have the same effect as specifying one filter argument
starting with "combine:". The documentation will be updated in that
commit, as the URL-encoding scheme is in general not meant to be used
directly by the user, and it is better to describe the URL-encoding
feature in terms of the repeated flag.
Helped-by: Emily Shaffer <emilyshaffer@google.com>
Helped-by: Jeff Hostetler <git@jeffhostetler.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Making errbuf an optional argument complicates error reporting. Fix this
by making all callers supply an errbuf, even if they may ignore it. This
will be important in follow-up patches where the filter-spec parsing has
more pitfalls and possible errors.
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The oidset *omits pointer must be accessed by the combine filter in a
type-agnostic way once the graph traversal is over. Store that pointer
in the general `filter` struct. This will be used in a follow-up patch
to implement the combine filter.
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Encapsulate filter_fn, filter_free_fn, and filter_data into their own
opaque struct.
Due to opaqueness, filter_fn and filter_free_fn can no longer be
accessed directly by users. Currently, all usages of filter_fn are
guarded by a necessary check:
(obj->flags & NOT_USER_GIVEN) && filter_fn
Take the opportunity to include this check into the new function
list_objects_filter__filter_object(), so that we no longer need to write
this check at every caller of the filter function.
Also, the init functions in list-objects-filter.c no longer need to
confusingly return the filter constituents in various places (filter_fn
and filter_free_fn as out parameters, and filter_data as the function's
return value); they can just initialize the "struct filter" passed in.
Helped-by: Jeff Hostetler <git@jeffhostetler.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we can have a different default partial clone filter for
each promisor remote, let's hide core_partial_clone_filter_default
as a static in promisor-remote.c to avoid it being use for
anything other than managing backward compatibility.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have has_promisor_remote() and can use many
promisor remotes, let's hide repository_format_partial_clone
as a static in promisor-remote.c to avoid it being use
for anything other than managing backward compatibility.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As fetch_objects() is now used only in promisor-remote.c
and should't be used outside it, let's move it into
promisor-remote.c, make it static there, and remove
fetch-object.{c,h}.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While at it, let's remove a reference to ODB effort as the ODB
effort has been replaced by directly enhancing partial clone
and promisor remote features.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This shows that it is now possible to fetch objects from more
than one promisor remote, and that fetching from a new
promisor remote can configure it as one.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the infrastructure for more than one promisor remote
has been introduced in previous patches, we can remove
code that forbids the registration of more than one
promisor remote.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes it possible to specify a different partial clone
filter for each promisor remote.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using the repository_format_partial_clone global
and fetch_objects() directly, let's use has_promisor_remote()
and promisor_remote_get_direct().
This way all the configured promisor remotes will be taken
into account, not only the one specified by
extensions.partialClone.
Also when cloning or fetching using a partial clone filter,
remote.origin.promisor will be set to "true" instead of
setting extensions.partialClone to "origin". This makes it
possible to use many promisor remote just by fetching from
them.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A remote specified using the extensions.partialClone config
option should be considered a promisor remote too.
For simplicity and to make things predictable, this promisor
remote should be either always the last one we try to get
objects from, or the first one. So it should always be either
at the end of the promisor remote list, or at its start.
We decided to make it the last one we try, because it is
likely that someone using many promisor remotes is doing so
because the other promisor remotes are better for some reason
(maybe they are closer or faster for some kind of objects)
than the origin, and the origin is likely to be the remote
specified by extensions.partialClone.
This justification is not very strong, but one choice had to
be made, and anyway the long term plan should be to make the
order somehow fully configurable.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We will need to reinitialize the promisor remote configuration
as we will make some changes to it in a later commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The promisor-remote.{c,h} files will contain functions to
manage many promisor remotes.
We expect that there will not be a lot of promisor remotes,
so it is ok to use a simple linked list to manage them.
Helped-by: Jeff King <peff@peff.net>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callers of the fetch_object() and fetch_objects() might
be interested in knowing if these functions succeeded or not.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's not run a git command, especially one with "verify" in its
name, upstream of a pipe, because the pipe will hide the git
command's exit code.
While at it, let's also avoid a useless `cat` command piping
into `sed`.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25 14:05:37 -07:00
1189 changed files with 183244 additions and 82620 deletions
@ -142,19 +142,25 @@ archive, summarize the relevant points of the discussion.
[[commit-reference]]
If you want to reference a previous commit in the history of a stable
branch, use the format "abbreviated sha1 (subject, date)",
with the subject enclosed in a pair of double-quotes, like this:
branch, use the format "abbreviated hash (subject, date)", like this:
....
Commit f86a374 ("pack-bitmap.c: fix a memleak", 2015-03-30)
Commit f86a374 (pack-bitmap.c: fix a memleak, 2015-03-30)
noticed that ...
....
The "Copy commit summary" command of gitk can be used to obtain this
format, or this invocation of `git show`:
format (with the subject enclosed in a pair of double-quotes), or this
invocation of `git show`:
....
git show -s --date=short --pretty='format:%h ("%s", %ad)' <commit>
git show -s --pretty=reference <commit>
....
or, on an older version of Git without support for --pretty=reference:
....
git show -s --date=short --pretty='format:%h (%s, %ad)' <commit>
....
[[git-tools]]
@ -372,9 +378,9 @@ such as "Thanks-to:", "Based-on-patch-by:", or "Mentored-by:".
Some parts of the system have dedicated maintainers with their own
repositories.
- `git-gui/` comes from git-gui project, maintained by Pat Thoyts:
- `git-gui/` comes from git-gui project, maintained by Pratyush Yadav:
git://repo.or.cz/git-gui.git
https://github.com/prati0100/git-gui.git
- `gitk-git/` comes from Paul Mackerras's gitk project:
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.